add process.name attribute and adapt process.executable.name #1737
Thanks!
model/process/registry.yaml
Outdated
stability: experimental | ||
brief: > | ||
The name of the process. On Linux based systems, can be set | ||
to the `Name` in `proc/[pid]/status`. On Windows, can be set to the |
to the `Name` in `proc/[pid]/status`. On Windows, can be set to the | |
to the value in `/proc/[pid]/comm` or to the (equivalent) | |
`Name` in `/proc/[pid]/status`. On Windows, can be set to the |
We can also use `/proc/[pid]/comm`, which requires no parsing, unlike extracting `Name` out of `/proc/[pid]/status`. The values should be equivalent.
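To illustrate the parsing difference mentioned here, a minimal Python sketch follows. The function names are hypothetical and not part of any OpenTelemetry component; the sketch only assumes the usual `key:<tab>value` layout of `/proc/[pid]/status` and that `/proc/[pid]/comm` holds the name followed by a newline.

```python
from pathlib import Path
from typing import Optional


def name_from_status(status_text: str) -> Optional[str]:
    """Extract the Name field from the text of /proc/[pid]/status."""
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key == "Name":
            # The value is separated from the key by a tab.
            return value.strip()
    return None


def name_from_comm(comm_text: str) -> str:
    """/proc/[pid]/comm needs no parsing beyond dropping the trailing newline."""
    return comm_text.rstrip("\n")


def read_process_name(pid="self", proc_root: str = "/proc") -> str:
    """Read the process name from comm (Linux only)."""
    return name_from_comm(Path(proc_root, str(pid), "comm").read_text())
```

A collector that has already read `status` for other fields could reuse `name_from_status`, while a lighter-weight caller might prefer `read_process_name`.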
Also see:
What do you think about suggesting both since they're the same anyway? For example, in the hostmetricsreceiver we'll already have parsed `/proc/[pid]/status` for other information, so just getting the name from that is fine in our case, but in other cases people might prefer to just get `/proc/[pid]/comm`.
Done in 5a679fd
.chloggen/process_name.yaml
Outdated
# Use pipe (|) for multiline entries. | ||
subtext: | | ||
The new `process.name` attribute uses the original guidance for `process.executable.name`, | ||
suggesting use of the `Name` field from `/proc/[pid]/status` on Linux. `process.executable.name` |
suggesting use of the `Name` field from `/proc/[pid]/status` on Linux. `process.executable.name` | |
suggesting use of `/proc/[pid]/comm` or the equivalent `Name` field | |
from `/proc/[pid]/status` on Linux. `process.executable.name` |
Done in 5a679fd
to the `Name` in `proc/[pid]/status`. On Windows, can be set to the | ||
base name of `GetProcessImageFileNameW`. | ||
to the base name of the target of `/proc/[pid]/exe`. On Windows, | ||
can be set to the base name of `GetProcessImageFileNameW`. | ||
examples: ['otelcol'] |
Would it be possible to have an example that's different between `process.name` and `process.executable.name`?
Where would be the best place for that? As part of this description?
Added it as a note on `process.name` in cd4c335
I meant whether we could have a realistic example that's different on Linux in `examples: ['otelcol']`. It's minor.
would it be possible to have an example that's different between process.name and process.executable.name ?
For Linux, one example where the process name isn't changed programmatically is using a symbolic link, for example `ln -s foo bar`. Running `./bar` gives you `bar` from `/proc/<PID>/comm`, but the basename `foo` from `readlink /proc/<PID>/exe`.
A real example for this is `unlz4`, which is a link to the `lz4` executable (on my Debian system). So as you say in your suggestion, `process.name` seems to be more relevant/precise here.
Unfortunately, there are other examples where `process.name` needs to be combined with `process.executable.path`:
UnicodeNameMappingGenerator-16 -> ../lib/llvm-16/bin/UnicodeNameMappingGenerator
UnicodeNameMappingGenerator-17 -> ../lib/llvm-17/bin/UnicodeNameMappingGenerator
`process.name`: UnicodeNameMapp (in both cases, truncated to `TASK_COMM_LEN`-1 = 15 characters)
`process.executable.name`: UnicodeNameMappingGenerator (in both cases)
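The collision comes from truncation: `TASK_COMM_LEN` is 16 in the Linux kernel, including the terminating NUL, so at most 15 bytes of the invoked name survive. A rough Python sketch of that rule follows; `expected_comm` is a hypothetical helper, and it assumes the process has not renamed itself after `exec`.

```python
import os

# Linux kernel constant: size of the comm buffer, including the NUL byte.
TASK_COMM_LEN = 16


def expected_comm(invoked_path: str) -> str:
    """Approximate the comm value for a program started via invoked_path."""
    base = os.path.basename(invoked_path).encode("utf-8")
    # The kernel keeps at most TASK_COMM_LEN - 1 bytes of the base name.
    return base[: TASK_COMM_LEN - 1].decode("utf-8", errors="ignore")
```

Because the `-16` and `-17` suffixes fall beyond the 15-byte limit, both links yield the same value, which is why the path is needed to tell them apart.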
model/process/registry.yaml
Outdated
to the `Name` in `proc/[pid]/status`. On Windows, can be set to the | ||
base name of `GetProcessImageFileNameW`. | ||
to the base name of the target of `/proc/[pid]/exe`. On Windows, | ||
can be set to the base name of `GetProcessImageFileNameW`. |
It seems `process.name` and `process.executable.name` are always the same on Windows. Is that the case?
In the spirit of T-shaped API, do you think it could be one of these Linux-specific things? I.e. `process.linux.exe|executable.name`?
Given this specification, it is the case that `process.name` and `process.executable.name` will be the same on Windows. I dug as far as I could into Win32 and .NET APIs to see if there was ever a way to change a process's name at runtime like in Linux, and I could not find any way to do so. So I think it is fundamentally the case even outside of this specification that process name and executable name will always be the same on Windows.
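As a small illustration of why the two values coincide on Windows under this specification, both would be the base name of the same image path. The sketch below assumes the path has already been obtained (for example from `GetProcessImageFileNameW`, which returns a device-form path); the helper name and the sample path are hypothetical.

```python
import ntpath


def name_from_image_path(image_path: str) -> str:
    """Base name of a Windows image path; ntpath works on any host OS."""
    return ntpath.basename(image_path)


# Hypothetical device-form path, for illustration only.
sample = r"\Device\HarddiskVolume3\Program Files\otelcol\otelcol.exe"
windows_process_name = name_from_image_path(sample)
windows_executable_name = name_from_image_path(sample)
```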
I think there is a good use-case for maintaining `process.executable.name` as a cross-platform name, and that's how it's used in CLI. In this case, it seems the way it's used is for the attribute to be the executable name on both Linux and Windows. On Linux this distinction matters, whereas on Windows it doesn't. However, this ensures that on either platform the `cli.internal` span will always have a correct executable name, and doesn't need to worry about special attributes based on the platform.
Perhaps I could add a note in `process.executable.name`'s description that on Windows it will always be the same, and can be excluded if you're already using `process.name`?
Maybe we can generalize this a bit to be OS-agnostic?
- OTEL is not only about Linux and Windows
- missing features in Windows or other OSes may appear with the next version/upgrade
So, can we just say that the two fields may have the same value?
That is often true on Linux and always true for current Windows versions and below.
It still makes sense for Windows clients to send both fields, otherwise there should be a written hint/rule like "if `process.name` isn't set, fallback to `process.executable.name`" or the other way round.
We use `process.executable.name` in CLI conventions because it was the only one that existed.
The question I feel we should address:
If I have some instrumentation that would benefit from having a process name, but does not need a lot of details ("General Class"), which one of those attributes should I use? I'm going to say `process.name` is the first candidate just because it's shorter and looks very general.
In this sense, I would even suggest changing the `process.name` definition to be the best known name (yes, you can get it using OS APIs, but maybe you have a smart way to generate better process names, or maybe you want to record a friendly process name when self-reporting it from within a process).
Let me try to phrase it (will leave a separate comment with suggestion)
| <a id="process-working-directory" href="#process-working-directory">`process.working_directory`</a> | string | The working directory of the process. | `/root` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | ||
|
||
**[1] `process.args_count`:** This field can be useful for querying or performing bucket analysis on how many arguments were provided to start a process. More arguments may be an indication of suspicious activity. | ||
|
||
**[2] `process.title`:** In many Unix-like systems, process title (proctitle), is the string that represents the name or command line of a running process, displayed by system monitoring tools like ps, top, and htop. | ||
**[2] `process.name`:** The value of this attribute will be equivalent to `process.executable.name` on Windows, but may not be on Linux. On Linux, the process name from `/proc/[pid]/comm` is truncated if its name is longer than `TASK_COMM_LEN`-1, and it can be manually changed by the process itself via [`prctl(2)`](https://man7.org/linux/man-pages/man2/prctl.2.html). On Windows, it won't be necessary to have both `process.name` and `process.executable.name`, but it may be on Linux depending on your use case. |
How should a backend operate if one of the fields is omitted? Take the value from the other field? What if one of the names isn't available for some reason?
What if on Linux both fields have the same value (a common case), is it OK to just send one of the two fields?
Also, is Windows the only OS where both fields are always the same? Can we have a complete list? If not, would it be better not to mention "Windows" here as the only affected OS?
Even if we say something like "In cases where `process.name` and `process.executable.name` are identical, only one of the fields is required.", there is room for interpretation. Why not drop the last sentence, which may lead to confusion?
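To make the fallback question concrete, here is one possible reading of such a rule, written as a hypothetical backend-side helper in Python. Nothing in the conventions defines this behavior; it is only a sketch of the "take the value from the other field" option, resolved at read time without modifying the stored attributes.

```python
from typing import Mapping, Optional


def display_process_name(attributes: Mapping[str, str]) -> Optional[str]:
    """Pick a name to show, preferring process.name when it is present."""
    return attributes.get("process.name") or attributes.get(
        "process.executable.name"
    )
```

Resolving the fallback only when displaying, instead of copying one attribute into the other, leaves it visible whether `process.executable.name` was actually reported.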
On Linux, when using memfd, there can be processes created with no executable file.
So I don't think the backend/SDKs should be copying the fields when missing or omitting an existent field, since it could cause confusion as to whether `process.executable.name` really existed or not.
On the other hand, I agree the fields would commonly be duplicated, so it would be nice to have a way to only send one and reduce data volumes. Some workloads create a very large number of processes, so the data volume involving process telemetry can be very large.
brief: > | ||
The name of the process. On Linux based systems, this SHOULD be set to | ||
the value of `/proc/[pid]/comm` or to the `Name` field in `proc/[pid]/status` | ||
(these values are equivalent). On Windows, this SHOULD be set to the | ||
base name of `GetProcessImageFileNameW`. | ||
note: > | ||
The value of this attribute will be equivalent to `process.executable.name` | ||
on Windows, but may not be on Linux. On Linux, the process name from `/proc/[pid]/comm` | ||
is truncated if its name is longer than `TASK_COMM_LEN`-1, and it can be manually | ||
changed by the process itself via [`prctl(2)`](https://man7.org/linux/man-pages/man2/prctl.2.html). | ||
On Windows, it won't be necessary to have both `process.name` and `process.executable.name`, but | ||
it may be on Linux depending on your use case. |
a suggestion for #1737 (comment)
brief: > | |
The name of the process. On Linux based systems, this SHOULD be set to | |
the value of `/proc/[pid]/comm` or to the `Name` field in `proc/[pid]/status` | |
(these values are equivalent). On Windows, this SHOULD be set to the | |
base name of `GetProcessImageFileNameW`. | |
note: > | |
The value of this attribute will be equivalent to `process.executable.name` | |
on Windows, but may not be on Linux. On Linux, the process name from `/proc/[pid]/comm` | |
is truncated if its name is longer than `TASK_COMM_LEN`-1, and it can be manually | |
changed by the process itself via [`prctl(2)`](https://man7.org/linux/man-pages/man2/prctl.2.html). | |
On Windows, it won't be necessary to have both `process.name` and `process.executable.name`, but | |
it may be on Linux depending on your use case. | |
brief: > | |
The name of the process. | |
note: > | |
The attribute represents the best-known friendly process name. When there is | |
no additional context about the process, the name SHOULD be obtained from OS-specific API. | |
On Linux based systems, this SHOULD be set to | |
the value of `/proc/[pid]/comm` or to the `Name` field in `proc/[pid]/status` | |
(these values are equivalent). On Windows, this SHOULD be set to the | |
base name of `GetProcessImageFileNameW`. | |
On Linux, the process name from `/proc/[pid]/comm` | |
is truncated if its name is longer than `TASK_COMM_LEN`-1, and it can be manually | |
changed by the process itself via [`prctl(2)`](https://man7.org/linux/man-pages/man2/prctl.2.html). | |
The value of the `process.name` frequently matches the value of the | |
`process.executable.name` attribute. Semantic conventions and | |
instrumentation authors that want to capture a general process name | |
SHOULD use `process.name` attribute and MAY also use `process.executable.name` | |
when additional details are important. |
I think we should avoid calling anything the "general process name", or suggesting that it belongs in `process.name`. There are at least three different things that might be considered the process name:
- `process.name`, the name of the actual process / the runnable process struct in the kernel.
- `process.executable.name`, the name of the executable file that was used to create the process.
- `process.title`, the value from `proctitle`, or the human readable name/title.
It's better to ensure the correct field is used for the correct data, and not suggest it's optional which information goes into the different fields. I think part of the confusion before this change is because the differences between these weren't clear.
Which one is the most important might also depend on the use case. Sometimes the executable file name might be considered more important than the process name.
Seeing lots of good points in the discussions but having trouble figuring out where to take this. Here's my attempt to synthesize the current state:
For the most part I agree with every point of feedback I've seen, but unfortunately that means I agree with some points that conflict with each other. So I hate to say it, but I'm kinda stuck. I don't have a good idea where to take these attributes from here. The only thing I'm confident about is that I'm open to suggestions on what to do with this one.
Regarding avoiding duplication, we could specify that if |
I don't have a strong opinion, a few things we could consider to get unstuck:
If we go down this road ("process.name" is the best known process name), we'd be:
Either way, prototyping and final stabilization push are usually good time to clean up descriptions and also if we don't believe that some of it is essential for stability, let's just not add it or let's keep it experimental. |
Fixes #1736
Changes
This PR adds a new attribute `process.name` that uses the description that used to apply to `process.executable.name`. The `process.executable.name` attribute's description is adjusted such that the value of the attribute will reliably contain the executable name.
Merge requirement checklist
[chore]