Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unset and empty environment variables are equivalent #2045

Merged
merged 16 commits into from
Nov 4, 2021

Conversation

pellared
Copy link
Member

@pellared pellared commented Oct 20, 2021

Changes

Describe how an empty value of an environment variable should be interpreted.

Related PR: #2052

Why

The problem is that not all languages/OSes/shells are handling an empty value in the same way.

In .NET (C# etc) setting an empty value unsets the environmental variable (sic!). See: https://docs.microsoft.com/en-us/dotnet/api/system.environment.setenvironmentvariable?view=net-5.0

On Windows setting an empty env var is VERY tricky. I suspect 95% of Windows IT users may have real problems doing it. You cannot do it in the command prompt. not via PowerShell, not via GUI. It is possible via Windows Registry or WinAPI, but it is VERY inconvenient and hacky.

Also from a pure usability point of view, echo $SOMEVAR does not allow to distinguish between empty and unset.

Having a different interpretation of an empty and unset environment variable can be seen as a kind of "billion-dollar mistake" 😉

Reference: https://unix.stackexchange.com/questions/27708/is-there-a-difference-between-setting-an-environment-variable-to-the-empty-strin

TODOs after this PR

@pellared pellared requested review from a team October 20, 2021 14:49
Copy link
Member

@tigrannajaryan tigrannajaryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before we accept this change the entire list of existing env variables must be scanned to verify if any of these variables have an expectation that the empty variable can have a particular meaning compared to the unset value.
An example of such env variable was the exporter compression setting, which previously could be set to empty string to mean "no compression" and which we updated to have "none" to represent the "no compression" meaning instead.
There may be other env variables which may exhibit the same behavior and in that case this case actually is a breaking change which requires more though.
@pellared will you be able to do that verification and confirm that we are not breaking anything?

P.S. I had a quick look and OTEL_SERVICE_NAME for example is unclear. IIRC if OTEL_SERVICE_NAME unset the SDK set the service.name to "unknown". So if you wanted an empty service.name you could do that by defining OTEL_SERVICE_NAME="". After this change it is no longer possible.
It is likely not important for "service.name" since I don't think an empty name is a useful valid value for it, but there may be other such variables where it actually is important to be able to set an empty string as a value because the default is not empty.

@pellared
Copy link
Member Author

pellared commented Oct 20, 2021

@tigrannajaryan Thanks for your feedback

Before we accept this change the entire list of existing env variables must be scanned to verify if any of these variables have an expectation that the empty variable can have a particular meaning compared to the unset value.

That is one of the main reasons why I propose the SHOULD to allow exceptions and rather use the spec as a recommendation.

As an example OTEL_SERVICE_NAME="" I am not aware of any SDK that would set this with an empty name. At least Go and .NET SDK will treat it in the same way as unset and use unknown value.

❗ Take also into consideration that for example on Windows it is (almost?) NOT possible to set an empty env var. Not in the command prompt. not via PowerShell, not via GUI. Maybe it is possible via Windows Registry or WinAPI, but I am not sure and it would be VERY inconvenient.

I can make a follow-up PR proposing MUST and if it would be merged then I can create issues in all SDKs. Or I can leave the SIG to handle it. Whatever works for you 😉

@tigrannajaryan
Copy link
Member

As an example OTEL_SERVICE_NAME="" I am not aware of any SDK that would make set this as an empty name. At least Go and .NET SDK will treat it in the same way as unset.

Fair enough. I didn't check the actual behavior but it did appear like it could matter. I want to make sure there are no other env variables where it does actually matter. I suppose it is unlikely that any SDK implementations make a difference between unset and empty value but it is certainly technically possible, so better to double check.

Take also into consideration that for example on Windows it is (almost?) NOT possible to set an empty env var. Not in the command prompt. not via PowerShell, not via GUI. Maybe it is possible via Windows Registry or WinAPI, but I am not sure and it would be VERY inconvenient.

I think this confirms one more time that we need to revise all env variables to make sure there are none where being able to set an empty value is an important use case and then we should accept this PR.

@pellared
Copy link
Member Author

@tigrannajaryan

I think this confirms one more time that we need to revise all env variables to make sure there are none where being able to set an empty value is an important use case and then we should accept this PR.

The only issue I found was OTEL_PROPAGATORS which I addressed in this PR. But it would be great if someone carefully double-checks it.

@pellared
Copy link
Member Author

@owais
Copy link
Contributor

owais commented Oct 20, 2021

Thanks for the mention @pellared. Python SDK implements Attribute Limits and it treats empty but set values as infinity (does not fallback to default). This is what the spec recommends if I understand it correctly.

https://github.com/open-telemetry/opentelemetry-specification/blob/613bf2c11babef66a17fafbd51e65ae3f8174758/specification/sdk-environment-variables.md#attribute-limits

@owais
Copy link
Contributor

owais commented Oct 20, 2021

Luckily neither of the the two fields specify a default so if we make this change, we can also update attribute spec to set the default for these to "infinity". That way it shouldn't be a breaking change?

@pellared
Copy link
Member Author

pellared commented Oct 21, 2021

@owais
I think that the spec is not clear here.

This is how I understand it: the default value of *_LENGTH_LIMIT is empty which is interpreted as infinite.

Maybe we should put - there instead of nothing and add some legend that - means empty or unset value?

Did ☝️ this here: dd5d00d

The other option is to put -1 as default and interpret it as infinite as well. E.g. in .NET SDK the timeouts set to -1 are interpreted as infinite. We could say in the "numeric type" that -1 should be interpreted as infinite if applicable. However I do not think it is good to start with it

@owais
Copy link
Contributor

owais commented Oct 21, 2021

For example, OTEL_SPAN_EVENT_COUNT_LIMIT has a default of 128. How do you set it to infinity according to current spec? Reading everything on that page leads me to believe the spec expects users to set the value to an empty string in order to set it to infinity. That is how Python implements it today. If env is not set, it takes default value, if it is set to empty string, it becomes infinity.

I agree that instead of empty value we should use something like none or unset. Just wanted to point out current use cases and implementations.

@pellared
Copy link
Member Author

pellared commented Oct 21, 2021

For example, OTEL_SPAN_EVENT_COUNT_LIMIT has a default of 128. How do you set it to infinity according to current spec? Reading everything on that page leads me to believe the spec expects users to set the value to an empty string in order to set it to infinity. That is how Python implements it today. If env is not set, it takes default value, if it is set to empty string, it becomes infinity.

I agree that instead of empty value we should use something like none or unset. Just wanted to point out current use cases and implementations.

-1 is an allowed value for a numeric type and can be used to set infinity.

Trying to address it here: ce2381c

@mateuszrzeszutek
Copy link
Member

As an example OTEL_SERVICE_NAME="" I am not aware of any SDK that would set this with an empty name. At least Go and .NET SDK will treat it in the same way as unset and use unknown value.

The Java SDK allows setting an empty name: https://github.com/open-telemetry/opentelemetry-java/blob/main/sdk-extensions/autoconfigure/src/main/java/io/opentelemetry/sdk/autoconfigure/EnvironmentResource.java#L43
We only check for the null value (which means there was no configuration at all), empty string is treated as a valid service name.
I think that probably most (all?) of the string-typed properties used by the SDK/agent treat null and empty values differently.

@pellared
Copy link
Member Author

@mateuszrzeszutek Do you think that changing this behavior in Java SDK/AutoInstr could be dangerous or even seen as something unwelcome?

@mateuszrzeszutek
Copy link
Member

@mateuszrzeszutek Do you think that changing this behavior in Java SDK/AutoInstr could be dangerous or even seen as something unwelcome?

Truth be told, I have no idea. I doubt that anybody's willingly using "" as a valid value for anything.
Also CC @anuraaga @jkwatson

@anuraaga
Copy link
Contributor

I think it is possible for variables to exist where a value is an empty string. I think the env var spec behaves similarly to URL query parameters for example, where a name without a value means something.

If we deny empty variables from having a value I think it's basically ok. But the corner case to consistent is map values like otel.resource.attributes, where it seems like animal=cat,human= should result in empty string for human which would probably be treated as true.

I suspect implementations have treated this per-key so far and probably can without any significant UX issues. UX issues aren't a big deal anymore anyways since OTel engineering values settled on not including the user.

@pellared
Copy link
Member Author

The problem is that not all languages/OSes/shells are handling an empty value in the same way.

In .NET (C# etc) setting an empty value unsets the environmental variable (sic!). See: https://docs.microsoft.com/en-us/dotnet/api/system.environment.setenvironmentvariable?view=net-5.0

On Windows setting an empty env var is VERY tricky. I suspect 95% of Windows IT users may have real problems doing it.

For me having a different interpretation of an empty and unset environment variable is a kind of "billion-dollar mistake" 😉

I would be happy if the specs would at least encourage us to interpret empty and unset environment variables in the same way. However, maybe it makes even sense to require it?

I am mostly worried about a scenario where setting an empty value may be an important scenario. E.g. OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT= in the case of Python SDK.

@yurishkuro
Copy link
Member

The problem is that not all languages/OSes/shells are handling an empty value in the same way.

+1. Also from a pure usability point of view, echo $SOMEVAR does not allow to distinguish between empty and unset. IMO it's best to avoid unset != blank. Overriding defaults can be done with concrete values (the "infinity" arguments don't make much sense to me - when do we deal with infinity? There's always some limit somewhere).

@tigrannajaryan
Copy link
Member

The problem is that not all languages/OSes/shells are handling an empty value in the same way.

+1. Also from a pure usability point of view, echo $SOMEVAR does not allow to distinguish between empty and unset. IMO it's best to avoid unset != blank. Overriding defaults can be done with concrete values (the "infinity" arguments don't make much sense to me - when do we deal with infinity? There's always some limit somewhere).

I agree. We need to make sure there are no env variables where there is an expectation that unset and set to empty trigger a different behavior. It means that any variable where an empty string is a meaningful choice MUST also have as a default value an empty string.

I wonder what do we do if we discover such env vars which violate this rule and are already declared Stable. OTEL_SERVICE_NAME was the only one that I found which appears to violate the rule because the SDKs set it to "unknown" by default. However this is not currently defined by the spec, SDKs do that at their own risk, so perhaps we eliminate the "unknown" by default behavior and do not consider it a breaking change. If OTEL_SERVICE_NAME is unspecified then the service name will be empty. I don't know how much of a problem that is though.

@pellared pellared requested a review from iNikem October 24, 2021 08:33
@carlosalberto
Copy link
Contributor

We have enough approvals but I'd like to double verify that at least @open-telemetry/java-maintainers and @open-telemetry/javascript-maintainers are fine with this ;)

@jack-berg
Copy link
Member

Not a java maintainer, but for the most part, this is how the java sdk works today. The DefaultConfigProperties class is the type safe accessor for all env var configuration, and almost all the types treat empty values as null, which is the same as not set.

One exception is the getString() method. But looking at its usage, it should be fine to check for empty and return null to make it aligned with the other accessors.

Copy link
Member

@obecny obecny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine for js

@tigrannajaryan
Copy link
Member

Please also rebase from main to make ready for merging.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.