-
Notifications
You must be signed in to change notification settings - Fork 480
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Validate all env. vars. before starting injecting env. vars #1141
Validate all env. vars. before starting injecting env. vars #1141
Conversation
pkg/instrumentation/dotnet_test.go
Outdated
}, | ||
} | ||
|
||
for _, test := range tests { | ||
t.Run(test.name, func(t *testing.T) { | ||
pod := injectDotNetSDK(logr.Discard(), test.DotNet, test.pod, 0) | ||
pod, sdkInjectionSkipped := injectDotNetSDK(logr.Discard(), test.DotNet, test.pod, 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you change the contract to the positive value? sdkInjectionSkipped
-> sdkInjected
. It will requires changes in all places.
BTW current scenario looks like isDisabled=true, which is usually hard to understand and maintain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to resolve.
…tor into prevent-incomplete-auto-instrumentation
pkg/instrumentation/dotnet.go
Outdated
// caller checks if there is at least one container. | ||
container := &pod.Spec.Containers[index] | ||
|
||
// validate container environment variables. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that most of the comments are redundant.
IMO // caller checks if there is at least one container.
it is valid comment, but putting information // validate container environment variables.
in front of validateContainerEnv
does not make sense.
Please check whole PR in this context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to resolve
pkg/instrumentation/dotnet.go
Outdated
// validate container environment variables. | ||
err := validateContainerEnv(container.Env, envDotNetStartupHook, envDotNetAdditionalDeps, envDotNetSharedStore) | ||
if err != nil { | ||
logger.Info("Skipping DotNet SDK injection", "reason:", err.Error(), "container Name", container.Name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is how logr
should be used
logger.Info("Skipping DotNet SDK injection", "reason:", err.Error(), "container Name", container.Name) | |
logger.Info("Skipping DotNet SDK injection", "reason", err.Error(), "container", container.Name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to resolve
pkg/instrumentation/dotnet.go
Outdated
err := validateContainerEnv(container.Env, envDotNetStartupHook, envDotNetAdditionalDeps, envDotNetSharedStore) | ||
if err != nil { | ||
logger.Info("Skipping DotNet SDK injection", "reason:", err.Error(), "container Name", container.Name) | ||
return pod, false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer if we return error
instead of bool
. The caller is already doing some loggging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pellared I didn't get your point (The caller is already doing some logging).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also think that as a rule of thumb the function should either log or return an error. Returning false
looks like returning an error without a description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is more, if injectDotNetSDK
would not log, then the logger
would not be need as an argument and the signature would become:
func injectDotNetSDK(dotNetSpec v1alpha1.DotNet, pod corev1.Pod, index int) (corev1.Pod, error)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we return the error from injectDotNetSDK
and not log in injectDotNetSDK
, then on the caller side, https://github.com/avadhut123pisal/opentelemetry-operator/blob/2a6569c3b168e9f80dda10916d96579bcf4993d5/pkg/instrumentation/sdk.go#L103
should we use that err
value to just handle the condition and to log the message like this ?
pod, err = injectJavaagent(i.logger, otelinst.Spec.Java, pod, index)
if err != nil {
i.logger.Info("Skipping javaagent injection", "reason", err.Error(), "container", pod.Spec.Containers[index].Name)
return pod
}
Because if want to propagate the error further in the call stack then we need to modify the signature of inject function too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is correct
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to resolve
pkg/instrumentation/dotnet.go
Outdated
return pod, false | ||
} | ||
|
||
// inject .Net instrumentation spec env vars. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo:
// inject .Net instrumentation spec env vars. | |
// inject .NET instrumentation spec env vars. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to resolve
} | ||
|
||
func trySetEnvVar(logger logr.Logger, container *corev1.Container, envVarName string, envVarValue string, concatValues bool) bool { | ||
// set env var to the container. | ||
func setDotNetEnvVar(container *corev1.Container, envVarName string, envVarValue string, concatValues bool) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest describing what concatValues
is supposed to offer.
AFAIK it should be set to true
if the env var supports multiple values supported by :
. If it is set to false
, the original container's env var value has priority.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to resolve
if !trySetEnvVar(logger, &container, envDotNetOTelAutoHome, dotNetOTelAutoHomePath, doNotConcatEnvValues) { | ||
return pod | ||
} | ||
setDotNetEnvVar(container, envDotNetOTelAutoHome, dotNetOTelAutoHomePath, doNotConcatEnvValues) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we should additionally validate that the dotNetOTelAutoHomePath
env var was not set in the original container. Otherwise, we cannot auto-instrument the .NET app. If someone set it then it would mean that somebody has already set the .NET AutoInstrumentation in the container.
@Kielek do you agree? I think it would be better addressed in a separate issue/PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. We should address this one in separate PR specific to .Net.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created #1156. Free free to resolve this comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be great to emit an k8s event if the injection fails.
Yes. I will raise separate PR for that, as that would need changes in other places to get the access to the event Recorder. |
@pellared could you please review as well? |
or @Kielek could you please review? |
pkg/instrumentation/sdk.go
Outdated
pod, err = injectJavaagent(otelinst.Spec.Java, pod, index) | ||
if err != nil { | ||
i.logger.Info("Skipping javaagent injection", "reason", err.Error(), "container", pod.Spec.Containers[index].Name) | ||
return pod |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This return
would skip other instrumentations from being processed. I see it as a bug.
PS. It would be good to add a unit test to make sure that such a bug would not be introduced in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there should be only one language instrumentation is required for that particular container
I do not see any docs, code, nor reason that would disallow injecting more language instrumentations. You can have a container that has more than one process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pavolloffay Looking at the current implementation, it seems that using multiple instrumentations for the same container will not work. One reason is the duplicate volume mounts, because we use the same mount path. There might be some others things also that can break in context of init container.
I tried adding the annotations for two different language instrumentations, it failed with the error;
Error creating: Pod "spring-petclinic-5d6d58d9b8-pp268" is invalid: spec.containers[0].volumeMounts[2].mountPath: Invalid value: "/otel-auto-instrumentation": must be unique
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pellared Considering the current implementation (multiple instrumentations for a single pod) I don't think return
statement would cause any issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The goal we had was to support multiple instrumentations for a single pod
@avadhut123pisal I suggest doing the following:
- Change the
inject
function implementation in a way as if support for multiple instrumentations for a single pod is working (e.g. by usingelse
instead ofreturn pod
in case of an error). - Create an issue.
- Document it in
README.md
as a known issue.
@pavolloffay Does it seem reasonable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The goal we had was to support multiple instrumentations for a single pod (e.g. one container java and other python etc.)
@avadhut123pisal feel free to book an issue to resolve this limitation if it is important or open a PR to document this in the readme.
sure !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pellared Considering the current implementation (multiple instrumentations for a single pod) I don't think return statement would cause any issue.
The code is written in an unmaintainable way. The "error handling" suggests that it works only for one instrumentation. The "happy path scenario" suggests that it supports multiple instrumentations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pellared Considering the current implementation (multiple instrumentations for a single pod) I don't think return statement would cause any issue.
The code is written in an unmaintainable way. The "error handling" suggests that it works only for one instrumentation. The "happy path scenario" suggests that it supports multiple instrumentations.
Yeah. I got your point :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
…avadhut123pisal/opentelemetry-operator into prevent-incomplete-auto-instrumentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM (but I was not testing it 😬 )
I am merging this based on the approvals. @avadhut123pisal / @pellared please book the issue to simplify the injection code as discussed before. |
|
…emetry#1141) * skips env var injection and sdk configurations if agent injection is skipped * mutate container at the last of SDK injection step * validate first and then mutate the container with env variables * fixes go lint issues * incorporates review comments * fixes go lint issue * removes return statement in case of failed instrumentation
This PR adds an implementation to validate the environment variables before starting to mutate the actual container. If validation step fails then it skips the next steps related to common environment variables injection and OTEL SDK Configuration.
Closes #1094