Add k8s.container.status.waiting metric to semantic conventions #1672
Comments
@TylerHelmuth / @ChrsMark / @dmitryax would appreciate thoughts / feedback 🙇 |
Thanks for filing this, @povilasv; modeling this as a metric makes sense! There is a proposal on how we should model such "state"/"phase"/"status" metrics in SemConv. It is already in use by
My only question here is what the modeling should actually look like. What you propose aligns with kube-state-metrics, but I would like to see how this will be combined with other statuses like
FWIW, if I'm not mistaken we can't have |
Thanks for the review. I'll try to work on this next week and model this differently. |
I did some experimenting and looked at some of the k8s code (https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/kubelet_pods.go#L2057-L2423). Some examples of the data for different states:
Terminated:
Running:
Waiting:
Terminated is a rather short-lived state; the container quickly moves on to the "Waiting" state. All of these states are mutually exclusive. I think we can model something like this:
Things to consider:
FYI, kube-state-metrics did this:
|
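As a minimal sketch of the mutually exclusive states described in the comment above: each container reports one data point per possible state, with value 1 for the current state and 0 for the others. The attribute name `k8s.container.status.state` and the helper names are hypothetical and used only for illustration; the status is assumed to be a plain dict shaped like a Kubernetes `ContainerStatus`, where at most one of `waiting`, `running`, `terminated` is set.

```python
from typing import Dict, List, Tuple

# The three container states; exactly one applies at any time.
STATES = ("waiting", "running", "terminated")

# Hypothetical attribute name, used here only for illustration.
STATE_ATTR = "k8s.container.status.state"


def current_state(container_status: dict) -> str:
    """Return the single state that is set on a ContainerStatus-like dict."""
    state = container_status.get("state") or {}
    for name in STATES:
        if state.get(name) is not None:
            return name
    # The Kubernetes API treats an unspecified state as waiting.
    return "waiting"


def state_data_points(container_status: dict) -> List[Tuple[Dict[str, str], int]]:
    """Emit one (attributes, value) pair per state: 1 for the current state, 0 otherwise."""
    current = current_state(container_status)
    return [({STATE_ATTR: s}, 1 if s == current else 0) for s in STATES]


if __name__ == "__main__":
    status = {"name": "app", "state": {"waiting": {"reason": "CrashLoopBackOff"}}}
    for attrs, value in state_data_points(status):
        print(attrs, value)
```

Because the states are exclusive, the values for one container always sum to 1, which keeps the number of series per container fixed at three regardless of how often the container changes state.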
Thanks for investigating this, @povilasv! We discussed this in the K8s SemConv WG (Wed Jan 8th), and one note here is that we should probably ensure that those attributes are ready to be re-used in Entities in the future. Based on the above, I think we need to define the attributes explicitly in the attributes registry under a full, meaningful namespace like
As far as having 1 or 2 different metrics is concerned, I don't have a strong preference. Having them under 1 metric is tidier but less flexible in case we want to change only one of them in the future. I'm not sure if this decision actually affects cardinality. @povilasv, is there any reason that you found for why KSM splits the states into different metrics? For example the |
I've looked through PRs and issues and couldn't find why. They've been like that since the beginning. Ref:
I guess there is one problem: the "running" state doesn't have a reason. So we will either have to set the reason to an empty string like
One interesting point about reasons: KSM whitelists only certain reasons to prevent cardinality explosion:
I think maybe we should do the same? Thoughts @ChrsMark ? :) |
Sounds good!
Yep, limiting this with an enum would make sense! |
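The idea of limiting reasons to an enum can be sketched roughly as follows. This is an illustration only, not the list KSM or SemConv actually uses: the allowlist below contains a few well-known Kubernetes waiting reasons chosen as examples, the helper name is hypothetical, and dropping unlisted reasons (rather than mapping them to a catch-all value) is an assumption.

```python
from typing import Optional

# Illustrative allowlist only; the real enum would be defined by the convention.
ALLOWED_WAITING_REASONS = frozenset({
    "ContainerCreating",
    "CrashLoopBackOff",
    "ErrImagePull",
    "ImagePullBackOff",
})


def bounded_waiting_reason(container_status: dict) -> Optional[str]:
    """Return the waiting reason if it is allowlisted, otherwise None.

    Returning None for unlisted or missing reasons keeps the set of
    attribute values finite, so arbitrary strings cannot inflate cardinality.
    """
    waiting = (container_status.get("state") or {}).get("waiting")
    if waiting is None:
        # Running or terminated containers have no waiting reason.
        return None
    reason = waiting.get("reason")
    return reason if reason in ALLOWED_WAITING_REASONS else None


if __name__ == "__main__":
    crash = {"state": {"waiting": {"reason": "CrashLoopBackOff"}}}
    odd = {"state": {"waiting": {"reason": "SomeVendorSpecificReason"}}}
    print(bounded_waiting_reason(crash))
    print(bounded_waiting_reason(odd))
```

With an enum like this, the number of reason values per container is capped by the size of the allowlist, at the cost of hiding reasons that are not on it.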
Area(s)
area:k8s
Is your change request related to a problem? Please describe.
K8s Cluster receiver users would like to monitor the K8s CrashLoopBackOff state. We have had multiple issues about it:
Previously I tried to model status.waiting as a Resource attribute, but because it is mutable it cannot be a Resource attribute. See #997.
I would like to propose adding this as a metric to unblock this PR: open-telemetry/opentelemetry-collector-contrib#35668
Describe the solution you'd like
Describe alternatives you've considered
No response
Additional context
No response