Containers telemetry #41747

Merged 11 commits on Jun 25, 2024


@baronfel baronfel commented Jun 22, 2024

Fixes dotnet/sdk-container-builds#539

Adds telemetry for base image inference, container publishing, and a variety of errors that can occur along the process.

@dotnet-issue-labeler dotnet-issue-labeler bot added Area-Containers Related to dotnet SDK containers functionality untriaged Request triage from a team member labels Jun 22, 2024
@baronfel baronfel marked this pull request as ready for review June 23, 2024 19:56

baronfel commented Jun 23, 2024

Ok, I'm pretty happy with this now. We now send telemetry for container publishes for three main scenarios:

  • container inference - to help us diagnose our decisions during inference
  • container publish success - to know more about what kinds of registries and stores the images are being pushed to and pulled from, to help us diagnose and prioritize fixes to those different stores
  • container publish failures - to help bucket the kinds of failures (authentication, logic error, layer digest errors, etc.) so we can triage more effectively and correlate errors with the kind of storage (in case some kinds of storage are more failure-prone than others and might need retries, etc.)
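
To illustrate the failure-bucketing idea in the last bullet, here is a minimal Python sketch. It is not the SDK's implementation: the `FailureKind` names, the `storage` values, and the event dictionaries are all hypothetical, chosen only to show how failures could be counted per kind of storage.

```python
from collections import Counter
from enum import Enum


class FailureKind(Enum):
    # Hypothetical buckets, loosely following the examples in the list above.
    AUTHENTICATION = "authentication"
    LOGIC_ERROR = "logic_error"
    LAYER_DIGEST = "layer_digest"
    UNKNOWN = "unknown"


def bucket_failures(events):
    """Count failure events per (storage kind, failure kind) pair.

    Each event is assumed to be a dict with a "storage" string and a
    "kind" FailureKind. Both keys are illustrative, not real SDK fields.
    """
    return Counter((event["storage"], event["kind"]) for event in events)


# Hypothetical sample data: two authentication failures against a remote
# registry and one layer digest failure against a local daemon.
sample_events = [
    {"storage": "remote_registry", "kind": FailureKind.AUTHENTICATION},
    {"storage": "remote_registry", "kind": FailureKind.AUTHENTICATION},
    {"storage": "local_daemon", "kind": FailureKind.LAYER_DIGEST},
]

counts = bucket_failures(sample_events)
```

Grouping on the pair rather than on the failure kind alone is what would let a query show whether one kind of storage fails more often than another.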

I've verified the data is allowed through the CLI's allowlist mechanisms, but haven't yet been able to verify it showing up in our telemetry systems.

All telemetry adheres to the CLI's existing opt-out mechanisms and we do not log user-provided information - we only log registry, image name, and tag information when we can verify that it's using the mcr.microsoft.com base images. In all other cases that data is null.
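
The rule described above can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the code in this PR: the function name, the property keys, and the exact-match check on the registry host are all hypothetical.

```python
from typing import Dict, Optional

# The only registry whose image details are reported, per the comment above.
MCR_REGISTRY = "mcr.microsoft.com"


def base_image_telemetry(registry: str, image: str, tag: str) -> Dict[str, Optional[str]]:
    """Return telemetry properties for a base image.

    Values are kept only when the registry is mcr.microsoft.com; for any
    other registry every property is None, so user-provided names are
    never sent. The exact-match comparison is an assumption of this sketch.
    """
    if registry.strip().lower() == MCR_REGISTRY:
        return {"registry": registry, "image": image, "tag": tag}
    return {"registry": None, "image": None, "tag": None}


# An MCR base image keeps its details; a private registry is nulled out.
mcr_props = base_image_telemetry("mcr.microsoft.com", "dotnet/aspnet", "8.0")
private_props = base_image_telemetry("registry.example.com", "team/app", "1.2.3")
```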

I'd like to get this in this week, because the 8.0.4xx freeze occurs soon.

@baronfel baronfel requested a review from a team June 23, 2024 20:07
@baronfel baronfel added this to the 8.0.4xx milestone Jun 23, 2024

@nagilson nagilson left a comment

It seems wise to collect this information so we can better understand customer needs. I have mostly questions but it looks good to me otherwise :)

@baronfel

I've verified that the telemetry data is being sent correctly. After merge, GDPR classification will need to occur to get this 'done' done.

@baronfel baronfel merged commit 71ebd47 into dotnet:release/8.0.4xx Jun 25, 2024
17 checks passed
@baronfel baronfel deleted the containers-telemetry branch June 25, 2024 17:17