Collect more build metrics #65408
Conversation
packages/kbn-dev-utils/src/ci_stats_reporter/ci_stats_reporter.ts
return {
  name: 'distributable size',
  subName: Path.basename(path),
  value: (await asyncStat(path)).size,
does it collect stats per archive?
Yeah, the basename of each archive is unique and includes things like the os, architecture, and archive format. We currently only create archives for the current platform on CI, but the change in size should be pretty consistent between the distributables for each platform, so I'm not particularly concerned.
I wonder if we should standardize on some formats, à la ECS. For instance, version and platform would be easier to search for in this instance than having to figure out what the filename is. It would also allow us to apply a filter to a dashboard.
I don't think searching by version or platform will be very useful here though. And I'm not opposed to using a standardized format, but ECS doesn't feel appropriate for the intermediate representation we're creating here.
If anything I think we might want our groups to be
${oss ? 'oss ' : ''}distributable size
and our metric ids to be the archive type. Version/branch information is really a property of the job and less a property of the metric.
The way I think of the data we're collecting here is in terms of how it will look as a table, formatted like:

${group}

  id       value           Change from ${mergeBase}
  ${id}    ${valueAsMB}    ${difference}

Once we're done getting tables like that reported on PRs, we will work on putting this data into a format suitable for searching/filtering by summarizing it into a single document and indexing it into kibana-stats. With that it will be easy to build something like a chart of the average "distributable size" broken down by archive type and base branch, and spread out over time. That said, slower and longer-term reporting/trends is something we plan to tackle after we get reporting into each PR.
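(Editor's note: a hedged sketch of what such a summarized document might look like once indexed into kibana-stats. Only the group/id/value structure comes from this thread; the other field names (branch, version, mergeBase) are assumptions based on the later note that the summary document will carry version information.)

// Hypothetical shape of a summarized CI stats document; only group/id/value
// mirror the metrics discussed above, the rest is illustrative.
interface CiStatsSummaryDoc {
  buildId: string;
  branch: string;     // base branch of the job, e.g. "master"
  version: string;    // e.g. "8.0.0-SNAPSHOT"
  mergeBase: string;  // commit the PR values are compared against
  metrics: Array<{
    group: string;    // e.g. "oss distributable size"
    id: string;       // e.g. "tar.gz"
    value: number;    // bytes, for size metrics
  }>;
}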
Implemented alternate metric group/id in 5e122d2
Synced with Spencer on this - the part I was missing was that the summary document will contain the version, etc. 👍
files.map(async path => {
  return {
    name: 'distributable size',
    subName: Path.basename(path),
Does it contain a version number in the path?
It does, example:
{
  "name": "distributable size",
  "subName": "kibana-oss-8.0.0-SNAPSHOT-linux-x86_64.tar.gz",
  "value": 161180435,
  "buildId": "f546075f-1a47-4a31-8e80-28d69e2a8f27"
}
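(Editor's note: for context, a minimal self-contained sketch of how entries like the one above could be produced. asyncStat is assumed to be a promisified fs.stat, and the file list and buildId are placeholder values, not taken from the PR.)

import Fs from 'fs';
import Path from 'path';
import { promisify } from 'util';

// assumed to match the asyncStat helper used in the snippets above
const asyncStat = promisify(Fs.stat);

// placeholder inputs for illustration only
const buildId = 'f546075f-1a47-4a31-8e80-28d69e2a8f27';
const files = ['target/kibana-oss-8.0.0-SNAPSHOT-linux-x86_64.tar.gz'];

async function collectDistributableSizes() {
  return Promise.all(
    files.map(async (path) => ({
      name: 'distributable size',
      // the archive basename encodes os, architecture, and archive format
      subName: Path.basename(path),
      value: (await asyncStat(path)).size,
      buildId,
    }))
  );
}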
@@ -35,7 +39,23 @@ export function reportOptimizerStats(reporter: CiStatsReporter, name: string) {
  }

  if (n.kind === 'C' && lastState) {
-   await reporter.metric('@kbn/optimizer build time', name, lastState.durSec);
+   await reporter.metrics([
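(Editor's note: for illustration, a sketch of the batched call this diff is moving toward. The reporter interface below is assumed from the group/id/value shape used elsewhere in this PR rather than copied from kbn-dev-utils, and the second metric is hypothetical.)

interface CiStatsMetric {
  group: string;
  id: string;
  value: number;
}

interface CiStatsReporterLike {
  metrics(metrics: CiStatsMetric[]): Promise<void>;
}

// Replaces the old one-metric-per-call reporter.metric(...) usage with a single
// batched call; bundleCount is a made-up example of an additional metric.
async function reportOptimizerMetrics(
  reporter: CiStatsReporterLike,
  name: string,
  durSec: number,
  bundleCount: number
) {
  await reporter.metrics([
    { group: '@kbn/optimizer build time', id: name, value: durSec },
    { group: '@kbn/optimizer bundle count', id: name, value: bundleCount },
  ]);
}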
Probably this should be a completely separate step in the build pipeline? In a next step we might want to add the following functionality without cluttering the file:
- download the size stats for the upstream branch
- compare them with the ones for the current PR
- report the diff on the PR
We're planning on doing that either from the pipeline code itself, or from the service, but only once we have some more data to understand what we will report and how. The goal of the current design is to make it easy to report metrics from many locations and summarize them in a final step somewhere else. This is an example of one location where we report metrics, but there will be several (maybe many) others.
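(Editor's note: a rough, purely illustrative sketch of what a separate comparison step could look like. fetchBaselineMetrics and postPrComment are hypothetical helpers standing in for whatever the pipeline or service would actually provide; nothing here is part of this PR.)

interface SizeMetric {
  group: string;
  id: string;
  value: number;
}

// Hypothetical comparison step: fetch baseline sizes for the upstream branch,
// diff them against the current PR's metrics, and post the result as a comment.
async function reportSizeDiff(
  fetchBaselineMetrics: (branch: string) => Promise<SizeMetric[]>,
  postPrComment: (body: string) => Promise<void>,
  currentMetrics: SizeMetric[],
  upstreamBranch: string
) {
  const baseline = await fetchBaselineMetrics(upstreamBranch);

  const lines = currentMetrics.map((m) => {
    const base = baseline.find((b) => b.group === m.group && b.id === m.id);
    const diffText =
      base === undefined
        ? 'no baseline'
        : `${m.value - base.value >= 0 ? '+' : ''}${m.value - base.value} bytes vs ${upstreamBranch}`;
    return `${m.group} / ${m.id}: ${m.value} bytes (${diffText})`;
  });

  await postPrComment(lines.join('\n'));
}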
Pinging @elastic/kibana-operations (Team:Operations)
@elasticmachine merge upstream
@elasticmachine merge upstream
return {
  group: `${build.isOss() ? 'oss ' : ''}distributable size`,
  id: format,
  value: (await asyncStat(path)).size,
In what format & units do we report size 😅? I hope we won't have to convert GB <--> MB in storage.
Ok, it's in bytes: https://nodejs.org/api/fs.html#fs_stats_size
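(Editor's note: since fs.Stats.size is reported in bytes, a display-time conversion along the lines of the ${valueAsMB} placeholder in the table above might look like this; illustrative only.)

// Illustrative only: convert a stored byte count to a MB string for display.
function valueAsMB(bytes: number): string {
  return `${(bytes / (1024 * 1024)).toFixed(1)} MB`;
}

// e.g. valueAsMB(161180435) === "153.7 MB" for the archive size shown earlier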
💚 Build Succeeded
To update your PR or re-run it, just comment with:
This PR implements some metric collection for additional data that will be useful to track over time. So far we have only tracked the build time of each optimizer run, as a test to see how metric collection would impact CI. Since we haven't seen any negative impact we would like to start collecting more metrics that will help us ensure we don't slide back on our efforts to reduce the size of bundles being served.