Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tool for doing gap analysis of TSDB blocks #8453

Merged
merged 8 commits into from
Aug 27, 2024
Merged

Conversation

aldernero
Copy link
Contributor

@aldernero aldernero commented Jun 20, 2024

What this PR does

This tool analyzes TSDB blocks for gaps in series data. The README has a good description with the syntax and examples. The purpose is to process every sample for a given set of series in a block and check for missing samples based on flexible filters.

Checklist

  • Tests updated.
  • Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • about-versioning.md updated with experimental features.

@dimitarvdimitrov
Copy link
Contributor

can you add some docs and a description to the PR? Other tools have a docs file in docs/internal/tools. It's hard to review as-is.

@aldernero aldernero marked this pull request as ready for review June 25, 2024 03:48
@aldernero aldernero requested a review from a team as a code owner June 25, 2024 03:48
@dimitarvdimitrov dimitarvdimitrov self-requested a review July 23, 2024 08:18
Copy link
Contributor

@dimitarvdimitrov dimitarvdimitrov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall looks good. I'd use PostingsForMatchers and break up analyzeBlockForGaps into multiple functions. I guess we can go around infinitely cleaning this up, but breaking it up should make it easier for someone to dive in two years from today and make a meaningful change without too much hassle.

End: timestamps[i+1],
Intervals: int(missedSamples),
})
seriesStats.MissedSamples += uint64(missedSamples)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we increment missed samples for the series even if the gap itself is uninteresting? Otherwise MissedSamples is just a sum of all the reported missed samples in all gaps

Copy link
Contributor

@dimitarvdimitrov dimitarvdimitrov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, there's one unresolved thread, but I assume you'd prefer to keep MissedSamples with the same value regardless of which gaps are considered interetesting

@aldernero aldernero merged commit e0fa677 into main Aug 27, 2024
29 checks passed
@aldernero aldernero deleted the aldernero/tsdb-gaps branch August 27, 2024 14:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants