Update out of order with main #2072
Merged
jesusvazquez
merged 193 commits into
out-of-order
from
jvp/update-out-of-order-with-main
Jun 10, 2022
…o binaries. (#1759) * Extend Dockerfiles to support multiarch builds for all Go binaries. By calling any of make push-multiarch-./cmd/metaconvert/.uptodate make push-multiarch-./cmd/mimir/.uptodate make push-multiarch-./cmd/query-tee/.uptodate make push-multiarch-./cmd/mimir-continuous-test/.uptodate make push-multiarch-./cmd/mimirtool/.uptodate make push-multiarch-./operations/mimir-rules-action/.uptodate Signed-off-by: Peter Štibraný <[email protected]>
* Update to latest dskit and memberlist fork Fixes #1743 Signed-off-by: Nick Pillitteri <[email protected]> * Update changelog Signed-off-by: Nick Pillitteri <[email protected]>
Signed-off-by: Mauro Stettler <[email protected]>
* mimirtool config: Add more retained old defaults The following parameters have their old defaults retained even when `--update-defaults` is used with `mimirtool config convert`: * `activity_tracker.filepath` * `alertmanager.data_dir` * `blocks_storage.filesystem.dir` * `compactor.data_dir` * `ruler.rule_path` * `ruler_storage.filesystem.dir` * `graphite.querier.schemas.backend` (only in GEM) These are filepaths for which the new defaults don't make more sense than the old ones. In fact, updating these can lead to a subpar migration experience because components start using directories that don't exist. Because activity_tracker.filepath changed its name since cortex, the tests needed to allow for differentiating old common options and new ones. This is something that was already there for GEM and was added for cortex/mimir too. Signed-off-by: Dimitar Dimitrov <[email protected]> * Update CHANGELOG.md Signed-off-by: Dimitar Dimitrov <[email protected]>
* dashboards: add flag to skip gateway The gateway component seems to be an enterprise component, so groups that aren't running enterprise shouldn't need the empty panels and rows in their dashboards. This patch adds a flag to drop gateway-related widgets from the mixin dashboards. Signed-off-by: Josh Carp <[email protected]> * Update CHANGELOG.md Co-authored-by: Marco Pracucci <[email protected]>
* Gracefully shutdown querier when using query-scheduler Signed-off-by: Marco Pracucci <[email protected]> * Fixed comment Signed-off-by: Marco Pracucci <[email protected]> * Added TestQueuesOnTerminatingQuerier Signed-off-by: Marco Pracucci <[email protected]> * Commented executionContext Signed-off-by: Marco Pracucci <[email protected]> * Added CHANGELOG entry Signed-off-by: Marco Pracucci <[email protected]> * Update pkg/querier/worker/util.go Co-authored-by: Peter Štibraný <[email protected]> * Fixed typo in suggestion Signed-off-by: Marco Pracucci <[email protected]> * Removed superfluous time sensitive assertion Signed-off-by: Marco Pracucci <[email protected]> * Commented newExecutionContext() Signed-off-by: Marco Pracucci <[email protected]> Co-authored-by: Peter Štibraný <[email protected]>
* Graceful shutdown querier when not using query-scheduler Signed-off-by: Marco Pracucci <[email protected]> * Updated CHANGELOG entry Signed-off-by: Marco Pracucci <[email protected]> * Improved comment Signed-off-by: Marco Pracucci <[email protected]> * Refactoring Signed-off-by: Marco Pracucci <[email protected]>
* Increase mimir-continuous-test query timeout from 30s to 60s Signed-off-by: Marco Pracucci <[email protected]> * Added PR number to CHANGELOG entry Signed-off-by: Marco Pracucci <[email protected]>
* Increased default -tests.run-interval from 1m to 5m Signed-off-by: Marco Pracucci <[email protected]> * Added PR number to CHANGELOG entry Signed-off-by: Marco Pracucci <[email protected]>
* Fix flaky tests on querier graceful shutdown Signed-off-by: Marco Pracucci <[email protected]> * Remove spurious newline Signed-off-by: Marco Pracucci <[email protected]>
* Update build-image to use golang:1.17.8-bullseye, and add skopeo to build image. Skopeo will be used in subsequent PR to push multiarch images. Signed-off-by: Peter Štibraný <[email protected]> * Update build image. Use ubuntu-latest for workflow steps. Signed-off-by: Peter Štibraný <[email protected]>
* Publish multiarch images. Signed-off-by: Peter Štibraný <[email protected]> * Tag with extra tag, if pushing tagged commit or release. Signed-off-by: Peter Štibraný <[email protected]> * Split building of docker images and archiving them into tar. Signed-off-by: Peter Štibraný <[email protected]> * When tagging with test, use --all. Signed-off-by: Peter Štibraný <[email protected]> * Only run deploy step on tags or weekly release branches. Signed-off-by: Peter Štibraný <[email protected]> * Don't tag with test anymore. Signed-off-by: Peter Štibraný <[email protected]> * Address review feedback. Signed-off-by: Peter Štibraný <[email protected]> * Fix license check. Signed-off-by: Peter Štibraný <[email protected]>
When using `K6_HA_REPLICAS > 1`, Mimir accepts all HTTP calls, but some of those calls receive status code `202`. This commit treats that status code as expected; otherwise the user receives the following error:

```
reads_inat write (file:///.../mimir-k6/load-testing-with-k6.js:254:8(137))
reads_inat native executor=ramping-arrival-rate scenario=writing_metrics source=stacktrace
ERRO[0015] GoError: ERR: write failed. Status: 202. Body: replicas did not mach, rejecting sample: replica=replica_1, elected=replica_0
```

At the end of the benchmark, the summary displays errors:

```
✗ write worked
 ↳ 20% — ✓ 23 / ✗ 92
```

Example of load testing:

```shell
./k6 run load-testing-with-k6.js \
  -e K6_SCHEME="https" \
  -e K6_WRITE_HOSTNAME="${mimir}" \
  -e K6_READ_HOSTNAME="${mimir}" \
  -e K6_USERNAME="${user}" \
  -e K6_WRITE_TOKEN="${password}" \
  -e K6_READ_TOKEN="${password}" \
  -e K6_HA_CLUSTERS="1" \
  -e K6_HA_REPLICAS="3" \
  -e K6_DURATION_MIN="5"
```

Signed-off-by: Wilfried Roset <[email protected]>
* implement read v2 * updated CHANGELOG.md * extend maxBytesInFrame comment. * addressed PR feedback * addressed PR feedback * addressed PR feedback * use indexed xor chunk function to assert stream remote read tests * updated CHANGELOG.md Co-authored-by: Miguel Ángel Ortuño <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
…e is 0 (#1783) Signed-off-by: Marco Pracucci <[email protected]>
* Print version+arch of Mimir loaded to Docker. Signed-off-by: Peter Štibraný <[email protected]> * Use debug log for distributor. Signed-off-by: Peter Štibraný <[email protected]>
…ortex_distributor_ingester_query_failures_total (#1797) * Remove unused metrics cortex_distributor_ingester_queries_total and cortex_distributor_ingester_query_failures_total Signed-off-by: Marco Pracucci <[email protected]> * Remove unused fields Signed-off-by: Marco Pracucci <[email protected]>
* Added options support to SendSumOfCountersPerUser() Signed-off-by: Marco Pracucci <[email protected]> * Renamed SkipZeroValueMetrics() to WithSkipZeroValueMetrics() Signed-off-by: Marco Pracucci <[email protected]>
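The options support mentioned above is commonly implemented in Go with the functional-options pattern. The following is a minimal sketch of that pattern, not the actual Mimir code: `sumPerUser`, `SendOption`, and `sendOptions` are hypothetical names chosen for illustration, and only `WithSkipZeroValueMetrics` mirrors a name from the commit message.

```go
package main

import "fmt"

// sendOptions collects optional behaviour (hypothetical struct for this sketch).
type sendOptions struct {
	skipZeroValueMetrics bool
}

// SendOption mutates sendOptions (hypothetical type for this sketch).
type SendOption func(*sendOptions)

// WithSkipZeroValueMetrics asks the helper to omit users whose sum is zero.
// The name follows the commit message; the body is an assumption.
func WithSkipZeroValueMetrics() SendOption {
	return func(o *sendOptions) { o.skipZeroValueMetrics = true }
}

// sumPerUser sums counter values per user and applies the given options.
// It is a simplified stand-in for a helper like SendSumOfCountersPerUser().
func sumPerUser(perUser map[string][]float64, opts ...SendOption) map[string]float64 {
	var o sendOptions
	for _, opt := range opts {
		opt(&o)
	}

	out := make(map[string]float64, len(perUser))
	for user, values := range perUser {
		sum := 0.0
		for _, v := range values {
			sum += v
		}
		if o.skipZeroValueMetrics && sum == 0 {
			continue // drop users that would only emit a zero-valued series
		}
		out[user] = sum
	}
	return out
}

func main() {
	data := map[string][]float64{"tenant-a": {1, 2}, "tenant-b": {0, 0}}
	fmt.Println(len(sumPerUser(data)))                             // both tenants kept
	fmt.Println(len(sumPerUser(data, WithSkipZeroValueMetrics()))) // zero-valued tenant dropped
}
```

Keeping the option as a variadic argument means existing callers compile unchanged, which fits a rename-only follow-up commit like the one listed here.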
… to let people install both while migrating from Cortex to Mimir (#1801) Signed-off-by: Marco Pracucci <[email protected]>
…1808) Signed-off-by: Marco Pracucci <[email protected]>
Allow customizing mimir cli flags per zone for the store gateway. Copied the same solution as we have for ingesters. Signed-off-by: György Krajcsovits <[email protected]>
…n the ring (#1806) * Add protection to store-gateway to not drop all blocks if unhealthy in the ring Signed-off-by: Marco Pracucci <[email protected]> * Added CHANGELOG entry Signed-off-by: Marco Pracucci <[email protected]> * Update CHANGELOG.md Co-authored-by: Peter Štibraný <[email protected]> Co-authored-by: Peter Štibraný <[email protected]>
…tor_ingester_append_failures_total unused metrics (#1799) Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
* Extract and test TracerTransport functionality We need to use a TracerTransport in mimir-continuous-test. We have that in the frontend package, but I don't want to import frontend from the mimir-continuous-test, so we extract it to util/instrumentation. Signed-off-by: Oleg Zaytsev <[email protected]> * Set up global tracer in mimir-continuous-test Signed-off-by: Oleg Zaytsev <[email protected]> * Add tracing to the client and spans to the tests Signed-off-by: Oleg Zaytsev <[email protected]> * Add jaeger-mixin to mimir-continuous test container Signed-off-by: Oleg Zaytsev <[email protected]> * make license Signed-off-by: Oleg Zaytsev <[email protected]> * Add traces to the write path Signed-off-by: Oleg Zaytsev <[email protected]> * Update CHANGELOG.md Signed-off-by: Oleg Zaytsev <[email protected]>
* Removed unused Info() and advLabelSets from BucketStore Signed-off-by: Marco Pracucci <[email protected]> * Removed unused FilterConfig from BucketStore Signed-off-by: Marco Pracucci <[email protected]> * Removed unused relabelConfig from store-gateway tests Signed-off-by: Marco Pracucci <[email protected]> * Removed unused function expectedTouchedBlockOps() Signed-off-by: Marco Pracucci <[email protected]> * Removed unused recorder from BucketStore tests Signed-off-by: Marco Pracucci <[email protected]> * go mod vendor Signed-off-by: Marco Pracucci <[email protected]>
* Upgrade alpine to 3.16.0 * Enhance MimirRequestLatency runbook with more advice (#1967) * Enhance MimirRequestLatency runbook with more advice Signed-off-by: Arve Knudsen <[email protected]> Co-authored-by: Marco Pracucci <[email protected]> * Include helm-docs in build and CI (#2026) * Update the mimir build image and its build doc Dockerfile: Add helm-docs package to the image. how-to: Write down the requirements for build in more detail. Add information about build on linux. Signed-off-by: György Krajcsovits <[email protected]> * Expand make doc with helm-docs command This enables generating the helm chart README with the same make doc command as all other documentation. Signed-off-by: György Krajcsovits <[email protected]> * Update docs/internal/how-to-update-the-build-image.md Co-authored-by: Dimitar Dimitrov <[email protected]> * Update contributing guides for the helm chart (#2008) * Update contributing guides for the helm chart Signed-off-by: György Krajcsovits <[email protected]> * Turn off helm version increment check in CI This enables periodic releases, as opposed to requiring version bump for release at every PR. Signed-off-by: György Krajcsovits <[email protected]> * Add extraEnvFrom to all services and enable injection into mimir config (#2017) Add `extraEnvFrom` capability to all Mimir services to enable injecting secrets via environment variables. Enable `-config.expand-env=true` option in all Mimir services to be able to take secrets/settings from the environment and inject them into the Mimir configuration file.
Signed-off-by: György Krajcsovits <[email protected]> * Docs: fix mimir-mixin installation instructions (#2015) Signed-off-by: Marco Pracucci <[email protected]> * Docs: make documentation a first class citizen in CHANGELOG (#2025) Signed-off-by: Marco Pracucci <[email protected]> * upgrade to alpine 3.16.0 * upgrade alpine to 3.16.0 Co-authored-by: Arve Knudsen <[email protected]> Co-authored-by: Marco Pracucci <[email protected]> Co-authored-by: George Krajcsovits <[email protected]> Co-authored-by: Dimitar Dimitrov <[email protected]>
This should be automated, but for now it is done manually. Signed-off-by: György Krajcsovits <[email protected]>
The default value of 200ms, shared with all other memcached caches, is too aggressive in most cases. This results in TSDB data often being fetched from object storage in cases where a slightly longer timeout would result in a cache hit. This is set in Jsonnet and Helm instead of as a default of the CLI flag since the flags (and hence their defaults) are shared among all caches (index, chunks, metadata, results). Signed-off-by: Nick Pillitteri <[email protected]>
* Add test-enterprise-values.yaml
…ry sharding is enabled (#2036) Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
* Renamed newDiscoveryService() to newMimirDiscoveryService() Signed-off-by: Marco Pracucci <[email protected]> * Added newMimirPdb() utility Signed-off-by: Marco Pracucci <[email protected]> * Added newMimirStatefulSet() utility Signed-off-by: Marco Pracucci <[email protected]>
* Helm: Add golden-record build script * Helm: add test-values golden record * Add PR check for `check-helm-tests` * Add Helm setup to lint-helm action * Update generated helm tests * Fix bash linting * Update contribution guidelines * Update generated helm manifests * Helm: fix kube version Set kubeVersionOverride to generate PodDisruptionBudget API version consistently. When I ran the test, I got a diff, because my k8s is newer (1.23). Signed-off-by: György Krajcsovits <[email protected]> * Update operations/helm/tests/build.sh Co-authored-by: Dimitar Dimitrov <[email protected]> Co-authored-by: György Krajcsovits <[email protected]>
… indexheader reads. (#2019) Introduces a new experimental configuration option (`-blocks-storage.bucket-store.index-header.map-populate-enabled`). This enables the use of the `MAP_POPULATE` flag when `mmap`-ing index-header files in the store-gateway. This flag advises the kernel to (synchronously) pre-fault all pages in the memory region, loading them into the file system cache.

Why is this a good idea?
- The initial read process of the index-header files has been shown to cause hangups in the store-gateway.
- By using this option, I/O is done in the mmap() syscall, which the Go scheduler can cope with.
- We reduce the likelihood of goroutines getting stalled in major page faults.
- The initial read process walks the entire file anyway, so we are not doing any more I/O.
- It's a very low-risk change compared to rewriting the BinaryReader (work in progress).

Why is this not perfect?
- The kernel does not guarantee the pages will stay in memory, so we are only reducing the probability of major page faults.

Rationale about the implementation:
- I have copied the mmap utilities from Prometheus as a temporary measure, for the sake of evaluating this change.
Signed-off-by: Peter Štibraný <[email protected]>
* Update Prometheus with async chunk mapper changes. Included changes: grafana/mimir-prometheus#131 grafana/mimir-prometheus#247 These changes result in lower memory usage by the chunk mapper. Signed-off-by: Peter Štibraný <[email protected]>
* Fix ruler config in getting started guide Signed-off-by: Marco Pracucci <[email protected]> * Added CHANGELOG entry Signed-off-by: Marco Pracucci <[email protected]>
A previous change (#2019) assumed MAP_POPULATE was available on Darwin. This fixes the build.
…#1949) * mixin: adapt alerts/playbooks to take into consideration ruler query path components. Signed-off-by: Miguel Ángel Ortuño <[email protected]> * applied PR suggestion Signed-off-by: Miguel Ángel Ortuño <[email protected]> * applied PR suggestion Signed-off-by: Miguel Ángel Ortuño <[email protected]> * restored ruler missed evaluations alert Signed-off-by: Miguel Ángel Ortuño <[email protected]> * updated CHANGELOG.md Signed-off-by: Miguel Ángel Ortuño <[email protected]>
* Return and log detailed services information on /ready This helps debug starting services more easily. Signed-off-by: Dimitar Dimitrov <[email protected]> * Only return non-running services Signed-off-by: Dimitar Dimitrov <[email protected]>
…2009) * add validation.RateLimited to error catalogue * Add validation.TooManyHAClusters to error catalogue * update docs * Apply suggestions from code review Co-authored-by: Marco Pracucci <[email protected]> * improve new MessageWithLimitConfig and add tests * Apply suggestions from code review Co-authored-by: Marco Pracucci <[email protected]> * Update from changes in code review Co-authored-by: Marco Pracucci <[email protected]>
* Add Patrick Oyarzun as Team Member * Update MAINTAINERS.md
* mimir-continuous-test: Add smoke test mode * Add PR number to CHANGELOG * Update error assertions in write_read_series_test * Fix doc formatting * Address PR feedback * Fix goimports formatting
Signed-off-by: Marco Pracucci <[email protected]>
* Make MessageWithLimitConfig accept multiple flags * Add tenant string in per-tenant error labels * Revert "Add tenant string in per-tenant error labels" This reverts commit 758ef72. * rename error too-many-ha-clusters
* ruler: report failed eval on any 5xx status Signed-off-by: Miguel Ángel Ortuño <[email protected]> * addressed PR suggestion Signed-off-by: Miguel Ángel Ortuño <[email protected]>
The OOO implementation changed the ChunkReader interface. Mimir imports Thanos, and the changes to that interface cause issues there, so we had to fork Thanos to perform the interface change. We'll try to upstream this soon so that we don't need to do this in the future.
c5cc520 to a2bb750
mimir/out-of-order was based on a commit from April 26. This PR updates it to the latest main commit.