Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add stats endpoint to GeoIpDownloader #70282

Merged
merged 32 commits into from
Mar 23, 2021
Merged

Conversation

probakowski
Copy link
Contributor

This change adds _geoip/stats endpoint that can be used to collect basic data about geoip downloader (successful, failed and skipped downloads, current db count and total time spent downloading).
It also fixes missing/wrong origins for clients that will break if used with security.

Relates to #68920

@probakowski probakowski requested a review from martijnvg March 11, 2021 00:26
Copy link
Member

@martijnvg martijnvg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the information reported in GeoIpDownloaderStats looks good.

However I think it be also great if this api also returns information about each DatabaseRegistry running on each ingest node. For example report about the available lazy loaders and perhaps report what files are in the geoip tmp directories. I think this can make debugging easier. Wdyt?

If we would do this then this does mean that the transport action would need to extend something TransportNodesAction class (which would help sending an internal request to each ingest node). And as part of this api we would also make a call to list tasks action in order to get stats about the geoip downloader.

original-brownbear and others added 14 commits March 11, 2021 12:43
The wait for green check times out in this test because the datastream
that is backing the SLM history store is running into a 60s delayed allocation
when the test stops a data node that holds some of its shards.
Fixed by turning off the delayed allocation in this test.
It appears this only affects this one test because it's the only test that
creates a policy and executes it before stopping a node

closes elastic#70185
This PR uses longs to compute the number of tiles touching a bounding box
The node attribute transform.node has been introduced in elastic#52712 (7.7) to implement dedicated
transform nodes. A technical limitation in the stack made it necessary to use this workaround.
This limitation has been removed by elastic#54998 (7.9). It is now possible to find out, if a node is a
transform node without using the node attribute. This PR deprecates the node attribute and bases
the placement decision on the node role. The attribute gets unused for placement, however we can
not remove the node attribute yet, because in a mixed version cluster the attribute might still
be used, especially when master gets updated last. The node attribute will be completely removed
in 8, as the upgrade path requires to upgrade to 7.last before upgrading to 8.x.
The query cache stats can currently report negative memory usage, which breaks REST responses for indices stats, node stats, etc, see elastic#55434 as one such example.

The reason why negative memory usage is reported is because of the following trappy calculation:

https://github.com/elastic/elasticsearch/blob/1de0b616ebb92de6060510e92269ef87fc6a8649/server/src/main/java/org/elasticsearch/indices/IndicesQueryCache.java#L98-L101

- weight can be `Double.POSITIVE_INFINITY` when `totalSize == 0` and `stats.size() == 0`
- if we then also have `sharedRamBytesUsed > 0`, then `additionalRamBytesUsed` will be `Long.MAX_VALUE`
- if you then sum up multiple of these `Long.MAX_VALUE` values (one for each shard), this leads to an overflow, resulting in negative numbers as seen in elastic#55434.

The reason sharedRamBytesUsed can be > 0 when there are no shard stats at all is because of the memory occupied by `LRUQueryCache.uniqueQueries`, where the lifetime can extend even closing of shards (where they are removed from shardStats).

Note that a workaround to the above bug is to [clear the cache](https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-clearcache.html), as it makes `sharedRamBytesUsed == 0`, see https://github.com/elastic/elasticsearch/blob/1de0b616ebb92de6060510e92269ef87fc6a8649/server/src/main/java/org/elasticsearch/indices/IndicesQueryCache.java#L168-L174

Closes elastic#55434
This PR moves field related rollup metadata from the index metadata to the field mapping metadata.

- date_histogram fields is moved to the timestamp field 
     (fields are fixed_interval or calendar_interval, time_zone)
- histogram fields are moved to the numeric field on which the the histogram is computed 
     (the field is named interval)

Also, the index uuid has been added to the index rollup settings as index.rollup.source.uuid 
and index.rollup.source.name

The rest of the RollupMetadata has been removed and no rollup metadata exists in the global 
cluster state.

Co-authored-by: Tal Levy <[email protected]>
…x template api (elastic#70094)

Add support to delete composable index templates api to specify multiple template
names separated by a comma.

Change to cleanup template logic for rest tests to remove all composable index templates via a single delete composable index template request. This to optimize the cleanup logic. After each rest test we delete all templates. So deleting templates this via a single api call (and thus single cluster state update) saves a lot of time considering the number of rest tests.

If this pr is accepted then I will do the same change for the delete component template api.

Relates to elastic#69973
This fixes the case where a valid composable template overrides/shadows a
legacy template that defines the rollover alias. This is a valid configuration
and rollover should not fail.
We used to treat setting size to -1 in search request bodies or as a rest
parameter as a no-op, using the default search size of 10 in this case. This
lenient behaviour was deprecated in elastic#69548 and is removed with this PR in 8.0.

Relates to elastic#69548
Copy link
Member

@martijnvg martijnvg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I think this looks good. Let's add tests and turn this into a regular pr?

@probakowski probakowski marked this pull request as ready for review March 18, 2021 12:24
@probakowski probakowski added :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP v7.13.0 v8.0.0 labels Mar 18, 2021
@elasticmachine elasticmachine added the Team:Data Management Meta label for data/management team label Mar 18, 2021
@probakowski
Copy link
Contributor Author

@elasticmachine update branch

@probakowski
Copy link
Contributor Author

@elasticmachine run elasticsearch-ci/2

1 similar comment
@probakowski
Copy link
Contributor Author

@elasticmachine run elasticsearch-ci/2

Copy link
Member

@martijnvg martijnvg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@probakowski probakowski mentioned this pull request Mar 23, 2021
15 tasks
@probakowski
Copy link
Contributor Author

@elasticmachine update branch

@probakowski probakowski merged commit f5b7aad into elastic:master Mar 23, 2021
@probakowski probakowski deleted the geoip-stats branch March 23, 2021 13:34
probakowski added a commit to probakowski/elasticsearch that referenced this pull request Mar 23, 2021
This change adds _geoip/stats endpoint that can be used to collect basic data about geoip downloader (successful, failed and skipped downloads, current db count and total time spent downloading).
It also fixes missing/wrong origins for clients that will break if used with security.

Relates to elastic#68920
probakowski added a commit that referenced this pull request Mar 23, 2021
* Add stats endpoint to GeoIpDownloader (#70282)

This change adds _geoip/stats endpoint that can be used to collect basic data about geoip downloader (successful, failed and skipped downloads, current db count and total time spent downloading).
It also fixes missing/wrong origins for clients that will break if used with security.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP >enhancement Team:Data Management Meta label for data/management team v7.13.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants