Download the geoip databases only when needed #92335
Conversation
Hi @masseyke, I've created a changelog YAML for you.
Hi @masseyke, I've updated the changelog YAML for you.
…b.com:masseyke/elasticsearch into fix/download-geoip-databases-only-when-needed
@elasticmachine update branch
@elasticmachine update branch
Pinging @elastic/es-data-management (Team:Data Management)
@elasticmachine update branch
First pass: I think we've identified a couple of potential leaks and some unlikely-but-possible concurrency problems pertaining to the cluster service usage. Posting the review comments for now to get the ideas in writing. Will continue to review.
@@ -125,6 +125,8 @@ protected GeoIpDownloader createTask(
        parentTaskId,
        headers
    );
    clusterService.addListener(downloader);
I think we should update the GeoIpDownloader to remove itself from the cluster listeners when it is cancelled.
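A minimal sketch of that suggestion, assuming the downloader keeps a reference to the ClusterService it registered with (the exact override in the PR may differ):

```java
// Sketch only: un-register this task from cluster state updates once it is
// cancelled, so a stale GeoIpDownloader instance can no longer react to
// cluster state changes.
@Override
protected void onCancelled() {
    clusterService.removeListener(this);
    // ...the existing cancellation handling would remain here...
}
```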
clusterService.getClusterSettings().addSettingsUpdateConsumer(EAGER_DOWNLOAD_SETTING, this::setEagerDownload);
clusterService.getClusterSettings().addSettingsUpdateConsumer(POLL_INTERVAL_SETTING, this::setPollInterval);
We talked about this outside of the PR messaging, but we both think this could lead to some messy leaks.
Every time this task is allocated to a node, these settings listeners are installed again. There is no way to remove them, and it doesn't look like they correctly check whether the task has been cancelled before rescheduling the background logic when they change. We should definitely protect against multiple downloaders running when updating either of these settings.
We could add a check to the update methods to make sure the task is active, but that still leaves the possibility of a number of leaked task instances in the cluster service on any nodes that have previously cancelled them. We agreed that it would be better to add a settings container of some sort that keeps a reference to the active downloader task on a node and calls it during updates if it exists. The settings container is lightweight and can stick around without much overhead since there will only ever be one of them, while the downloaders can come and go without leaking multiple update consumers.
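For illustration, a rough sketch of that settings-container idea (the class name and wiring are hypothetical, not the code that ultimately landed; `AtomicReference`, `ClusterService`, and the two setting constants are assumed to be imported):

```java
// Hypothetical "settings container": one long-lived object per node registers
// the update consumers exactly once and forwards changes to whichever
// downloader task is currently active, so tasks can come and go without
// leaking consumers.
class GeoIpDownloaderSettingsContainer {
    private final AtomicReference<GeoIpDownloader> active = new AtomicReference<>();

    GeoIpDownloaderSettingsContainer(ClusterService clusterService) {
        clusterService.getClusterSettings().addSettingsUpdateConsumer(EAGER_DOWNLOAD_SETTING, eager -> {
            GeoIpDownloader task = active.get();
            if (task != null && task.isCancelled() == false) {
                task.setEagerDownload(eager);
            }
        });
        // POLL_INTERVAL_SETTING would be wired up the same way
    }

    void setActive(GeoIpDownloader task) {
        active.set(task);
    }
}
```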
The summary of another offline conversation:
It's just a bad idea to have these relatively-short-lived objects like GeoIpDownloader be settings or cluster state listeners because:
- You have to be careful to un-listen when you're done so that you don't wind up with leaks
- You have increased risk of race conditions since the condition you are listening for could happen in between the time the object is created and the time it is registered as a listener (not to mention when its predecessor is un-registered).
So I've moved all listeners for settings and cluster state changes into the executor. The task gets the current values for these things from the executor when it needs them, and the executor handles telling the task when it needs to reschedule because of a change in a dynamic property.
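A sketch of what that arrangement could look like (any field or method names not visible in the diffs above, such as `currentTask` and `requestReschedule()`, are assumptions; `ClusterService`, `TimeValue`, and the setting constants are assumed to be imported):

```java
// Illustrative only: the long-lived executor is the sole settings listener; it
// caches the dynamic values, and the task reads them from the executor on
// demand. The executor nudges the active task to reschedule when a value
// changes.
class GeoIpDownloaderTaskExecutor {
    private volatile boolean eagerDownload;
    private volatile TimeValue pollInterval;
    private volatile GeoIpDownloader currentTask; // set when the task is created or cancelled on this node

    void registerSettingsListeners(ClusterService clusterService) {
        clusterService.getClusterSettings().addSettingsUpdateConsumer(EAGER_DOWNLOAD_SETTING, this::setEagerDownload);
        clusterService.getClusterSettings().addSettingsUpdateConsumer(POLL_INTERVAL_SETTING, this::setPollInterval);
    }

    private void setEagerDownload(boolean eagerDownload) {
        this.eagerDownload = eagerDownload;
        GeoIpDownloader task = currentTask;
        if (task != null) {
            task.requestReschedule(); // let the task re-plan its next run with the new value
        }
    }

    private void setPollInterval(TimeValue pollInterval) {
        this.pollInterval = pollInterval;
        GeoIpDownloader task = currentTask;
        if (task != null) {
            task.requestReschedule();
        }
    }

    // the task calls these instead of registering its own update consumers
    boolean eagerDownload() {
        return eagerDownload;
    }

    TimeValue pollInterval() {
        return pollInterval;
    }
}
```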
Co-authored-by: James Baiera <[email protected]>
…avoid race conditions
@elasticmachine update branch
Everything is looking really great, thanks for iterating on the listener registration woes. We sync'd up offline, but there's definitely an issue with the processor detection logic. I think once that is squared away we're pretty much there!
@@ -85,8 +85,9 @@ public class IngestGeoIpPlugin extends Plugin implements IngestPlugin, SystemInd
    public List<Setting<?>> getSettings() {
        return Arrays.asList(
            CACHE_SIZE,
            GeoIpDownloaderTaskExecutor.EAGER_DOWNLOAD_SETTING,
Nit: Can we tidy these up so that they are all ordered?
Sure. Done.
}

if (event.metadataChanged() && event.changedCustomMetadataSet().contains(IngestMetadata.TYPE)) {
    boolean newAtLeastOneGeoipProcessor = hasAtLeastOneGeoipProcessor(event.state());
Hmm, it feels a little weird that we potentially calculate this a second time when bootstrapping. Probably not a performance bug, but it does look a little funny, especially since we calculate it with a different state call here than earlier in the same invocation.
I don't think it will usually happen twice, will it? It happens on bootstrap, and then happens below only if the cluster state change includes an ingest metadata change (so only if someone modifies a pipeline definition), right?
Ah yes, I hadn't thought about that. Looks fine to me then.
} else {
    stopTask(() -> clusterService.addListener(this));
if (taskIsBootstrapped.getAndSet(true) == false) {
    this.atLeastOneGeoipProcessor = hasAtLeastOneGeoipProcessor(clusterService.state());
Assuming that the comment about running this line twice on bootstrap isn't a big problem - should this be using `event.state()` instead of `clusterService.state()` for consistency? I don't think anything breaks as is, but my gut leans toward consistency.
Yeah might as well be consistent. Done.
}

if (event.metadataChanged() && event.changedCustomMetadataSet().contains(IngestMetadata.TYPE)) {
Nice!
Map<String, Object> pipelineMap = pipelineDefinition.getConfigAsMap();
List<Map<String, Object>> processors = (List<Map<String, Object>>) pipelineMap.get(Pipeline.PROCESSORS_KEY);
if (processors != null) {
    return processors.stream().anyMatch(processor -> processor.containsKey(GeoIpProcessor.TYPE));
Just realized that this might technically be too simple of a check. Annoyingly, processors can supply `on_failure` fields that contain processors, which can supply `on_failure` fields of their own, and I don't think we have any restriction on depth nor processor type allowed. The `foreach` processor also has a nested processor. Not sure if there are any others off the top of my head.
Wow good catch. I just changed this check to be recursive for those two things. I don't know of any others either.
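For illustration, a recursive variant along those lines (a sketch, not the exact code from the PR; it assumes the same nested-map pipeline configuration as the snippet above, with `java.util.Map` and `java.util.List` imported in the surrounding class):

```java
// Sketch of a recursive check: besides a top-level geoip processor, also
// descend into `on_failure` handlers (which any processor may carry) and the
// `foreach` processor's single nested processor definition.
@SuppressWarnings("unchecked")
private static boolean hasGeoipProcessor(Map<String, Object> processor) {
    if (processor.containsKey(GeoIpProcessor.TYPE)) {
        return true;
    }
    for (Object value : processor.values()) {
        if (value instanceof Map<?, ?> config) {
            Object onFailure = config.get("on_failure");
            if (onFailure instanceof List<?> onFailureProcessors) {
                for (Object p : onFailureProcessors) {
                    if (p instanceof Map && hasGeoipProcessor((Map<String, Object>) p)) {
                        return true;
                    }
                }
            }
            Object nested = config.get("processor"); // the foreach processor's nested processor
            if (nested instanceof Map && hasGeoipProcessor((Map<String, Object>) nested)) {
                return true;
            }
        }
    }
    return false;
}
```

The stream check from the earlier snippet would then call a helper like this for each top-level processor instead of checking `containsKey(GeoIpProcessor.TYPE)` directly.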
Tests are looking pretty good though! Just some additional corrections for ya!
LGTM!
Thanks again for iterating on this! The changes look really great!
This commit changes the geoip downloader so that we only download the geoip databases if you have at least one geoip processor in your cluster, or when you add a new geoip processor (or if `ingest.geoip.downloader.eager.download` is explicitly set to true).
We currently download the geoip databases (used by the geoip processor) whether you need them or not. This PR changes it so that we only download them if you have at least one geoip processor in your cluster, or when you add a new geoip processor.
Closes #90673