Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow excluding a cluster from a CCS search using multi-target syntax #97865

Merged
merged 5 commits into from
Jul 31, 2023

Conversation

quux00
Copy link
Contributor

@quux00 quux00 commented Jul 21, 2023

The multi-target syntax for indices allows excluding an index with the "-" sign. Public docs: https://www.elastic.co/guide/en/elasticsearch/reference/current/api-conventions.html#api-multi-index

This commit expands that functionality to index expressions that include a cluster alias
for purposes of excluding an entire cluster from a cross-cluster search.

For example:
POST logs*,*:logs*,-remote4:*,-remote1*:*/_async_search
would result in searching all remote clusters except for remote4 and remote1, remote11, remote12, remote13, etc..

A singleton wildcard is required in the index position of the cluster:index index expression, to specify that we are excluding the entire cluster. This is useful when a cluster is down or slow during CCS searches.

Excluding a subset of indexes on a remote cluster is not supported in this commit. For example, this will throw an error:
POST logs*,*:logs*,-remote4:logs*/_async_search

@quux00 quux00 added >enhancement :Search/Search Search-related issues that do not fall into other categories Team:Search Meta label for search team v8.10.0 labels Jul 21, 2023
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-search (Team:Search)

@elasticsearchmachine
Copy link
Collaborator

Hi @quux00, I've created a changelog YAML for you.

@quux00 quux00 marked this pull request as draft July 21, 2023 16:39
@quux00
Copy link
Contributor Author

quux00 commented Jul 21, 2023

One issue I have found is that if you negate all the remote indices you specified, rather than searching nothing or returning an error, it searches the local indexes, even when the local cluster was explicitly not included.

Example
Query:

GET remote*:*,-remote*:*/_search?ccs_minimize_roundtrips=false  // NOTE: MRT=true has the same result
{
  "size": 0,
  "aggs": {
    "indexgroup": {
      "terms": {
        "field": "_index"
      }
    }
  }
}

returns a response that shows all local indexes were searched:

{
  "took": 19,
  "timed_out": false,
  "_shards": {
    "total": 7,
    "successful": 7,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1204,
      "relation": "eq"
    },
    "max_score": null,
    "hits": []
  },
  "aggregations": {
    "indexgroup": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "blogs",
          "doc_count": 1200
        },
        {
          "key": "myidx",
          "doc_count": 1
        },
        {
          "key": "test1",
          "doc_count": 1
        },
        {
          "key": "test2",
          "doc_count": 1
        },
        {
          "key": "test3",
          "doc_count": 1
        }
      ]
    }
  }
}

This is a quite surprising and unexpected result, which I assumed was a bug but upon reading the code it seems to be intentional, given this code in RemoteClusterService

    public Map<String, OriginalIndices> groupIndices(IndicesOptions indicesOptions, String[] indices) {
        final Map<String, OriginalIndices> originalIndicesMap = new HashMap<>();
        final Map<String, List<String>> groupedIndices = groupClusterIndices(getRemoteClusterNames(), indices);
        if (groupedIndices.isEmpty()) {
            // search on _all in the local cluster if neither local indices nor remote indices were specified
            originalIndicesMap.put(LOCAL_CLUSTER_GROUP_KEY, new OriginalIndices(Strings.EMPTY_ARRAY, indicesOptions));
        } else {
            for (Map.Entry<String, List<String>> entry : groupedIndices.entrySet()) {
                String clusterAlias = entry.getKey();
                List<String> originalIndices = entry.getValue();
                originalIndicesMap.put(clusterAlias, new OriginalIndices(originalIndices.toArray(new String[0]), indicesOptions));
            }
        }
        return originalIndicesMap;
    }

In this case groupedIndices is returned as empty because the remote clusters index expressions cancelled each other out. Why would we want to search local indexes when the user explicitly did not include the local cluster in this cross-cluster search?

That code was added in this PR #33899, which states:

When executing a cross-cluster search, we need to search against all local indices (and no remote indices) in case no indices are specified.

Is that still true? If yes, why? And does that apply to a situation where the user (likely by mistake) both included and then excluded the same remote clusters? Should we instead throw an error in that case?

quux00 added 3 commits July 26, 2023 08:40
The multi-target syntax for indices allows excluding an index with the "-" sign.
Public docs: https://www.elastic.co/guide/en/elasticsearch/reference/current/api-conventions.html#api-multi-index

This commit expands that functionality to index expressions that include a cluster alias.

For example:
POST logs*,*:logs*,-remote4:*,-remote1*:*/_async_search

Would result in search all remote clusters except for remote4 and remote1, remote11, remote12, remote13, etc..

A singleton wildcard is required in the index position of the `cluster:index`,
to specify that we are excluding the entire cluster. This is useful when a cluster
is down or slow during CCS searches.

Excluding a subset of indexes on a remote cluster is not supported in this commit.
For example, this will throw an error:
POST logs*,*:logs*,-remote4:logs*/_async_search
@quux00
Copy link
Contributor Author

quux00 commented Jul 26, 2023

I have modified RemoteClusterAware.groupClusterIndices to check whether an exclusion list excluded all clusters and if so, throw an error.

Example:

GET remote*:*,-remote*:*/_search
{
  "size": 2,
  "query": {
    "wildcard": {
      "url": "/blog/f*"
    }
  },
  "aggs": {
    "indexgroup": {
      "terms": {
        "field": "_index"
      }
    }
  }
}
{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "The '-' exclusions in the index expression list excludes all indexes. Nothing to search. Input: remote*:*,-remote*:*"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "The '-' exclusions in the index expression list excludes all indexes. Nothing to search. Input: remote*:*,-remote*:*"
  },
  "status": 400
}

but this will not throw an exception because the local indices were not excluded:

GET *,remote*:*,-remote*:*/_search

Note, however, that my code change only handles remote clusters - this will not throw an exception:

GET blogs*,-blogs*,remote1:*,-remote1:*/_search?ccs_minimize_roundtrips=false

but will instead return empty search results:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 0,
    "successful": 0,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 0,
      "relation": "eq"
    },
    "max_score": 0,
    "hits": []
  }
}

The local index resolution happens in a different place than the remote cluster resolution and is quite intricate and complex, so I decided to accept the difference.

The key issue here is that we need a way to exclude a cluster from a CCS search such that it is never contacted (due to being unavailable for example and causes the entire search request to fail).

@quux00 quux00 force-pushed the ccs/exclude-cluster branch from 007d5d1 to c03d07c Compare July 26, 2023 13:08
`remote*:*,-remote1:*,-remote4:*` will search all clusters with an alias that starts
with "remote" except for "remote1" and "remote4". Note that to exclude a cluster
with this notation you must exclude all of its indexes. Excluding a subset of indexes
on a remote cluster is currently not supported. For example, this will throw an exception:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excluding a subset of indexes is supported but not quite this way? You can do remote1:-index1?

Do we reject remote*:abc,-remote1:abc? I am wondering what makes more sense, and also whether this would be perceived as a limitation from Kibana. I guess more of a convention: "if you want to exclude the cluster, use wildcard for indices".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can do remote1:-index1?

Hmm, I haven't tried that. Good question. I'll look into it.

Do we reject remote*:abc,-remote1:abc? I am wondering what makes more sense

Yes, the current code rejects that.

I decided not to try to tackle that because it is a slippery slope. At this point in the code where it is handled, we don't have any idea what indexes exist on remote clusters. My goal here was to just exclude entire clusters. For your example, it would be fairly straightforward (though more work) to determine that it is a full cluster exclusion. But then a user might expect they can do remote*:abc*,-remote1:abc too. But to truly handle that, we'd have to change the sub-search section of TransportSearchAction to send a different request to remote1 then we send to all the other clusters. And the CCS code pathways are already complicated enough that I didn't think that was worth it.

I guess more of a convention: "if you want to exclude the cluster, use wildcard for indices".

Right, my view is just to have a rule "you can exclude the entire cluster" here by using * in the index position.
So we could allow remote*:abc,-remote1:abc which is a negation of concrete indexes, but then we have to explain to users why any wildcard other than just remote1:* is not allowed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think your thinking is pragmatic, let's keep it this way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can do remote1:-index1?

Thanks for pointing that out. I tested it and it does work in the same way the it does for local indexes. Which has an annoying quirk that you have to use a wildcard in the "inclusion" list to exclude a concrete index.

Example:

This works to exclude blogs:

POST web_traffic,b*,-blogs/_async_search

but this does not:

POST web_traffic,blogs,-blogs/_async_search

// fails with
  "error": {
    "type": "status_exception",
    "reason": "error while executing search",
    "caused_by": {
      "type": "index_not_found_exception",
      "reason": "no such index [-blogs] and if you intended to exclude this index, ensure that you use wildcards that include it before explicitly excluding it",
      "resource.type": "index_or_alias",
      "resource.id": "-blogs",
      "index_uuid": "_na_",
      "index": "-blogs"
    }
  }

It used to just say "no such index [-blogs]" which is very misleading. I recently added the extra note "if you intended to exclude this index, ensure that you use wildcards that include it before explicitly excluding it".

So similarly, for remote clusters, this works:

POST remote*:blog*,remote1:-blogs/_async_search

but this fails with the original misleading error message:

POST remote*:blogs,remote1:-blogs/_async_search

  "error": {
    "type": "status_exception",
    "reason": "error while executing search",
    "caused_by": {
      "type": "index_not_found_exception",
      "reason": "no such index [-blogs]",
      "index_uuid": "_na_",
      "resource.type": "index_or_alias",
      "resource.id": "-blogs",
      "index": "-blogs"
    }
  }

so I'll file a bug to improve that error message.

@quux00 quux00 merged commit 50b1749 into elastic:main Jul 31, 2023
quux00 added a commit to quux00/elasticsearch that referenced this pull request Aug 17, 2023
quux00 added a commit to quux00/elasticsearch that referenced this pull request Aug 17, 2023
quux00 added a commit that referenced this pull request Aug 21, 2023
@quux00
Copy link
Contributor Author

quux00 commented Aug 21, 2023

Release highlight was added to 8.10 branch with PR #98620

nreese added a commit to elastic/kibana that referenced this pull request Aug 31, 2023
elastic/elasticsearch#97865 expands
index-pattern expressions to include a cluster alias for purposes of
excluding an entire cluster from a cross-cluster search. This allows
users to put the minus sign in front of the cluster name
(`-cluster_one:*`). The advantage to this change is that it avoids
sending any network calls to that cluster. Compare this to the existing
syntax for excluding clusters, where the minus sign is in front of the
index name (`cluster_one:-*`). The older syntax has to send the request
to the remote cluster, which if it is down (and skip_unavailable=false),
will cause the search to fail.

This PR updates the docs to reflect the new syntax.

---------

Co-authored-by: Kibana Machine <[email protected]>
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Aug 31, 2023
elastic/elasticsearch#97865 expands
index-pattern expressions to include a cluster alias for purposes of
excluding an entire cluster from a cross-cluster search. This allows
users to put the minus sign in front of the cluster name
(`-cluster_one:*`). The advantage to this change is that it avoids
sending any network calls to that cluster. Compare this to the existing
syntax for excluding clusters, where the minus sign is in front of the
index name (`cluster_one:-*`). The older syntax has to send the request
to the remote cluster, which if it is down (and skip_unavailable=false),
will cause the search to fail.

This PR updates the docs to reflect the new syntax.

---------

Co-authored-by: Kibana Machine <[email protected]>
(cherry picked from commit 4b02740)
kibanamachine added a commit to elastic/kibana that referenced this pull request Aug 31, 2023
# Backport

This will backport the following commits from `main` to `8.10`:
- [update data view docs for excluding cluster
(#164904)](#164904)

<!--- Backport version: 8.9.7 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Nathan
Reese","email":"[email protected]"},"sourceCommit":{"committedDate":"2023-08-31T14:27:03Z","message":"update
data view docs for excluding cluster
(#164904)\n\nhttps://github.com/elastic/elasticsearch/pull/97865
expands\r\nindex-pattern expressions to include a cluster alias for
purposes of\r\nexcluding an entire cluster from a cross-cluster search.
This allows\r\nusers to put the minus sign in front of the cluster
name\r\n(`-cluster_one:*`). The advantage to this change is that it
avoids\r\nsending any network calls to that cluster. Compare this to the
existing\r\nsyntax for excluding clusters, where the minus sign is in
front of the\r\nindex name (`cluster_one:-*`). The older syntax has to
send the request\r\nto the remote cluster, which if it is down (and
skip_unavailable=false),\r\nwill cause the search to fail.\r\n\r\nThis
PR updates the docs to reflect the new
syntax.\r\n\r\n---------\r\n\r\nCo-authored-by: Kibana Machine
<[email protected]>","sha":"4b02740b32542b6e84ee7a8abec9cb355f70db35","branchLabelMapping":{"^v8.11.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["Team:Docs","release_note:skip","v8.10.0","v8.11.0"],"number":164904,"url":"https://github.com/elastic/kibana/pull/164904","mergeCommit":{"message":"update
data view docs for excluding cluster
(#164904)\n\nhttps://github.com/elastic/elasticsearch/pull/97865
expands\r\nindex-pattern expressions to include a cluster alias for
purposes of\r\nexcluding an entire cluster from a cross-cluster search.
This allows\r\nusers to put the minus sign in front of the cluster
name\r\n(`-cluster_one:*`). The advantage to this change is that it
avoids\r\nsending any network calls to that cluster. Compare this to the
existing\r\nsyntax for excluding clusters, where the minus sign is in
front of the\r\nindex name (`cluster_one:-*`). The older syntax has to
send the request\r\nto the remote cluster, which if it is down (and
skip_unavailable=false),\r\nwill cause the search to fail.\r\n\r\nThis
PR updates the docs to reflect the new
syntax.\r\n\r\n---------\r\n\r\nCo-authored-by: Kibana Machine
<[email protected]>","sha":"4b02740b32542b6e84ee7a8abec9cb355f70db35"}},"sourceBranch":"main","suggestedTargetBranches":["8.10"],"targetPullRequestStates":[{"branch":"8.10","label":"v8.10.0","labelRegex":"^v(\\d+).(\\d+).\\d+$","isSourceBranch":false,"state":"NOT_CREATED"},{"branch":"main","label":"v8.11.0","labelRegex":"^v8.11.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/164904","number":164904,"mergeCommit":{"message":"update
data view docs for excluding cluster
(#164904)\n\nhttps://github.com/elastic/elasticsearch/pull/97865
expands\r\nindex-pattern expressions to include a cluster alias for
purposes of\r\nexcluding an entire cluster from a cross-cluster search.
This allows\r\nusers to put the minus sign in front of the cluster
name\r\n(`-cluster_one:*`). The advantage to this change is that it
avoids\r\nsending any network calls to that cluster. Compare this to the
existing\r\nsyntax for excluding clusters, where the minus sign is in
front of the\r\nindex name (`cluster_one:-*`). The older syntax has to
send the request\r\nto the remote cluster, which if it is down (and
skip_unavailable=false),\r\nwill cause the search to fail.\r\n\r\nThis
PR updates the docs to reflect the new
syntax.\r\n\r\n---------\r\n\r\nCo-authored-by: Kibana Machine
<[email protected]>","sha":"4b02740b32542b6e84ee7a8abec9cb355f70db35"}}]}]
BACKPORT-->

Co-authored-by: Nathan Reese <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>enhancement release highlight :Search/Search Search-related issues that do not fall into other categories Team:Search Meta label for search team v8.10.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants