Handling of cross cluster search errors #163381
Comments
Pinging @elastic/kibana-data-discovery (Team:DataDiscovery)
Handling CCS errors will require removing isErrorResponse, originally added by #78006. In that review there is a conversation noting that the isPartial check was needed to prevent a problem in the Security solution. I am tracking down an answer to see if removing

Also, it's a bug in Elasticsearch that allows Kibana to show partial results when the local cluster has shard errors. I opened elastic/elasticsearch#98725 and this should be resolved in 8.11, so the isPartial check has to be removed from isErrorResponse.
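To make the effect of dropping the isPartial check concrete, here is a minimal sketch. The `SearchResponseLike` interface and both function names are hypothetical and simplified for illustration; they are not Kibana's actual types or the real `isErrorResponse` implementation, and the "with check" version only approximates the behaviour described in the comment above.

```typescript
// Hypothetical, simplified stand-in for a Kibana search response (assumption, not the real type).
interface SearchResponseLike {
  rawResponse?: unknown;
  isRunning?: boolean;
  isPartial?: boolean;
}

// Assumed behaviour with the isPartial check: a response that has finished
// but is still flagged partial is treated as an error.
function isErrorWithPartialCheck(response?: SearchResponseLike): boolean {
  return !response || !response.rawResponse || (!response.isRunning && !!response.isPartial);
}

// Assumed behaviour once the isPartial check is removed: only a missing
// response or a missing raw response counts as an error, so finished partial
// results (for example from a CCS search with one failed cluster) pass through.
function isErrorWithoutPartialCheck(response?: SearchResponseLike): boolean {
  return !response || !response.rawResponse;
}

// A finished response that contains data but is flagged partial.
const finishedPartial: SearchResponseLike = {
  rawResponse: { hits: { hits: [] } },
  isRunning: false,
  isPartial: true,
};
```

Under these assumptions, `finishedPartial` is rejected by the first function and accepted by the second, which is the change needed for partial CCS results to reach the UI as a warning rather than an error.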
…rtial' results (#165512)

Closes #164905

This PR replaces individual shard failure and timeout warnings with a single "incomplete data" warning. This work is required for #163381

<img width="500" alt="Screen Shot 2023-09-06 at 9 35 52 AM" src="https://github.com/elastic/kibana/assets/373691/77e62792-c1f1-4780-b4f2-3aca24e4691b">
<img width="500" alt="Screen Shot 2023-09-06 at 9 36 00 AM" src="https://github.com/elastic/kibana/assets/373691/56f37db1-2b4a-484b-9244-66b352d82dc1">
<img width="500" alt="Screen Shot 2023-09-06 at 9 36 07 AM" src="https://github.com/elastic/kibana/assets/373691/4a777963-6e88-4736-9d63-99a2843ebdbb">

### Test instructions
* Install flights and web logs sample data
* Create data view `kibana_sample_data*`. **Set time field to timestamp**
* Open Discover and select the `kibana_sample_data*` data view
* Add filter with custom DSL

```
{
  "error_query": {
    "indices": [
      {
        "error_type": "exception",
        "message": "local shard failure message 123",
        "name": "kibana_sample_data_logs",
        "shard_ids": [
          0
        ]
      }
    ]
  }
}
```

---------

Co-authored-by: kibanamachine <[email protected]>
Co-authored-by: Julia Rechkunova <[email protected]>
Co-authored-by: Marco Liberati <[email protected]>
Closes #166021
Closes #163381

This PR adds an MVP of the inspector clusters tab.

This PR does not:
1) include all UI elements in the design. These will be added at later dates.
2) show the clusters tab when a request fails. Somewhere between the Kibana server's Elasticsearch request and the client, the raw response is getting removed for failed requests. This will have to be sorted out in a separate PR.
3) support opening the clusters tab from "incomplete data" warnings.

### Test setup
1. Start remote elasticsearch by running: `yarn es snapshot -E transport.port=9500 -E http.port=9201 -E path.data=../remote1`
2. Install sample data to remote cluster
   1. Add `elasticsearch.hosts: ["http://localhost:9201"]` to kibana.dev.yml. **Note** create `config/kibana.dev.yml` if one does not exist. kibana.dev.yml is not managed by git so it has to be created the first time you add values.
   2. Run `yarn start` to start the kibana process
   3. Install sample web logs data set on home page
   4. Install sample flight data set on home page
   5. Stop the kibana process
   6. Remove `elasticsearch.hosts` from kibana.dev.yml
3. Start local elasticsearch by running: `yarn es snapshot -E path.data=../local1`
4. Start kibana
5. Add remote cluster under "Stack management -> Remote clusters"
   1. Set **Name** to "remote1"
   2. Set **Seed nodes** to "localhost:9500"
   3. Enable **Skip if unavailable**
6. Install sample web logs data set
7. Install sample flights data set
8. Create data view.
   1. Set **Index pattern** to `kibana_sample_data*,remote1:kibana_sample_data*`
   2. Set **Time field** to `timestamp`

### Local cluster (status=successful)
1) Open discover
2) Select "Kibana sample data logs" data view
3) Open inspector
4) Open clusters tab

<img width="300" alt="Screen Shot 2023-09-22 at 9 38 38 AM" src="https://github.com/elastic/kibana/assets/373691/e4e91555-8200-43bc-b2fe-7739f7178e43">

### Remote cluster (status=successful)
1) Open discover
2) Select "kibana_sample_data*,remote1:kibana_sample_data*" data view
3) Open inspector
4) Open clusters tab

<img width="300" alt="Screen Shot 2023-09-22 at 9 47 08 AM" src="https://github.com/elastic/kibana/assets/373691/676897fc-e7e2-4c0b-8e35-c382c6ac89d6">

### Remote cluster (status=partial, failed shard)
1) Open discover
2) Select "kibana_sample_data*,remote1:kibana_sample_data*" data view
3) Add filter
```
{
  "error_query": {
    "indices": [
      {
        "error_type": "exception",
        "message": "local shard failure message 123",
        "name": "remote1:kibana_sample_data_logs",
        "shard_ids": [
          0
        ]
      }
    ]
  }
}
```
4) Open inspector
5) Open clusters tab

<img width="300" alt="Screen Shot 2023-09-22 at 9 50 49 AM" src="https://github.com/elastic/kibana/assets/373691/6935f2b4-60ad-4704-8ee0-17890ca9d83a">
<img width="300" alt="Screen Shot 2023-09-22 at 9 51 12 AM" src="https://github.com/elastic/kibana/assets/373691/ec0a6b4a-177f-40fd-96b3-c56102d5f425">

### Remote cluster (status=skipped, all shards fail)
1) Open discover
2) Select "kibana_sample_data*,remote1:kibana_sample_data*" data view
3) Add filter
```
{
  "error_query": {
    "indices": [
      {
        "error_type": "exception",
        "message": "local shard failure message 123",
        "name": "remote1:*",
        "shard_ids": [
          0
        ]
      }
    ]
  }
}
```
4) Open inspector
5) Open clusters tab

<img width="300" alt="Screen Shot 2023-09-22 at 9 52 49 AM" src="https://github.com/elastic/kibana/assets/373691/a1ba947b-3cd1-4416-9756-29f3960a4ba6">

### Remote cluster (status=skipped, no remote)
1) Open discover
2) Kill process running remote1 elasticsearch
3) Select "kibana_sample_data*,remote1:kibana_sample_data*" data view
4) Open inspector
5) Open clusters tab

<img width="300" alt="Screen Shot 2023-09-22 at 9 55 45 AM" src="https://github.com/elastic/kibana/assets/373691/f049b617-96e1-4ecc-bfeb-f75522f70fef">

---------

Co-authored-by: kibanamachine <[email protected]>
With the merge of elastic/elasticsearch#97731, additional error details for CCS searches are now available, and we should surface them to the user. They should be handled similarly to shard warnings, which were recently improved in #161271.
More details about the error states of a CCS search can be found here:
elastic/elasticsearch#97731 (comment)
Generally, unless all clusters fail, we should treat failures on individual clusters as a warning and provide details to the user on demand, as we do for shard errors.
This is what a shard error looks like in Discover:
[Screenshot: shard error warning in Discover]
The wording could be something like:
1 of 3 cross cluster searches failed. The data might be incomplete or wrong. Show details.
When the user clicks on "Show details", as with the shard error, deeper insights on the failures would be provided:
tbc
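The rule proposed above (a warning unless every cluster fails) could be sketched roughly as follows. This is only an illustration: the `ClusterDetailsLike` shape, the status values, and the `summarizeClusters` helper are assumptions made for the example, loosely modelled on the per-cluster details discussed in elastic/elasticsearch#97731, and the sketch counts `skipped` and `failed` clusters as failures while leaving `partial` clusters to the existing shard warnings.

```typescript
// Assumed per-cluster status values (hypothetical for this sketch).
type ClusterStatus = 'running' | 'successful' | 'partial' | 'skipped' | 'failed';

// Hypothetical, simplified per-cluster details keyed by cluster alias.
interface ClusterDetailsLike {
  status: ClusterStatus;
}

interface CcsOutcome {
  level: 'none' | 'warning' | 'error';
  message?: string;
}

// Decide how to surface cluster failures: nothing when no cluster failed,
// an error when every cluster failed, otherwise a warning with a short summary.
function summarizeClusters(details: Record<string, ClusterDetailsLike>): CcsOutcome {
  const statuses = Object.keys(details).map((alias) => details[alias].status);
  const total = statuses.length;
  const failed = statuses.filter((s) => s === 'skipped' || s === 'failed').length;

  if (failed === 0) {
    return { level: 'none' };
  }
  if (failed === total) {
    return { level: 'error', message: `All ${total} cross cluster searches failed.` };
  }
  return {
    level: 'warning',
    message: `${failed} of ${total} cross cluster searches failed. The data might be incomplete or wrong.`,
  };
}

// Example: one of three clusters was skipped.
const outcome = summarizeClusters({
  '(local)': { status: 'successful' },
  remote1: { status: 'skipped' },
  remote2: { status: 'successful' },
});
```

With this input, `outcome.level` is `'warning'` and the message matches the wording suggested above; the "Show details" action would then present the per-cluster failure information.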