refactor search source warnings to return a single warning for 'is_partial' results #164905

Closed
Tracked by #164350
nreese opened this issue Aug 25, 2023 · 3 comments · Fixed by #165512
Labels
* discuss
* impact:needs-assessment: Product and/or Engineering needs to evaluate the impact of the change.
* loe:needs-research: This issue requires some research before it can be worked on or estimated
* Team:DataDiscovery: Discover, search (e.g. data plugin and KQL), data views, saved searches. For ES|QL, use Team:ES|QL.
* Team:SharedUX: Team label for AppEx-SharedUX (formerly Global Experience)
* technical debt: Improvement of the software architecture and operational architecture

Comments

@nreese
Contributor

nreese commented Aug 25, 2023

extractWarnings creates an array of warnings from an Elasticsearch _search response. The current implementation creates one warning per shard error.

This implementation creates the following problems:

  1. shard warning contains multiple levels of information
    a) shard warning contains state about an instance of one shard failure. This is ok.
    b) shard warning message field contains messaging about all shard failures. For example "{shardsFailed} of {shardsTotal} shards failed". This is a code smell: it signals that the data model is not a good fit for the use case.
  2. consumers must de-duplicate warnings. Having a shard warning per shard event requires the consumer to de-duplicate warnings to avoid spamming users with lots of events for the same thing. Notice how in the screenshots below, Lens and Discover use different de-duplicating algorithms and present a different number of warnings to users
    a) lens
    Screen Shot 2023-08-25 at 5 26 20 PM
    b) discover
    Screen Shot 2023-08-25 at 5 23 50 PM
  3. Warning messaging leads with implementation details, "2 of 3 shards failed", instead of leading with the core concern for users, "The data might be incomplete or wrong"
  4. current solution does not scale well when adding more reasons for "incomplete data". For example, data can be incomplete when remote clusters are skipped.
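
To make problems 1 and 2 concrete, here is a minimal sketch of a per-shard warning model. All type and function names are hypothetical and chosen for illustration; they are not Kibana's actual types, and the two de-duplication keys are assumptions, not the algorithms Lens or Discover really use.

```typescript
// Hypothetical shapes for illustration; these are not Kibana's actual types.
interface ShardFailureReason {
  type: string;
  reason: string;
}

interface PerShardWarning {
  // Aggregate text about ALL failures, repeated on every warning (problem 1b).
  message: string;
  // State about this one shard failure (problem 1a).
  reason: ShardFailureReason;
}

// One warning per shard failure, mirroring the model described above.
function perShardWarnings(
  shardsTotal: number,
  failures: ShardFailureReason[]
): PerShardWarning[] {
  const message = `${failures.length} of ${shardsTotal} shards failed`;
  return failures.map((reason) => ({ message, reason }));
}

// Generic de-duplication that each consumer ends up writing for itself.
function dedupeBy(
  warnings: PerShardWarning[],
  key: (warning: PerShardWarning) => string
): PerShardWarning[] {
  const seen = new Set<string>();
  return warnings.filter((warning) => {
    const k = key(warning);
    if (seen.has(k)) return false;
    seen.add(k);
    return true;
  });
}

const warnings = perShardWarnings(3, [
  { type: 'exception', reason: 'local shard failure message 123' },
  { type: 'exception', reason: 'local shard failure message 123' },
  { type: 'illegal_argument_exception', reason: 'some other failure' },
]);

// Two consumers, two keys, two different warning counts for the same response.
const byMessage = dedupeBy(warnings, (w) => w.message);
const byReasonType = dedupeBy(warnings, (w) => w.reason.type);
console.log(byMessage.length, byReasonType.length);
```

Because the aggregate message is identical on every warning, a consumer keyed on the message collapses everything into one warning, while a consumer keyed on the failure type keeps one per distinct type. Neither choice is wrong, which is why the counts drift apart between applications.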

I propose a different data model. Instead of producing a warning per shard error, produce a single warning for "incomplete data". This single warning can contain details of shard failures, shard timeouts, and skipped clusters. On the UI side, the user will see a single warning saying "The data might be incomplete or wrong". As a secondary part, the warning details will show "{shardsFailed} of {shardsTotal} shards failed", "{shardsTimedout} of {shardsTotal} shards timed out", and "{clustersFailed} of {clustersTotal} clusters failed", allowing users to explore the detailed reasons why the data is incomplete.
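
A rough sketch of what this single-warning model could look like follows. The names `SearchCounts`, `IncompleteDataWarning`, and `toIncompleteDataWarning` are hypothetical, and the input is assumed to be already-aggregated counts rather than a raw _search response; this is an illustration of the proposal, not the implementation that landed.

```typescript
// Hypothetical shapes and names; a sketch of the proposal only.
interface SearchCounts {
  shardsTotal: number;
  shardsFailed: number;
  shardsTimedOut: number;
  clustersTotal: number;
  clustersFailed: number;
}

interface IncompleteDataWarning {
  type: 'incomplete';
  // Leads with the user's core concern.
  message: string;
  // Secondary detail lines, one per reason the data is incomplete.
  details: string[];
}

// Returns a single warning, or undefined when nothing is incomplete.
function toIncompleteDataWarning(
  counts: SearchCounts
): IncompleteDataWarning | undefined {
  const details: string[] = [];
  if (counts.shardsFailed > 0) {
    details.push(`${counts.shardsFailed} of ${counts.shardsTotal} shards failed`);
  }
  if (counts.shardsTimedOut > 0) {
    details.push(`${counts.shardsTimedOut} of ${counts.shardsTotal} shards timed out`);
  }
  if (counts.clustersFailed > 0) {
    details.push(`${counts.clustersFailed} of ${counts.clustersTotal} clusters failed`);
  }
  if (details.length === 0) return undefined;
  return {
    type: 'incomplete',
    message: 'The data might be incomplete or wrong',
    details,
  };
}

const partial = toIncompleteDataWarning({
  shardsTotal: 3,
  shardsFailed: 2,
  shardsTimedOut: 0,
  clustersTotal: 2,
  clustersFailed: 1,
});

const complete = toIncompleteDataWarning({
  shardsTotal: 3,
  shardsFailed: 0,
  shardsTimedOut: 0,
  clustersTotal: 1,
  clustersFailed: 0,
});
```

With this shape, consumers never de-duplicate: there is at most one warning per response, and adding a new reason for incomplete data means adding one more detail line rather than a new warning type.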

@nreese nreese added the bug Fixes for quality problems that affect the customer experience label Aug 25, 2023
@botelastic botelastic bot added the needs-team Issues missing a team label label Aug 25, 2023
@nreese nreese added the Team:DataDiscovery Discover, search (e.g. data plugin and KQL), data views, saved searches. For ES|QL, use Team:ES|QL. label Aug 25, 2023
@elasticmachine
Contributor

Pinging @elastic/kibana-data-discovery (Team:DataDiscovery)

@botelastic botelastic bot removed the needs-team Issues missing a team label label Aug 25, 2023
@nreese nreese added discuss technical debt Improvement of the software architecture and operational architecture needs-team Issues missing a team label and removed bug Fixes for quality problems that affect the customer experience labels Aug 25, 2023
@botelastic botelastic bot removed the needs-team Issues missing a team label label Aug 25, 2023
@nreese nreese added the Team:SharedUX Team label for AppEx-SharedUX (formerly Global Experience) label Aug 25, 2023
@elasticmachine
Contributor

Pinging @elastic/appex-sharedux (Team:SharedUX)

@mattkime
Contributor

Certainly makes sense to me. I'm curious what my teammates think of this. Might be a good topic for next Tuesday's team meeting.

@nreese nreese self-assigned this Sep 6, 2023
@davismcphee davismcphee added impact:needs-assessment Product and/or Engineering needs to evaluate the impact of the change. loe:needs-research This issue requires some research before it can be worked on or estimated labels Sep 8, 2023
nreese added a commit that referenced this issue Sep 14, 2023
…rtial' results (#165512)

Closes #164905

This PR replaces individual shard failure and timeout warnings with a
single "incomplete data" warning. This work is required for
#163381

<img width="500" alt="Screen Shot 2023-09-06 at 9 35 52 AM"
src="https://github.com/elastic/kibana/assets/373691/77e62792-c1f1-4780-b4f2-3aca24e4691b">

<img width="500" alt="Screen Shot 2023-09-06 at 9 36 00 AM"
src="https://github.com/elastic/kibana/assets/373691/56f37db1-2b4a-484b-9244-66b352d82dc1">

<img width="500" alt="Screen Shot 2023-09-06 at 9 36 07 AM"
src="https://github.com/elastic/kibana/assets/373691/4a777963-6e88-4736-9d63-99a2843ebdbb">

### Test instructions
* Install flights and web logs sample data
* Create data view kibana_sample_data*. **Set time field to timestamp**
* Open Discover and select the kibana_sample_data* data view
* Add filter with custom DSL
    ```
    {
      "error_query": {
        "indices": [
          {
            "error_type": "exception",
            "message": "local shard failure message 123",
            "name": "kibana_sample_data_logs",
            "shard_ids": [
              0
            ]
          }
        ]
      }
    }
    ```

---------

Co-authored-by: kibanamachine <[email protected]>
Co-authored-by: Julia Rechkunova <[email protected]>
Co-authored-by: Marco Liberati <[email protected]>