[Discuss] alerting event log: stand-alone ES index or SavedObjects? #51223
Comments
Pinging @elastic/kibana-stack-services (Team:Stack Services) |
There is another option which we haven't discussed yet, which is essentially what ML is planning to do: treat the event-log entries as "linked" to the alert and action saved-objects. This conceptually enforces the security and spaces constraints of the alert and action saved-objects upon the related event-log entries. If each event-log entry stores a reference to the associated alert or action, then we can create a dedicated API endpoint to return the associated event-log entries. The following is the pseudo-code for what this API endpoint would look like:
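A minimal TypeScript sketch of that idea follows. It is an illustration rather than the original pseudo-code, and every name in it (`ScopedSavedObjectsReader`, `EventLogStore`, `getEventsForSavedObject`) is a hypothetical stand-in, not a Kibana API. The key assumption is that reading the saved object with the user's own credentials fails when the user lacks access, so the read doubles as the authorization check for the linked entries.

```typescript
// Hypothetical types: none of these names come from Kibana's real APIs.
interface SavedObjectRef {
  type: string; // e.g. "alert" or "action"
  id: string;
}

interface EventLogEntry {
  "@timestamp": string;
  message: string;
  savedObject: SavedObjectRef; // the reference stored on every entry
}

// Assumed to reject when the current user cannot read the object in the
// current space, the way a user-scoped saved objects client would.
interface ScopedSavedObjectsReader {
  get(type: string, id: string): Promise<unknown>;
}

// Assumed to query the event-log index with internal (system) credentials.
interface EventLogStore {
  findBySavedObject(ref: SavedObjectRef): Promise<EventLogEntry[]>;
}

async function getEventsForSavedObject(
  savedObjects: ScopedSavedObjectsReader,
  eventLog: EventLogStore,
  ref: SavedObjectRef
): Promise<EventLogEntry[]> {
  // Step 1: authorization by proxy. If this rejects, no entries are returned.
  await savedObjects.get(ref.type, ref.id);
  // Step 2: only now fetch the entries linked to that saved object.
  return eventLog.findBySavedObject(ref);
}
```

The event-log index itself stays inaccessible to end users in this sketch; access is derived entirely from the linked saved object.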
This approach becomes much more complicated if we intend for the user to be able to access event-log entries for all alerts that the user has access to, but as long as we're primarily concerned with the event-log entries for a single alert, it has its benefits. |
We will need to support search. In fact that's the primary use case: "show me what alerts and actions were doing in this time frame". The other use case is adding a log entry. I don't think this story would work out well for search, but maybe I'm missing something.
|
After looking into this, I think the approach @kobelb proposes makes sense from my perspective as well. It also means we don't use saved objects: we keep the log in a standalone index (or indices) and manage it ourselves. I don't think we'll be able to display the entire activity log across all alerts unless the user has access to everything. This problem applies regardless of whether we use saved objects. The reason is that we would need to apply inherited access at query time, and we don't have features today that let us do that in a performant manner at basic+ license tiers (after talking with @kobelb). We will still be able to filter the activity log down to a specific alert and display everything logged for it. For the migrations, I agree that since we're using ECS, we won't have to change field mappings, only add new ones as we log more information. Using templates would apply the new mappings to new indices only, which is fine since those are the indices receiving the updated data. For reference, here is how Task Manager used templates: https://github.com/elastic/kibana/blob/7.2/x-pack/plugins/task_manager/task_store.ts#L119-L216. I want to cc @epixa and @peterschretlen to make sure we're making the right decision here. |
I support that approach. To be frank, I don't see how we could ever possibly scale saved objects to support the amount of data you're talking about here. Even if it were desirable to use them for this, I think we'd need to make a more practical decision anyway. This will mean we have certain limitations when working with the data; it just is what it is. If people are going to use this data in SIEM, Logging, dashboard, canvas, etc., then they're going to need to query directly on the raw data anyway. We don't really get any of the benefits of ECS if we're not treating this stuff like data for these more advanced use cases anyway. |
I think the linking via API covers a lot of the common use cases. Restricting to a single alert or action actually works well with some of @mdefazio's recent work that has the activity showing in a flyout. @pmuellr makes a valid point about search: ideally the activity log view will allow search and filtering. Could document level security be used in that case to enforce space and app access controls? That would restrict a full event log UI to platinum, which I think we can accept if that's the only way to do it ( @alexfrancoeur? ). Users would still be able to fall back on explicitly granting index privileges to the activity log, with the caveat that there are no access controls. |
This is likely a tangential topic, but are there specific aspects of saved-objects which you anticipate not scaling?
I think it's worth exploring. My greatest concern is the level of effort to set this up...
This is a rather new requirement which I've just started hearing discussed more and more... I feel like we're verging on a new feature/requirement here to really give the users what they're asking for. Ideally, we'd allow the users to view this data in Kibana without end-users needing Elasticsearch privileges to the entire data indices, and it'd respect their Kibana privileges. For the time being, we have to choose between using an internal/system index which end-users can't access directly and we apply our own authorization rules; and a data index where end-users can access all of the data. Neither of which is ideal. |
Perhaps we can just treat the "need to search across all events" as a new type of permission required, which wouldn't be needed by most mortal users, but useful for admin-types of people. Or could be a permission that could be applied in cases where the customers don't care about data leaking through space/feature control aspects (probably the case for a lot of users). But still have it "locked down" by default, for those customers that care. |
Also worth mentioning that in the latest version of the event log code, I had added a This seems to jive perfectly with the current thinking here (except in Brandon's comment, the query term would be Since in theory the event log may need to reference saved objects not in |
Consensus seems to be to go with the "linked SO" approach Brandon mentioned in #51223 (comment). The current event log PR #45081 contains some initial support for storing saved object ids in the event log documents, to allow search for specific alerts / actions to get their history. We'll plan on not exposing a general search facility via HTTP, and forcing the use of a saved object id when searching via the plugin-provided event log service. We can expand the search capabilities later, as needed, when we can do it in a secure manner. |
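To make "forcing the use of a saved object id" concrete, here is a hedged sketch of a query builder the event log service might use internally. The function name `buildEventQuery` and the field names `kibana.saved_object.type` and `kibana.saved_object.id` are assumptions for illustration; the actual mappings in the PR may differ. Only `@timestamp` comes from ECS, and the `bool`/`filter`/`term`/`range` clauses are standard Elasticsearch query DSL.

```typescript
// Hypothetical search options: the saved object reference is mandatory,
// the time window is optional.
interface EventSearchOptions {
  savedObjectType: string;
  savedObjectId: string;
  start?: string; // ISO 8601 timestamp, inclusive
  end?: string; // ISO 8601 timestamp, inclusive
}

function buildEventQuery(opts: EventSearchOptions) {
  // Refuse to build an unscoped query: there is no "search everything" path.
  if (!opts.savedObjectType || !opts.savedObjectId) {
    throw new Error("a saved object type and id are required");
  }

  // Assumed field names for the stored saved object reference.
  const filter: object[] = [
    { term: { "kibana.saved_object.type": opts.savedObjectType } },
    { term: { "kibana.saved_object.id": opts.savedObjectId } },
  ];

  if (opts.start || opts.end) {
    const range: { gte?: string; lte?: string } = {};
    if (opts.start) range.gte = opts.start;
    if (opts.end) range.lte = opts.end;
    filter.push({ range: { "@timestamp": range } });
  }

  return {
    query: { bool: { filter } },
    sort: [{ "@timestamp": { order: "desc" } }],
  };
}
```

Because the saved object filter is added unconditionally, callers can narrow a search (by time, for example) but can never widen it beyond a single alert or action.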
For the alerting event log, we're going to need to have a persistent log of alerting and action events, for both UI and general introspection. We've already settled on using ES to maintain this log, where every document will be a log entry, with an ECS-compatible shape.
Now the question is: do we roll our own index in ES, or do we use SavedObjects (SO's) to manage the index?
Some additional context:
The first two notes (space-specific and security) are either already handled by SO's, or are known issues (sub-features to handle the user A/app X, user B/app Y case), so we get that support for free when we use SO's. Big win here, because if we roll our own index, we'll have to replicate all this.
The remaining notes are the problem/risk areas for using SO's.
The current implementation of the event log rolls its own index with ILM support. But let's go with the assumption we want to use SO's, and see what we need to "fix" to make that happen.
current event log ES resource creation
At start up time, the event log code goes through the following process:
The index template is pretty key here, as ILM will be creating new indices per roll-over settings, and the new index settings/mappings/etc will all be coming from there. The alias is also set in the template, and ILM does some management of that as well (dealing with the write_index, etc).
After the start up time processing has completed, operations against the event log will run against the alias - basically appending new log entries (via index) and searching.
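As a rough illustration of the kind of start-up sequence described here, the sketch below follows the commonly documented ILM bootstrap pattern: create a policy, create an index template that points at the policy and the rollover alias, then create the first index with the alias marked as the write index. It is not the event log's actual code. The `EsAdmin` interface is a hypothetical minimal client, and the names `event_log`, `event_log_policy`, the rollover thresholds, and the two mapped fields are placeholders.

```typescript
const ALIAS = "event_log"; // placeholder alias name
const POLICY = "event_log_policy"; // placeholder ILM policy name

// ILM policy body: roll over the write index on size or age (values assumed).
function ilmPolicyBody() {
  return {
    policy: {
      phases: { hot: { actions: { rollover: { max_size: "5gb", max_age: "30d" } } } },
    },
  };
}

// Index template body: every index ILM creates on rollover picks this up.
function indexTemplateBody() {
  return {
    index_patterns: [`${ALIAS}-*`],
    settings: {
      "index.lifecycle.name": POLICY,
      "index.lifecycle.rollover_alias": ALIAS,
    },
    mappings: {
      properties: { "@timestamp": { type: "date" }, message: { type: "text" } },
    },
  };
}

// First index, created once, with the alias flagged as the write index.
function initialIndex() {
  return {
    index: `${ALIAS}-000001`,
    body: { aliases: { [ALIAS]: { is_write_index: true } } },
  };
}

// Hypothetical minimal admin client; a real one would wrap the ES client.
interface EsAdmin {
  putIlmPolicy(name: string, body: object): Promise<void>;
  putTemplate(name: string, body: object): Promise<void>;
  aliasExists(name: string): Promise<boolean>;
  createIndex(index: string, body: object): Promise<void>;
}

async function ensureEventLogResources(es: EsAdmin): Promise<void> {
  await es.putIlmPolicy(POLICY, ilmPolicyBody());
  await es.putTemplate(ALIAS, indexTemplateBody());
  // Only bootstrap the first index if the alias does not exist yet.
  if (!(await es.aliasExists(ALIAS))) {
    const first = initialIndex();
    await es.createIndex(first.index, first.body);
  }
}
```

After this runs, writes and searches go through the alias, and ILM takes care of creating later indices from the template.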
current support for self-named saved object indices
Today you can create your own saved object index, with your own name, via the existing saved object APIs. The event log as a saved object would end up using this mechanism to create a new saved-object index specifically for the event log.
I wrote some code, stolen from the Task Manager code that also creates its own SO index, to see what ends up getting created for such indexes.
Say I've specified the `indexPattern` to use in the `savedObjectSchema` to be `"event_log"` (not the real name!). After startup, there will be an index created named `event_log_1` and an alias set up of `event_log`. So we've already got a problem there - SO suffixing the alias name with a `_1` for migration reasons will be a problem since ILM wants a differently shaped name (`event_log_0000001` or such). But there's a semantic problem as well, since:
Let's also remember that we want an index template, so when ILM rolls over an index, the new index gets created the way we want. Presumably that index template would be created before the saved object index was created, and so it's not clear what effect that might have on the initial index creation.
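The naming mismatch can be checked mechanically. The sketch below assumes the rule from the Elasticsearch rollover documentation: when no new index name is supplied, the current index name must match `^.*-\d+$`, and the next name is the incremented number zero-padded to six digits. The helper names are made up for this example.

```typescript
// Pattern the rollover API expects when it has to derive the next index name.
const ROLLOVER_NAME = /^.*-\d+$/;

function canAutoNameRollover(indexName: string): boolean {
  return ROLLOVER_NAME.test(indexName);
}

// Derive the next index name the way rollover would, or undefined if the
// current name does not follow the "<prefix>-<number>" shape.
function nextRolloverName(indexName: string): string | undefined {
  const match = /^(.*-)(\d+)$/.exec(indexName);
  if (!match) return undefined;
  const next = String(Number(match[2]) + 1);
  // Zero-pad to six digits; longer numbers are left as they are.
  const padded = next.length >= 6 ? next : ("000000" + next).slice(-6);
  return match[1] + padded;
}

canAutoNameRollover("event_log_1"); // false: saved-object style suffix
canAutoNameRollover("event_log-000001"); // true: ILM-friendly name
nextRolloverName("event_log-000001"); // "event_log-000002"
```

So an index named the way the saved object migrations name it (`event_log_1`) would not be something rollover can name a successor for automatically.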
suggested approach
It seems like the ideal situation in my mind would be to have the event log plugin continue to create the ES resources it needs, and then create a saved object store that references the resulting alias for document CRUD operations. But somehow have the saved object library NOT do any of the actual index-level management operations, while still supporting all the document CRUD operations (currently just index and search).
At a high level, this seems like another option we could add to `savedObjectSchemas`, via a new property like `selfManaged`.
Imagine that nullifies all the work saved objects does regarding creating indices, aliases, migration - index level operations, compared to document level operations. Then we could keep the existing event log ES resource creation code as is, and then create the saved object store which would give us all the document level access. More on migration below.
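A sketch of how such an opt-out might look, purely as an illustration of the proposal: `selfManaged` does not exist in the saved objects framework, the type names are invented, and `typesNeedingIndexManagement` is a made-up helper showing where the framework could skip index-level work.

```typescript
// Hypothetical shape of a per-type schema entry. `indexPattern` is the
// existing option mentioned above; `selfManaged` is the proposed addition.
interface SavedObjectTypeSchema {
  indexPattern?: string;
  selfManaged?: boolean; // proposed: the plugin owns index, alias, template, ILM
}

type SavedObjectSchemas = { [type: string]: SavedObjectTypeSchema };

const savedObjectSchemas: SavedObjectSchemas = {
  // The event log plugin creates and manages its own ES resources.
  event_log: { indexPattern: "event_log", selfManaged: true },
  // An ordinary type, fully managed by the saved objects framework.
  dashboard: {},
};

// Where the framework would decide which types still need index creation,
// aliasing, and migration. Document-level CRUD is unaffected by the flag.
function typesNeedingIndexManagement(schemas: SavedObjectSchemas): string[] {
  return Object.keys(schemas).filter((type) => !schemas[type].selfManaged);
}

typesNeedingIndexManagement(savedObjectSchemas); // ["dashboard"]
```

The design choice here is that the flag only removes index-level responsibilities; everything that operates on individual documents would keep going through the saved object client.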
As a further point to make this work, the index template contains the mappings, so we'd need a way to get the existing "envelope" mappings that saved objects already adds. Here's what it currently looks like:
saved objects "envelope" mappings
Note that currently the plan is to only have one saved object "type" - to facilitate searching across all the log data based on ECS properties.
Assuming we also want to opt-out of migration, some of these properties may not be needed. But presumably, we'd need a lot of the other fields for saved objects to operate correctly, and so would need a way to get that mapping from the saved object framework, so we could write it into the mappings in our index template.
what about migration
There are two aspects to migration
The first I think we won't have to deal with, short term. We're currently looking at using a subset of ECS as the mappings for the event log data, with some extensions specific to the event log itself. The subset is pretty small (~10 properties), but I expect we will be adding more over time, as clients want to add more. Most properties will be optional anyway, at an API level. Adding more, as "possibly null" values, isn't a real migration concern.
In addition, we wouldn't want to do a real migration of the event log data anyway, since in theory there could be years worth of it, and there would have to be a pretty hard requirement (and associated hard work) to make that happen.
The more concerning one is when the mappings for the SO "envelope" change. Current thinking is that this could change with SO's that can be shared across spaces (a possible feature in the future), but we don't know what that shape change would be. What happens when the SO envelope mappings change across releases?
We'll need to figure out this story, but for the initial releases of event log, we may have to live with a story (worst case!) that for new releases, old event logs may no longer be searchable. Somehow we'd need to identify that the SO envelope mappings have changed, update the index template with those changes, and do a rollover, before writing new entries.
Perhaps removing the alias from the older indices. Or something. Depending on the change, older indices may be searchable, or maybe not.
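One way to picture the "identify, update, roll over" sequence is a small planning function that compares the envelope mappings recorded by the previous release with the current ones. This is only a sketch under assumptions: where the previous mappings would be stored is left open, and the step names and helper names are invented for the example.

```typescript
type Mapping = { [key: string]: unknown };

// Serialize with sorted keys so that property order does not count as a change.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return "[" + value.map(stableStringify).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Mapping;
    const parts = Object.keys(obj)
      .sort()
      .map((key) => JSON.stringify(key) + ":" + stableStringify(obj[key]));
    return "{" + parts.join(",") + "}";
  }
  return JSON.stringify(value);
}

type StartupStep = "update_index_template" | "rollover_write_index";

// Decide what has to happen before new entries are written.
function planForEnvelopeChange(previous: Mapping | undefined, current: Mapping): StartupStep[] {
  // First start: nothing recorded yet, so only the template needs writing.
  if (previous === undefined) return ["update_index_template"];
  // Unchanged envelope: keep writing to the current index.
  if (stableStringify(previous) === stableStringify(current)) return [];
  // Changed envelope: new template first, then a fresh index that uses it.
  return ["update_index_template", "rollover_write_index"];
}
```

Whether the older indices remain searchable after such a rollover still depends on the nature of the mapping change, which this sketch does not attempt to decide.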
In the same vein of issues, imagine SO changes to use document- or field-level security for SO's. If any of that would depend on special index-specific settings, this could be troublesome to deal with.
I'll admit I'm still a n00b on a lot of the Kibana (Saved Object) and ES (ILM, security) aspects of this, so maybe some of these "issues" are either non-issues, or show-stoppers.
Please chime in ...