[RFC] OpenSearch Events Correlation Engine #6779
Comments
@sbcd90 Thanks for creating the RFC. This is a great start. Adding some more thoughts on the What, How, and Why of a generic Correlation Engine framework that we are considering here:
What is Event Correlation
How does Event Correlation work:
Why Event Correlation - Use-Cases and Examples
|
This is great! 💯 agree this should be a core feature. Let's aim for progress, not perfection, on this. A few initial questions / clarifications:
|
Hi @nknize, thanks a lot for reviewing the RFC. Here are my answers.
Here is the API doc.
Core Plugin APIs
Create Correlation Rule for an index/data stream
List Correlations for an event stored in an index/data stream
Security Analytics Plugin APIs
Create Correlation Rules between Log Types
List all findings & their correlations within a time window
List correlations for a finding belonging to a log type
|
Agree with @sbcd90 on moving ahead with the suggested plan here and starting as a core plugin. It is also the quickest path forward, without requiring any changes in the |
What's the tl;dr of why this feature needs to be in core, especially that it's going to start as |
Thanks @dblock for the review. The Correlation Engine we are proposing here aims to provide the capability to build an Events Knowledge Graph within the OpenSearch data set, which can be used to identify and store connected data events, possibly spanning multiple indices or data streams. These knowledge graphs can further help generate insights by correlating recent or historical data across custom time windows that users provide. Since it helps users correlate events across log sources while allowing them to define their own correlation rules, the framework itself can be leveraged by different end-user plugins to solve different use cases, such as those related to Security Analytics, observability, geospatial, or trace analytics. Going forward, once the feature has been well baked as a core plugin, we aim to provide the capability as a core module. |
This looks very interesting and has great potential. A few points to continue the discussion:
I'll be happy to discuss this knowledge graph further! Adding the correlation metadata knowledge into the field mapping API |
@getsaurabh02 @sbcd90 @dblock I think this should be an external plugin and not a core plugin or module. In #7350, a new vector field type, "correlation_vector", is created. Shouldn't we leverage the existing knn_vector type here in some way? Is the problem that the knn_vector type is implemented as an external plugin? In neural search, we added a dependency on the k-NN plugin. My concern is that we are now introducing 2 vector types in OpenSearch that have overlapping functionality, yet do not share any implementation. @sbcd90 also, please update the RFC to include the field type interface that is added, as well as the query type interface. cc: @vamshin @navneet1v |
hi @jmazanec15, @dblock,
Events Correlation Engine is by definition an Events Knowledge Graph which can be used to identify and store connected events data spanning multiple indices or data streams within a specific time window. Today, OpenSearch has no functionality that can correlate events or documents across different indices within a time window. Elasticsearch supports it partially with EQL, but it also does not provide correlations between events within a time window. Events Correlation Engine has several use cases: finding correlations across findings generated from security logs (RFC); Cluster Insights, to correlate metrics generated from an OS cluster within a time window; and geospatial use cases, to find activities happening at different locations across the globe within a time window. Due to its diverse use cases, we decided to include the
The barebone implementation of the Events Correlation Engine is composed of 3 separate pull requests.
The main functionality of Events Correlation Engine is not to expose a
As part of the design, we wanted to keep the graph storage & query part of the Events Correlation Engine flexible, by providing implementations not only for Lucene HNSW graphs but also for pinecone, yang db, as well as for Amazon NeptuneDB for the managed service in future. Lastly, we can of course replace the lean OS to Lucene storage/query converter introduced in PR #7350 with the KNN plugin wrapper around Lucene HNSW graphs. But currently, the KNN plugin is not in core. Until it is in core, we want to continue using our converter.
The RFC currently does not include the low-level design of any of the components of the
@jmazanec15, @dblock, kindly let me know if you disagree with any of the points mentioned above. Also, kindly let me know if you have more questions on the design of |
I am struggling to convince myself one way or another of whether this plugin belongs in core. I think we could go either way. Maybe we should ask some other folks to get a strong opinion? @nknize? |
The foundational correlation engine absolutely makes sense as a core plugin (to start), then possibly promoted to a module. Plugins can build on the core engine for use-case-specific correlation rules such as for security, observability, geospatial, etc. Core use cases (e.g., general correlation across primitives) include users providing custom correlation rules for their specific use cases that can be implemented using a core default language (e.g., PPL).
For matrix stats this makes sense. Since matrix stats computes correlation and covariance matrices across multiple fields, I've long wanted to add vector field support to that aggregation. We should explore that separately. This correlation engine, on the other hand, is not mathematical correlation, it's event correlation based on user defined rules. And event correlation across documents : 💯 makes sense as a core search capability. |
I see. If it is not supposed to be used externally by users, we should make sure that it is not (not sure if this is already done). Ideally, it should just use the
I see. I understand the argument for making it a module. I am not sure that, in our current project structure, where plugins are developed externally, core plugins as a concept make sense. From my understanding, the concept is left over from Elasticsearch, which had a different philosophy around plugins. If the case for making it a core plugin is ease of promotion to a module, that makes sense. |
Hi @sbcd90
I'm sorry if I misunderstand your point of view. Elastic Security offers a feature for detection rules to search events within a time window plus an additional look-back time, which partially allows events to be correlated in a time window. However, this correlation cannot be done based on event context (for example, some fields being the same, as in the timeline feature of MS Sentinel). See https://www.elastic.co/guide/en/security/current/rules-ui-create.html for reference. As SOC analysts, event correlation is really important to us and can greatly enhance the usefulness of the current detection feature based on Sigma rules. BTW, Sigma rules are currently evolving toward version 2.0, which also requires correlation. |
I am working on an OpenSearch Security Analytics project. I want to create correlations between findings of the Linux system logs log type. One of the detectors works on a custom rule that triggers on an authentication token, and the other uses more than 100 rules, some of which are History file deletion, chmod suspicious directory, Linux Remote System Discovery, etc. However, all the findings of the second detector are created for the RuleToDetectWhenTaskDeleted rule. How can I create a correlation rule with this available data? Also, if there is any dummy data available for correlation creation, can someone point me in that direction? |
Problem Statement
OpenSearch is a scalable, flexible, and extensible open-source software suite for search, analytics, and observability applications licensed under Apache 2.0.
OpenSearch includes a data store and search engine where customers can store their business, operational, and security data from a variety of sources & run search queries on them.
Since the various customer infrastructure events, such as security events and observability events, span multiple indices & data streams, a strong correlation across these indices (or data streams) helps customers identify patterns and dive into the relationships between events occurring across different systems in their infrastructure.
Definitions
Events Correlation Engine
Correlation Engine is an Events Knowledge Graph which can be used to identify and store connected events data spanning multiple indices or data streams. It also helps generate insights by correlating recent or historical data based on time windows provided by the client.
The Events Correlation Engine provides an approach to help customers correlate events across log sources by allowing customers to define their own Correlation Rules exactly once, and then generating correlations between events from different log sources automatically.
Dimensions of Correlation
Time Window
Time Window is the most basic Dimension of Correlation that can be defined by the user. Correlation Engine would show all possible correlations across all indices within the specified time window if no other dimension is provided.
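As a rough illustration of the time-window dimension, here is a minimal Python sketch. It is not the engine's implementation: the `Event` shape, the index names, and the choice to report only pairs from different indices are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    index: str        # source index or data stream (hypothetical names below)
    doc_id: str
    timestamp: float  # epoch seconds


def correlate_by_time_window(events, window_seconds):
    """Pair events from different indices whose timestamps are at most
    window_seconds apart. No other dimension (rules, fields) is applied."""
    ordered = sorted(events, key=lambda e: e.timestamp)
    pairs = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            if second.timestamp - first.timestamp > window_seconds:
                break  # events are sorted, so later ones are even further away
            if first.index != second.index:
                pairs.append((first.doc_id, second.doc_id))
    return pairs


events = [
    Event("windows-logs", "w1", 100.0),
    Event("network-logs", "n1", 160.0),
    Event("app-logs", "a1", 1000.0),
]
pairs = correlate_by_time_window(events, window_seconds=300)
print(pairs)
```

With a 300-second window only `w1` and `n1` are paired; widening the window to 1000 seconds would pair all three events with each other, which is why the other dimensions matter for narrowing results.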
Source Events Indices/DataStreams
While Time Window is an important dimension of correlation, users also need to provide the source events indices (or data streams) on which Correlation Rules can be defined; these act as an additional dimension of correlation.
Query Language for Correlation Rules
The most granular level of correlation supported by the Correlation Engine is using correlation rules or queries over the source events indices or datastreams. These rules allow the Correlation Engine to eliminate false positives & present to the user a list of highly accurate correlated search results.
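To make the effect of rules concrete, the sketch below models a correlation rule as one query per source index and correlates only the events that match. The rule language is not finalized in this RFC, so plain Python predicates stand in for queries; `CorrelationRule`, `apply_rule`, the index names, and the document fields are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class CorrelationRule:
    name: str
    # Assumption: one predicate ("query") per source index or data stream.
    queries: Dict[str, Callable[[dict], bool]]


def apply_rule(rule, events_by_index, window_seconds):
    """Correlate only events that match the rule's per-index query and
    fall within window_seconds of each other."""
    matched = []
    for index, query in rule.queries.items():
        for doc in events_by_index.get(index, []):
            if query(doc):
                matched.append((index, doc))
    matched.sort(key=lambda item: item[1]["timestamp"])

    correlations = []
    for i, (index_a, doc_a) in enumerate(matched):
        for index_b, doc_b in matched[i + 1:]:
            if doc_b["timestamp"] - doc_a["timestamp"] > window_seconds:
                break
            if index_a != index_b:
                correlations.append((doc_a["id"], doc_b["id"]))
    return correlations


events_by_index = {
    "auth-logs": [
        {"id": "a1", "timestamp": 10, "outcome": "failure"},
        {"id": "a2", "timestamp": 20, "outcome": "success"},
    ],
    "network-logs": [
        {"id": "n1", "timestamp": 40, "dest_port": 22},
        {"id": "n2", "timestamp": 50, "dest_port": 443},
    ],
}
rule = CorrelationRule(
    name="failed-login-then-ssh",
    queries={
        "auth-logs": lambda doc: doc["outcome"] == "failure",
        "network-logs": lambda doc: doc["dest_port"] == 22,
    },
)
result = apply_rule(rule, events_by_index, window_seconds=60)
print(result)
```

A time window alone would pair every auth event with every network event here (four pairs); the rule narrows that down to the single pair that matches both queries.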
One of the popular choices for defining Correlation Rules is Event Query Language (EQL) from Elasticsearch. EQL supports the Elastic Common Schema (ECS) today.
Here is a sample EQL based Correlation Query.
High-Level Design
There are 2 high-level components in the design of the Events Correlation Engine.
Correlation Query Service
This sub-system manages the lifecycle of the Correlation Rules created by the users. Users can create, update, read, or delete rules using the REST APIs provided by this layer.
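The rule lifecycle can be sketched with a small in-memory store. This is only an illustration of the create/read/update/delete flow: `CorrelationRuleStore` and the rule fields are hypothetical, and the real service would sit behind REST endpoints and persist rules in an index.

```python
import uuid


class CorrelationRuleStore:
    """In-memory stand-in for the rule lifecycle layer (hypothetical)."""

    def __init__(self):
        self._rules = {}

    def create(self, rule):
        rule_id = str(uuid.uuid4())
        self._rules[rule_id] = dict(rule)
        return rule_id

    def read(self, rule_id):
        # Raises KeyError when the rule does not exist.
        return dict(self._rules[rule_id])

    def update(self, rule_id, changes):
        if rule_id not in self._rules:
            raise KeyError(rule_id)
        self._rules[rule_id].update(changes)
        return dict(self._rules[rule_id])

    def delete(self, rule_id):
        del self._rules[rule_id]


store = CorrelationRuleStore()
rule_id = store.create({"name": "failed-login-then-ssh", "window_seconds": 300})
store.update(rule_id, {"window_seconds": 600})
updated = store.read(rule_id)
store.delete(rule_id)
```

Returning copies from `read` and `update` keeps callers from mutating stored rules by accident, which roughly mirrors how a REST layer would hand back serialized documents.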
The language for defining Correlation Rules is still not finalized. EQL is one of the examples for defining Correlation Rules.
Correlation Service
The internals of the Correlation Engine are composed of 4 major components.
Use Cases
Security Analytics Correlation Engine for correlating security events
Security Analytics is an open-source solution for security operations in OpenSearch. Security Analytics’ threat detection engine converts the detection rules into executable OpenSearch queries which are then matched against the logs or events ingested by the user to generate findings. The trigger condition filters are further applied on the findings to generate alerts.
Today in Security Analytics, the generated findings belong to individual log types & there is no way to automatically correlate them. Users need to browse through the findings generated for individual log categories & then identify patterns manually.
The Security Analytics Correlation Engine provides an approach to solve this issue by allowing the customers to define the correlation metadata across log categories exactly once & then generating correlations between findings from different log categories automatically.
Here is the link to the RFC