This repository was archived by the owner on May 12, 2021. It is now read-only.
METRON-590 Enable Use of Event Time in Profiler #965
Closed
This enables the use of event time processing in the Profiler.
By default, the Profiler will still use processing time. If you configure the Profiler with a timestampField, then it will extract the timestamps from that field contained within the incoming telemetry.
Review the updates that have been made to the Profiler README for more details on how this works and how it should be configured.
Changes
The core Profiler classes in metron-profiler-common were refactored slightly so that time is always injected from the outside. When run in Storm, the Storm bolts are responsible for keeping track of time. When run in the REPL, the StandAloneProfiler is responsible for keeping track of time.
A Clock abstraction was created (or rather, enhanced) to handle the differences between processing time and event time. If operating in processing time, the WallClock implementation will be used. If operating in event time, the EventTimeClock will be used. Given a message, a Clock will always tell us what time it is.
A ClockFactory will give us the right Clock based on what the Profiler configuration looks like. This factory is used in both the Storm and REPL interfaces.
The ProfileSplitterBolt takes a message, uses a Clock to get a timestamp, then sends the message and the correct timestamp to the downstream ProfileBuilderBolts. The ProfileBuilderBolt trusts the timestamp that it was given and continues on its merry way.
The ProfileBuilderBolt uses a FlushSignal to know when to flush. There is some important subtlety contained in this logic. If you accidentally inject a message with a recent timestamp when you are running with older, archived data, it will prevent the system from flushing when you expect it to. This is because the new message advances time to the most recent timestamp. I might have run into this when testing. :) I added an important check and log statement to help make this very noticeable.
There was one difference between processing and event time that had to be accounted for. When in processing time, if a profile stops receiving messages, that profile still needs to flush at the end of the period. When in event time this is not the case. In event time, if you stop receiving messages, time effectively does not advance. This was accounted for by creating the concept of "expired" profiles in the MessageDistributor. The ProfileBuilderBolts then use a tick tuple to periodically flush expired profiles. See the javadoc for more explanation.
Additional properties were added to adjust the event time processing logic that we leverage in Storm. This includes specifying a window length and a window lag. There will be 1 or more windows in each profile period. A smaller window lets the Profiler process a smaller chunk of messages at a time. The window lag allows you to adjust the Profiler depending on how out-of-order your incoming telemetry is.
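To make the Clock and ClockFactory description above more concrete, here is a minimal sketch of how such an abstraction could look. It is an illustration only, not the code in this PR: the method names, the use of a plain Map for the message, and the Optional-based factory argument are all assumptions.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch: given a message, a clock tells us what time it is. */
interface Clock {
    Optional<Long> currentTimeMillis(Map<String, Object> message);
}

/** Processing time: ignores the message and reads the system clock. */
class WallClock implements Clock {
    public Optional<Long> currentTimeMillis(Map<String, Object> message) {
        return Optional.of(System.currentTimeMillis());
    }
}

/** Event time: reads the timestamp from a configured field of the message. */
class EventTimeClock implements Clock {
    private final String timestampField;

    EventTimeClock(String timestampField) {
        this.timestampField = timestampField;
    }

    public Optional<Long> currentTimeMillis(Map<String, Object> message) {
        Object value = message.get(timestampField);
        if (value instanceof Number) {
            return Optional.of(((Number) value).longValue());
        }
        // No usable timestamp in this message; the caller decides what to do.
        return Optional.empty();
    }
}

/** Picks the clock based on whether a timestamp field was configured. */
class ClockFactory {
    static Clock createClock(Optional<String> timestampField) {
        if (timestampField.isPresent()) {
            return new EventTimeClock(timestampField.get());
        }
        return new WallClock();
    }
}
```

In this sketch a missing or non-numeric timestamp yields an empty result rather than falling back to system time, so event time and processing time are never silently mixed.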
The Mpack was updated to support the additional properties.
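The flush subtlety described above (a stray recent timestamp stalling flushes of archived data) can be illustrated with a small model. This is a simplified sketch under my own assumptions, not the FlushSignal implementation in this PR: the class name, the rule that time never moves backwards, and the boolean return value standing in for the log statement are all hypothetical.

```java
/** Hypothetical model of a flush signal that is driven by message timestamps. */
class SketchFlushSignal {
    private final long flushFrequencyMillis;
    private long currentTime = 0L; // latest timestamp seen so far
    private long flushTime = 0L;   // next flush deadline; 0 means "not set"

    SketchFlushSignal(long flushFrequencyMillis) {
        this.flushFrequencyMillis = flushFrequencyMillis;
    }

    /**
     * Advances time using a message timestamp.
     * Returns false when the timestamp is behind the current time; real code
     * would log a warning here so that the condition is noticeable.
     */
    boolean update(long timestamp) {
        boolean inOrder = timestamp >= currentTime;
        if (inOrder) {
            currentTime = timestamp; // time only ever moves forward
        }
        if (flushTime == 0L) {
            flushTime = currentTime + flushFrequencyMillis;
        }
        return inOrder;
    }

    boolean isTimeToFlush() {
        return flushTime != 0L && currentTime >= flushTime;
    }

    /** Clears the deadline after a flush; the notion of "now" is retained. */
    void reset() {
        flushTime = 0L;
    }
}
```

With a 60 second flush frequency, feeding this model a timestamp of 1,000, then a stray 10,000,000, then resetting after the flush leaves "now" at 10,000,000. Later archived timestamps such as 2,000 or 70,000 are reported as out of order and never reach the new deadline, which is the stalled flushing described above.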
I added a lot of useful logging to help troubleshoot and debug issues when running in Storm. If you go into the Storm UI and turn on DEBUG level logging for org.apache.metron.profiler, you will get some useful information in the worker logs.
Manual Testing
This change can be tested manually when the Profiler is running atop Storm or when run in the REPL.
Testing in the REPL
Create a simple profile and define a "timestampField" in the Profiler configuration. This will tell the Profiler to operate using event time.
Create a message that has a timestamp. In this example, the timestamp is really old, like 1970 old.
Create the Profiler.
Apply the message to the Profiler.
Flush the Profiler. Notice that the 'period' of the measurement that was produced is also from 1970, which indicates that the Profiler successfully used event time.
Now let's do the same, but using processing time. Use the same profile, but this time do not specify a 'timestampField'.
Now run through the same steps. Notice how the period of the measurement is based on system time now.
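The reason the period in the REPL test reveals which kind of time was used can be sketched as follows. This assumes that a period is a fixed-width bucket of epoch time, identified by dividing the timestamp by the period duration; the class and method names are hypothetical and are not taken from this PR.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: maps a timestamp to a fixed-width profile period. */
class PeriodSketch {

    /** The period identifier is the number of whole periods since the epoch. */
    static long periodOf(long epochMillis, long duration, TimeUnit units) {
        return epochMillis / units.toMillis(duration);
    }

    /** The start time of a period, in epoch milliseconds. */
    static long startOf(long period, long duration, TimeUnit units) {
        return period * units.toMillis(duration);
    }
}
```

Under these assumptions, a message timestamp of 2,000,000 ms (about 33 minutes after the epoch) with 15 minute periods lands in period 2, which starts at 1,800,000 ms, still in 1970. The same profile driven by system time would land in a period that starts near the current date.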
Testing in Storm
Launch a development environment. Shut down Indexing, Elasticsearch, Kibana, YARN, and MapReduce2 to avoid any resource issues.
Using Ambari, change the following settings and restart the Profiler.
Set the "Period Duration" to 1 minute.
Set the "Window Duration" to 15 seconds.
Set the "Window Lag" to 30 seconds.
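As a rough illustration of how these three settings relate, the arithmetic can be sketched as below. This is a simplified model and not the Profiler's or Storm's code: the class and method names are hypothetical, and the watermark rule (latest event time seen minus the lag) is an assumption about how the window lag is applied.

```java
/** Hypothetical sketch of the period, window, and lag arithmetic. */
class WindowSketch {

    /** How many windows fit into one profile period. */
    static long windowsPerPeriod(long periodMillis, long windowMillis) {
        return periodMillis / windowMillis;
    }

    /**
     * Simplified watermark: event time is assumed to have progressed to the
     * latest timestamp seen so far, minus the allowed lag.
     */
    static long watermark(long maxEventTimeMillis, long lagMillis) {
        return maxEventTimeMillis - lagMillis;
    }
}
```

With a 1 minute period and 15 second windows, windowsPerPeriod(60000, 15000) gives 4 windows per period. With a 30 second lag and a latest event time of 100,000 ms, the watermark is 70,000 ms, so in this model messages may arrive up to 30 seconds out of order and still fall in an open window.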
Replace /opt/sensor-stubs/bin/start-bro-stub with the following. Instead of adding the current time into each Bro message, this will add a timestamp from 1 day ago.
Restart the Bro Sensor Stub.
Open up the REPL and configure the Profiler like so.
Notice that we are setting the 'timestampField' within the Profiler configuration. This will tell the Profiler to extract the timestamp from this field rather than using system time.
Query the Profiler data store. This will take a minute or so until you see a value written.
Now query back just a couple of hours instead. Notice that you should get no results. This indicates that the Profiler successfully used the timestamps from the Bro data, which contained day-old values.
Now change the Profiler configuration to remove the "timestampField" setting. This will switch the Profiler back to using system aka processing time.
The Profiler will pick up the change after the next flush event. Query for profile data in the past few minutes. This shows that the Profiler has switched back to using system time, aka processing time.
In Storm you can also set logging to DEBUG for "org.apache.metron.profiler". This will output detailed worker logs that also allow you to verify that the Profiler is using the correct timestamps.
Pull Request Checklist