-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Detailed consensus data correlated together. #5302
base: develop
Are you sure you want to change the base?
Conversation
4accd3d
to
cc9a392
Compare
Rebased against a more recent commit. |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #5302 +/- ##
=========================================
+ Coverage 77.9% 78.0% +0.2%
=========================================
Files 791 790 -1
Lines 68006 67856 -150
Branches 8346 8237 -109
=========================================
- Hits 52967 52957 -10
+ Misses 15039 14899 -140
|
cc9a392
to
be1c95c
Compare
message. Allows quick correlation of events that previously were either not logged or, if logged, strewn across multiple lines, making correlation difficult. The Heartbeat Timer and consensus ledger accept processing each have this capability. Also guarantees that log entries will be written if the node is a validator, regardless of log severity level. Otherwise, the level of these messages is at INFO severity.
be1c95c
to
caedb46
Compare
} | ||
|
||
CLOG(clog) << "beginConsensus no previous ledge. "; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo 'ledger'
* debug log on destruction. Otherwise, INFO | ||
* level. | ||
*/ | ||
class RclConsensusLogger |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean by complex activity
a consensus round?
I can see that it would be useful to collect events related to a single ledger. In this case it would make sense to add a ledger id / hash as a property to the instance of the logger. Then we can correlate those complex log entries between different nodes.
@@ -34,38 +34,55 @@ shouldCloseLedger( | |||
std::chrono::milliseconds openTime, // Time waiting to close this ledger | |||
std::chrono::milliseconds idleInterval, | |||
ConsensusParms const& parms, | |||
beast::Journal j) | |||
beast::Journal j, | |||
std::unique_ptr<std::stringstream> const& clog) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of adding a new parameter to pass clog, should we extend beast::Journal to support streamed logging? Otherwise, we have to pass as a parameter two loggers now. It is our version of beast and we are free to modify it. We might have to switch to shared_ptr since beast::Journal is passed by value, but otherwise, it should be possible.
{ | ||
std::string header_; | ||
beast::Journal j_; | ||
std::unique_ptr<std::stringstream> ss_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need std::unique_ptr
if the clog is passed by reference everywhere?
I don't see a need for transfer of ownership.
Combine multiple related debug log data points into a single
message. Allows quick correlation of events that
previously were either not logged or, if logged, strewn across multiple lines, making correlation difficult. The Heartbeat Timer and consensus ledger accept processing each have this capability.
Also guarantees that log entries will be written if the node is a validator, regardless of log severity level. Otherwise, the level of these messages is at INFO severity.
High Level Overview of Change
Enable logging of multiple data points so that they are emitted to the debug log at the same time. Validators will have these messages always emitted, regardless
of other configuration of debug log severity level.
Context of Change
The context is to facilitate data collection and troubleshooting. Currently, this much
of the data in the PR is not being logged. And for that which is being logged, it's
written all over the debug log on multiple lines. This makes it difficult to correlate
consensus-related events.
And because this change emits the logs automatically for validators, no special configuration of severity level needs to be made.
Type of Change
.gitignore
, formatting, dropping support for older tooling)API Impact
libxrpl
change (any change that may affectlibxrpl
or dependents oflibxrpl
)New, verbose log entries are created. They can be grepped with the string "ConsensusLogger". They are emitted approximately every second, with an additional entry for each consensus "accept" phase. It increases logging requirements. Estimated 5MB/hr.
To see the log messages, grep for:
"ConsensusLogger: Heartbeat Timer" and "ConsensusLogger: onAccept"
If configured as a validator, all debug log severity levels will cause these messages to be emitted (though I'm not sure if kDisabled is over-ridden). To test, configure at warning, error, or fatal, and the log messages should still be emitted. They'll still say they're NFO (info) level, but they bypass the filter. If not configured as a validator, verify that the messages only appear for "info" level or more verbose.
Before committing to this, the community should be informed of the new feature since it will require more disk space, at least for validator operators or people with very terse logging configured. However, the disk requirements are not very large: roughly 5MB per hour. This is < 1GB/week, which is quite small for the benefit.