Event recovery from a missing block #554

Merged
merged 6 commits into from
Sep 17, 2024

Conversation

devbugging
Contributor

@devbugging devbugging commented Sep 16, 2024

Description

Recover from a missing block in the event stream. This can happen due to issues with system transactions.


For contributor use:

  • Targeted PR against master branch
  • Linked to GitHub issue with discussion and accepted design OR link to spec that describes this work.
  • Code follows the standards mentioned here.
  • Updated relevant documentation
  • Re-reviewed Files changed in the GitHub PR explorer
  • Added appropriate labels

Summary by CodeRabbit

  • New Features

    • Introduced new error types for improved error handling: ErrMissingBlock and ErrMissingTransactions.
    • Enhanced RPCSubscriber functionality with recovery mechanisms for event subscription.
    • Added a new test case to validate subscriber behavior for missing block events.
  • Bug Fixes

    • Improved error messages for better clarity and context in event decoding.
  • Refactor

    • Renamed fetchBlockEvents to fetchMissingData for better clarity.
    • Added methods for accumulating and recovering missing events.
    • Streamlined event generation process in test utilities.

Contributor

coderabbitai bot commented Sep 16, 2024

Walkthrough

The pull request introduces enhancements to error handling across multiple files, including the addition of new error variables for missing blocks and transactions. It updates the decodeCadenceEvents function to utilize custom error types, improves the RPCSubscriber struct with recovery mechanisms, and renames functions for clarity. These changes collectively aim to strengthen the robustness of event subscription and error reporting within the application.

Changes

File | Change Summary
models/errors/errors.go | Added ErrMissingBlock and ErrMissingTransactions for improved error handling.
models/events.go | Updated error handling in decodeCadenceEvents to use custom error types and improved messages.
services/ingestion/subscriber.go | Enhanced RPCSubscriber with recovery mechanisms, added fields for recovery state, and renamed functions for clarity.
services/ingestion/subscriber_test.go | Added Test_MissingBlockEvent to test behavior when a block event is missing.
services/testutils/mock_client.go | Introduced SetupClient to streamline mock client setup for block events.

Suggested labels

Improvement, Bugfix

Poem

🐇 In the code where errors play,
New flags arise to light the way.
Blocks and transactions, now we see,
With clearer paths, we hop with glee!
Recovery's here, a joyful cheer,
For every missing piece, we steer! 🌟


Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 5f3f789 and 6dd8b6d.

Files selected for processing (2)
  • services/ingestion/subscriber.go (6 hunks)
  • services/ingestion/subscriber_test.go (2 hunks)
Additional comments not posted (10)
services/ingestion/subscriber.go (9)

37-39: LGTM!

The added fields recovery and recoveredEvents in the RPCSubscriber struct are appropriate for managing recovery states and storing potentially missing events during the recovery process.


114-114: Looks good!

Renaming events to eventsChan improves code clarity by explicitly indicating it's a channel.


119-120: Error handling looks good!

The updated error handling logic to send the error on the eventsChan and return immediately is correct. This allows the consumer to handle the error gracefully and avoids blocking the goroutine unnecessarily.

Also applies to: 125-128


153-160: Recovery logic looks good!

The added recovery logic in the event processing code is correct:

  • It checks if the subscriber is in recovery mode or if the evmEvents contain an error.
  • If so, it calls the recover method to attempt recovery.
  • If still in recovery mode after calling recover, it continues to the next event, preventing the subscriber from getting stuck.

This helps handle scenarios where the subscriber encounters errors while processing events and attempts to recover gracefully.


172-174: Error handling for errChan looks good!

The updated error handling logic for errChan is correct:

  • When an error is received on errChan, it sends an ErrDisconnected error on the eventsChan to notify the consumer about the disconnection.
  • It then returns to stop further event processing.

This allows the consumer to handle the disconnection error appropriately.

Also applies to: 176-177


Line range hint 270-307: Method renaming and implementation look good!

The renaming of fetchBlockEvents to fetchMissingData accurately reflects the method's purpose of fetching missing data when the event streaming API returns an inconsistent response.

The implementation is correct:

  • It removes existing events from blockEvents to ensure only the recovered events are returned.
  • It fetches the missing events using GetEventsForHeightRange as a backup mechanism.
  • It appends the recovered events to blockEvents.Events and returns the updated blockEvents.

This provides a reliable way to handle inconsistent responses and recover missing data.
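The three steps above might look roughly like the sketch below. The `eventFetcher` interface and its `EventsForHeightRange` method are hypothetical, simplified stand-ins for the range query the review mentions (`GetEventsForHeightRange`); the real client call has a different signature that is not shown on this page.

```go
package main

import "fmt"

// event and blockEvents are hypothetical, reduced types for this sketch.
type event struct {
	Type   string
	Height uint64
}

type blockEvents struct {
	Height uint64
	Events []event
}

// eventFetcher is a hypothetical stand-in for a client range query.
type eventFetcher interface {
	EventsForHeightRange(eventType string, start, end uint64) ([]event, error)
}

// fetchMissingData drops the inconsistent events, re-fetches each event
// type for the single affected height, and returns the rebuilt batch.
func fetchMissingData(c eventFetcher, be blockEvents, types []string) (blockEvents, error) {
	be.Events = nil // discard what the stream delivered
	for _, t := range types {
		recovered, err := c.EventsForHeightRange(t, be.Height, be.Height)
		if err != nil {
			return be, fmt.Errorf("fetching %s events at height %d: %w", t, be.Height, err)
		}
		be.Events = append(be.Events, recovered...)
	}
	return be, nil
}

// fakeFetcher serves events from memory, keyed by event type.
type fakeFetcher map[string][]event

func (f fakeFetcher) EventsForHeightRange(eventType string, start, end uint64) ([]event, error) {
	var out []event
	for _, e := range f[eventType] {
		if e.Height >= start && e.Height <= end {
			out = append(out, e)
		}
	}
	return out, nil
}

func main() {
	client := fakeFetcher{
		"block": {{Type: "block", Height: 5}},
		"tx":    {{Type: "tx", Height: 5}, {Type: "tx", Height: 6}},
	}
	incomplete := blockEvents{Height: 5, Events: []event{{Type: "tx", Height: 5}}}
	fixed, err := fetchMissingData(client, incomplete, []string{"block", "tx"})
	fmt.Println(len(fixed.Events), err) // 2 <nil>
}
```

Clearing the slice first avoids duplicating events that the stream did deliver before the fallback query returns them again.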


309-324: New method implementation looks good!

The new method accumulateEventsMissingBlock is implemented correctly:

  • It accumulates transaction events until it can produce a valid EVM block event containing a block and transactions, helping recover from missing block scenarios.
  • It appends the received events to r.recoveredEvents and updates events.Events with the accumulated events to ensure all the accumulated events are included.
  • It sets r.recovery to true if the recovered block events still have an error, otherwise it resets r.recovery and clears r.recoveredEvents, which is appropriate.

This method provides a mechanism to handle missing block scenarios and recover a valid block event.


326-350: New method implementation looks good!

The new method recover is implemented correctly to handle invalid data sent over the event stream:

  • It logs a warning message with the error details when entering recovery mode, which helps with debugging and monitoring.
  • It calls accumulateEventsMissingBlock when the error is ErrMissingBlock or already in recovery mode, which is appropriate to handle missing block scenarios.
  • It calls fetchMissingData when the error is ErrMissingTransactions, which is correct to fetch missing transaction data.
  • For any other error, it returns a BlockEventsError with the original error, which is a reasonable fallback.

This method provides a centralized recovery mechanism to handle different error scenarios and attempt to recover valid data.


337-340: Logger usage looks good!

The usage of the logger in the recover method is appropriate:

  • Logging a warning message when entering recovery mode is useful for debugging and monitoring purposes.
  • Including the error details in the log message provides valuable context for understanding the issue.
  • Logging the Flow block height helps identify the specific block where the recovery mode was triggered.

This logging statement helps with troubleshooting and monitoring the recovery process.

services/ingestion/subscriber_test.go (1)

70-155: Excellent test for missing block scenario!

This test function is well-structured, thoroughly verifies the expected behavior of the RPCSubscriber when encountering a missing block in the event stream, and demonstrates effective use of goroutines and assertions. The test simulates the scenario accurately by selectively removing the block event based on the block height and validates that the subscriber identifies the missing block and includes all the missing transactions in the subsequent found block.

The test enhances the overall test coverage and helps ensure the reliability and correctness of the subscriber in handling event stream inconsistencies gracefully. It serves as a valuable regression test to prevent future bugs related to missing block handling.

Great job on adding this comprehensive test!


Collaborator

@m-Peter m-Peter left a comment

LGTM 💯
