Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[source-chargebee] Replace IncrementalSingleSliceCursor with semi-incremental DatetimeBasedCursor #53220

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

topefolorunso
Copy link
Collaborator

@topefolorunso topefolorunso commented Feb 7, 2025

What

Replace custom IncrementalSingleSliceCursor component with low-code semi-incremental DatetimeBasedCursor closes https://github.com/airbytehq/airbyte-internal-issues/issues/11512

How

The feature specifically to replace is to use the is_client_side_incremental option which enables semi-incremental syncing and replaces the need for a custom component.

Review guide

  1. manifest.yaml
  2. components.py
  3. test_components.py

User Impact

Can this PR be safely reverted and rolled back?

  • YES 💚
  • NO ❌

Acceptance Criteria

  • source-chargebee is running on the latest version of CDK v6 (>= 6.33.0)
  • The gift_stream stream does not use the IncrementalSingleSliceCursor
  • The unbilled_charge_stream stream does not use the IncrementalSingleSliceCursor
  • The site_migration_detail_stream stream do not use the IncrementalSingleSliceCursor
  • The gift_stream, unbilled_charge_stream, site_migration_detail_stream supports syncing incrementally when given a state message
  • IncrementalSingleSliceCursor is deleted from components.py
  • All tests in /unit_tests pass.
  • All CI checks are passing ✅
  • Live traffic regression test suite run with passing results
  • Progressive rollout starting with 20%, 50%, followed by release of the candidate version

Copy link

vercel bot commented Feb 7, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
airbyte-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Feb 8, 2025 7:02am

@octavia-squidington-iii octavia-squidington-iii added area/connectors Connector related issues area/documentation Improvements or additions to documentation connectors/source/chargebee labels Feb 7, 2025
@topefolorunso
Copy link
Collaborator Author

topefolorunso commented Feb 7, 2025

/format-fix

Format-fix job started... Check job output.

🟦 Job completed successfully (no changes).

@topefolorunso
Copy link
Collaborator Author

@brianjlai @natikgadzhi @DanyloGL Please review

Copy link
Contributor

@brianjlai brianjlai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes overall look good, but see my comment about using the latest version of the CDK.

Once @topefolorunso makes the change to use the latest source-declarative-manifest version, @DanyloGL please review this and run regression tests on @topefolorunso 's PR?

f we get a passing regression test and passing CI checks then we are all set to merge this in.

@topefolorunso
Copy link
Collaborator Author

@DanyloGL All good now.

Copy link
Contributor

@natikgadzhi natikgadzhi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI is green and with @brianjlai's comments I am happy with this.

@DanyloGL grab this for a regression test and ship this if things are clean!

@DanyloGL
Copy link
Collaborator

@natikgadzhi regressions shows that target version retrieves less data. Contact stream does not retrieve data at all. Will run more regression with specific connections.

@brianjlai
Copy link
Contributor

@natikgadzhi regressions shows that target version retrieves less data. Contact stream does not retrieve data at all. Will run more regression with specific connections.

@DanyloGL thanks for kicking of the regression test run. This is certainly interesting, but also potentially unrelated. Looking at the 3 streams affected

  • unbilled_charge : I see 2 extra records on the target version. I think this is tolerable because sometimes we're a little more inclusive with our new cursor on edges or can chalk this up to skew if target/candidate don't run a sync at exactly the same time. And for 1500 records, thats certainly a possible situation since this seems to be a pretty high volume stream
  • gift: Stream not on the regression run
  • site_migration_detail: Stream not on the regression run

contacts not sending data at all is pretty odd and we should retest, but this stream is not affected by @topefolorunso 's changes since it does not have custom components and it is also a full-refresh stream. And for many of the other streams, those also should not be impacted.

@DanyloGL let's see if you find a more targeted run, but if we see minimal or no mismatches on unbilled_charge, site_migration_detail, and gift streams, I think we move forward with the progressive rollout

@DanyloGL
Copy link
Collaborator

@brianjlai there are only few cloud connections that use needed streams:

  • here unbilled_charge retrieves the same number of record as in prod.
    - site_migration_detail, and gift streams on some connections are empty, and with one connection run wasn't successful
  • contact stream always returned 0 record during several regression runs

Also created connection on cloud with dev version using our sandbox:

  • site_migration_detailstream is empty on our sanbox
  • all others streams mentioned above (including contact stream) retrieve the same amount of data as prod version (however we have only few records on each of these streams)

@DanyloGL
Copy link
Collaborator

@brianjlai I'm running regression without read command with state. Let's see results and decide if we can merge these changes

@brianjlai
Copy link
Contributor

brianjlai commented Feb 11, 2025

@DanyloGL thanks for the very detailed testing. love that you tried replicating this for the questionable contacts stream. I still don't see a correlation between @topefolorunso 's changes and contacts and the fact that we tested this on our side and contacts is emitting records does make it seem like this is a testing issue. The rest of the results being off by one seem acceptable and we do see unbilled_charge match the count exactly.

feel free to just start the progressive rollout. I think this is ready to go

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/connectors Connector related issues area/documentation Improvements or additions to documentation connectors/source/chargebee
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants