optimize email list id unnesting #28

Merged
merged 10 commits into from
May 3, 2023

Conversation

fivetran-jamie

@fivetran-jamie fivetran-jamie commented Apr 27, 2023

Are you a current Fivetran customer?

internal

What change(s) does this PR introduce?

  • updates Redshift's method for unnesting email list ids in int_iterable__list_user_unnest
  • adds incremental materialization to int_iterable__list_user_unnest
    • adds new unique_key field to this model to serve as the incremental key
    • adds new date_day field to partition by
  • changes materialization of int_iterable__list_user_history from view to table so we don't need to recreate it each time
  • adds a coalesce on previous_email_ids in the int_iterable__list_user_history model in case there are no previous email ids (i.e. the lag returns null)
  • adds comments/makes comment style consistent
  • updates flatten syntax for Snowflake users
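
For readers unfamiliar with the lag/coalesce change listed above, here is a minimal Python sketch of the idea, not the package's SQL. The function name `with_previous_ids` and the fallback value `"[]"` are assumptions made for illustration; the PR does not state what value the coalesce falls back to.

```python
from typing import List, Optional, Tuple


def with_previous_ids(history: List[str], default: str = "[]") -> List[Tuple[str, str]]:
    """Pair each snapshot of email list ids with the snapshot before it.

    Mirrors lag() over ordered rows: the first row has no predecessor,
    so lag would yield null. The coalesce step replaces that null with
    `default` (an assumed fallback, not taken from the package).
    """
    rows: List[Tuple[str, str]] = []
    previous: Optional[str] = None
    for current in history:
        # coalesce(previous, default)
        rows.append((current, previous if previous is not None else default))
        previous = current
    return rows


if __name__ == "__main__":
    for current, previous in with_previous_ids(["[1]", "[1,2]", "[2]"]):
        print(current, previous)
```

Without the fallback, the first row for each user would carry a null previous value, and any downstream comparison against it would also evaluate to null.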

Did you update the CHANGELOG?

  • Yes

Does this PR introduce a breaking change?

  • Yes (please provide breaking change details below.)
  • No (please provide an explanation as to how the change is non-breaking below.)

They need to run a full-refresh first to get the new unique_key column fully populated.

Did you update the dbt_project.yml files with the version upgrade (please leverage standard semantic versioning)? (In both your main project and integration_tests)

  • Yes

Is this PR in response to a previously created Bug or Feature Request

How did you test the PR changes?

  • Buildkite
  • Local (please provide additional testing details below)

I tested on our seed data, using both the currently published version of the package and this working branch. I compared the output of int_iterable__list_user_unnest from each and confirmed that list_ids were split out identically in Redshift using the old generate_series method vs. json_parse.
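
The comparison described above can be sketched in Python as an equivalence check between a positional split and a JSON parse. This is an illustration only: the bracketed, comma-separated input format and the helper names `unnest_by_position`, `unnest_by_json`, and `outputs_match` are assumptions, not the package's actual Redshift SQL.

```python
import json
from typing import List


def unnest_by_position(raw: str) -> List[str]:
    """Positional approach: strip the brackets, split on commas, and emit
    one element per index (loosely analogous to joining to a number series)."""
    inner = raw.strip().strip("[]")
    if not inner.strip():
        return []
    parts = inner.split(",")
    return [parts[i].strip() for i in range(len(parts))]


def unnest_by_json(raw: str) -> List[str]:
    """JSON approach: parse the string as an array and emit each element."""
    return [str(item) for item in json.loads(raw)]


def outputs_match(rows: List[str]) -> bool:
    """True when both approaches split every input row identically."""
    return all(unnest_by_position(r) == unnest_by_json(r) for r in rows)


if __name__ == "__main__":
    # Hypothetical seed values, assumed to be arrays of integer ids.
    print(outputs_match(["[1,2,3]", "[42]", "[]", "[7, 8]"]))
```

The check only holds for simple arrays of ids; values containing commas or quotes would make the positional split diverge from the JSON parse.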

Select which warehouse(s) were used to test the PR

  • BigQuery
  • Redshift
  • Snowflake
  • Postgres
  • Databricks
  • Other (provide details below)

Provide an emoji that best describes your current mood

🍹

Feedback

We are so excited you decided to contribute to the Fivetran community dbt package! We continue to work to improve the packages and would greatly appreciate your feedback on our existing dbt packages or what you'd like to see next.

@fivetran-jamie fivetran-jamie self-assigned this Apr 27, 2023
@fivetran-jamie fivetran-jamie marked this pull request as ready for review May 1, 2023 19:22

@fivetran-joemarkiewicz fivetran-joemarkiewicz left a comment

@fivetran-jamie thanks for working through this! I do have a few questions and requests for more documentation around how you tested the incremental strategy across warehouses. Since we are not testing the incremental strategy in Buildkite, I am a bit suspicious that it may not be working as expected in other warehouses.

Would you be able to provide some validation on your end as to how this incremental approach works for the insert_overwrite and delete+insert strategies? This is the first time we are implementing these strategies so I want to be extra critical and ensure they are working as expected. Let me know if you have any questions!
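
As a rough aid for that kind of validation, the two strategies can be modelled in a few lines of Python. This is a simplified sketch of the general dbt semantics, not the package's implementation; the field names `unique_key` and `date_day` come from the PR description, while the sample rows and function names are hypothetical.

```python
from typing import Dict, List

Row = Dict[str, object]


def delete_insert(target: List[Row], batch: List[Row], unique_key: str = "unique_key") -> List[Row]:
    """delete+insert: drop target rows whose key appears in the batch,
    then append the whole batch."""
    incoming = {row[unique_key] for row in batch}
    kept = [row for row in target if row[unique_key] not in incoming]
    return kept + batch


def insert_overwrite(target: List[Row], batch: List[Row], partition_by: str = "date_day") -> List[Row]:
    """insert_overwrite: replace every partition that appears in the batch
    with the batch's rows for that partition."""
    partitions = {row[partition_by] for row in batch}
    kept = [row for row in target if row[partition_by] not in partitions]
    return kept + batch


if __name__ == "__main__":
    # Hypothetical rows: an existing table and a new incremental batch.
    target = [
        {"unique_key": "a", "date_day": "2023-04-26"},
        {"unique_key": "b", "date_day": "2023-04-27"},
    ]
    batch = [
        {"unique_key": "b", "date_day": "2023-04-27"},
        {"unique_key": "c", "date_day": "2023-04-27"},
    ]
    print(sorted(r["unique_key"] for r in delete_insert(target, batch)))
    print(sorted(r["unique_key"] for r in insert_overwrite(target, batch)))
```

One way to validate either strategy is to compare the result of an incremental run against a full refresh over the same source data: no duplicate keys and no missing rows.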

@fivetran-joemarkiewicz fivetran-joemarkiewicz left a comment

This looks good! However, while doing my validation steps I noticed one small adjustment that was needed for Snowflake. Without it, the query returned no records 😱.

I will go over the validation steps during standup.
