Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] Constant Expressions Error for Redshift #32

Open
1 of 4 tasks
fivetran-joemarkiewicz opened this issue Aug 15, 2024 · 3 comments
Open
1 of 4 tasks

[Bug] Constant Expressions Error for Redshift #32

fivetran-joemarkiewicz opened this issue Aug 15, 2024 · 3 comments
Assignees
Labels
error:unforced good first issue Good for newcomers status:in_review Currently in review type:bug Something is broken or incorrect type:wontfix This will not be worked on

Comments

@fivetran-joemarkiewicz
Copy link
Contributor

fivetran-joemarkiewicz commented Aug 15, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Describe the issue

Issue Details

Since the introduction of the union data feature in v0.10.0 we have seen an issue be introduced where Redshift users who are not leveraging the union data feature are seeing a constant expression error (see relevant error log for full error output). This error is caused because the union data macro will create the source_relation field and will insert an empty string if no union schemas/databases are provided in the respective variables. Therefore, since we have the empty string for all fields it is treated as a constant expression and therefore is not able to be included in a partition statement.

This issue commonly only happens when the upstream source table is not present (the package creates an empty staging model), but it can also happen when the upstream source table is present. We previously addressed this issue by applying the following code update in other Ad Reporting packages.

        case when id is null and _fivetran_synced is null 
            then row_number() over (partition by source_relation order by source_relation)
        else row_number() over (partition by source_relation, id order by _fivetran_synced desc) end = 1 as is_most_recent_record

While this does seem to work to address this constant expression issue when the upstream table is not present, it has been found to not address the issue when the constant expression issue is persisting even when the upstream source table is present.

Thankfully, in the recent release of the dbt_aws_cloud_cost dbt package we found a creative solution to this issue! Instead of the lengthy case when statement from before, we found that we can remove the source_relation field from the partition if the respective union schemas/databases variable is empty (since this issue does not occur when using the union schema variable. Therefore, the following code update in dbt_aws_cloud_cost addressed this very issue.

row_number() over (partition by bill_payer_account_id {{ ", source_relation" if var('aws_cloud_cost_sources', []) | length > 1 }} order by latest_start_date desc) = 1 as is_latest_name

Solution

We found that in some cases one solution works but the other doesn't. Fortunately, in the latest Twitter Organic PR #11 we found a solution to this very problem. To completely address the constant expression issue we will do the following:

  • Create a macro titled is_table_emtpy and change the dispatch to reference pinterest_source.
  • Creata a macro titled result_if_table_exists and change the dispatch to reference pinterest_source.
  • Where we do the window functions we apply a similar update to this.
  • Apply the above macro reference to each impacted window function in this package and any downstream cases.

This will then need to be applied to all other window functions in this package. Please ensure all window functions receive this update. From my initial scope I imagine this only needs to be applied to the is_most_recent_record fields in the *_history staging models.

Relevant error log or model output

constant expressions are not supported in partition by clauses compiled code

Expected behavior

The models are able to compile successfully on Redshift regardless of the constant expressions generated from the union data feature.

dbt Project configurations

Normal schema and database variables. No other custom configuration.

Package versions

packages:
  - package: fivetran/pinterest_source
    version: [">=0.10.0", "<0.11.0"]

What database are you using dbt with?

redshift

dbt Version

v1.8.x

Additional Context

It may also be worthwhile to inspect the downstream models to ensure we do or do not need to make similar updates there.

Are you willing to open a PR to help address this issue?

  • Yes.
  • Yes, but I will need assistance and will schedule time during our office hours for guidance
  • No.
@fivetran-joemarkiewicz fivetran-joemarkiewicz added good first issue Good for newcomers type:bug Something is broken or incorrect labels Aug 15, 2024
@fivetran-joemarkiewicz
Copy link
Contributor Author

fivetran-joemarkiewicz commented Aug 17, 2024

Unfortunately when working through a similar update I found that we may still need to include the case when statement along with the variable config to include/exclude the source_relation in the partition statement.

The issue seems to be that if there is no source table identified and an empty table is provided it will still have null records for the other fields and therefore has a high likelihood of failing with the constant expression issue. I will investigate and share my findings on what the correct path forward here should be to address the Redshift constant expression issue while also not impacting other warehouses.

Edit: Please see the updated issue description which takes into account the issue identified and the solution we can apply to address this.

@JessicaKMarkiewicz
Copy link

@fivetran-joemarkiewicz I was able to address this issue in the above PR. Let me know if you have any questions.

@fivetran-joemarkiewicz
Copy link
Contributor Author

Thanks so much for opening the PR @JessicaKMarkiewicz! I'll take a look at this today and let you know if I have any questions and I'll also share what you can expect regarding next steps for moving your PR along.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
error:unforced good first issue Good for newcomers status:in_review Currently in review type:bug Something is broken or incorrect type:wontfix This will not be worked on
Projects
None yet
Development

No branches or pull requests

2 participants