Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

percent not null #4

Closed
drewbanin opened this issue Jul 17, 2017 · 2 comments
Closed

percent not null #4

drewbanin opened this issue Jul 17, 2017 · 2 comments
Labels

Comments

@drewbanin
Copy link
Contributor

drewbanin commented Jul 17, 2017

or more generally, % of records where X in some range

@switzer
Copy link

switzer commented Sep 23, 2020

I was looking for a macro where a percent was not null, and found this issue. I understand that you may want to make it a more general test, but here is the macro I wrote for not_null_percentage:

The test is as follows:

models:
  - name: my_table
    columns:
      - name: my_column
        tests:
          - not_null_percentage:
              minimum: 90
              maximum: 99

Minimum defaults to 0 if omitted, maximum defaults to 100 if omitted. Here is the code:

{% macro test_not_null_percentage(model, minimum=0, maximum=100) %}

{% set column_name = kwargs.get('column_name', kwargs.get('arg')) %}

WITH

validation AS (

    SELECT
    
        COUNT ({{ column_name }}) / COUNT (*) AS pct,

    FROM {{ model }}
),

validation_errors AS (

    SELECT
    
        CAST(pct * 100 AS INT64) AS pct,
        
    FROM 
    
        validation

    WHERE 

        pct <= {{ minimum }} / 100
        OR pct >= {{ maximum }} / 100
)

-- Return 0 if the number of NULL rows is within the min/max, otherwise return the percentage
SELECT IFNULL (SUM (pct), 0) FROM validation_errors

{% endmacro %}

I had to choose what to return when an error condition was met. I decided to return the percentage. The other option was determining the number of rows that were NULL. There are pros and cons to each.

If you have any feedback on how it can improve and be included in a future version of DBT, please let me know and I can submit as a pull request.

@bbrewington
Copy link

I think this functionality is now covered by "not null proportion" - https://github.com/dbt-labs/dbt-utils/tree/0.8.4/#not_null_proportion-source

@tconbeer tconbeer mentioned this issue Nov 22, 2024
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants