From c2c51a1479fede788e0786fcd8fb5249a5529e8f Mon Sep 17 00:00:00 2001 From: Joel Labes Date: Mon, 31 Jan 2022 20:18:07 +1300 Subject: [PATCH 01/17] Create create-table-of-contents.yml --- .../workflows/create-table-of-contents.yml | 23 +++++++++++++++++++ 1 file changed, 23 insertions(+) create mode 100644 .github/workflows/create-table-of-contents.yml diff --git a/.github/workflows/create-table-of-contents.yml b/.github/workflows/create-table-of-contents.yml new file mode 100644 index 00000000..ef1a44fc --- /dev/null +++ b/.github/workflows/create-table-of-contents.yml @@ -0,0 +1,23 @@ +# This is a basic workflow to help you get started with Actions + +name: CI + +# Controls when the workflow will run +on: + push: + branches: [main] + paths: ['README.md'] + +jobs: + build: + runs-on: ubuntu-latest + timeout-minutes: 5 + steps: + - uses: actions/checkout@v2 + - run: | + curl https://raw.githubusercontent.com/ekalinin/github-markdown-toc/master/gh-md-toc -o gh-md-toc + chmod a+x gh-md-toc + ./gh-md-toc --insert --no-backup README.md + - uses: stefanzweifel/git-auto-commit-action@v4 + with: + commit_message: Auto update markdown TOC From ef6459cbd6741f9fa108556dbaff4b064e8062e4 Mon Sep 17 00:00:00 2001 From: Joel Labes Date: Mon, 31 Jan 2022 20:18:25 +1300 Subject: [PATCH 02/17] test TOC --- README.md | 60 ++----------------------------------------------------- 1 file changed, 2 insertions(+), 58 deletions(-) diff --git a/README.md b/README.md index 66a3ac99..a17525f6 100644 --- a/README.md +++ b/README.md @@ -8,64 +8,8 @@ For compatibility details between versions of dbt-core and dbt-utils, [see this ---- ## Contents - -**[Schema tests](#schema-tests)** - - [equal_rowcount](#equal_rowcount-source) - - [fewer_rows_than](#fewer_rows_than-source) - - [equality](#equality-source) - - [expression_is_true](#expression_is_true-source) - - [recency](#recency-source) - - [at_least_one](#at_least_one-source) - - [not_constant](#not_constant-source) - - 
[cardinality_equality](#cardinality_equality-source) - - [unique_where](#unique_where-source) - - [not_null_where](#not_null_where-source) - - [not_null_proportion](#not_null_proportion-source) - - [relationships_where](#relationships_where-source) - - [mutually_exclusive_ranges](#mutually_exclusive_ranges-source) - - [unique_combination_of_columns](#unique_combination_of_columns-source) - - [accepted_range](#accepted_range-source) - -**[Macros](#macros)** - -- [Introspective macros](#introspective-macros): - - [get_column_values](#get_column_values-source) - - [get_relations_by_pattern](#get_relations_by_pattern-source) - - [get_relations_by_prefix](#get_relations_by_prefix-source) - - [get_query_results_as_dict](#get_query_results_as_dict-source) - -- [SQL generators](#sql-generators) - - [date_spine](#date_spine-source) - - [haversine_distance](#haversine_distance-source) - - [group_by](#group_by-source) - - [star](#star-source) - - [union_relations](#union_relations-source) - - [generate_series](#generate_series-source) - - [surrogate_key](#surrogate_key-source) - - [safe_add](#safe_add-source) - - [pivot](#pivot-source) - - [unpivot](#unpivot-source) - -- [Web macros](#web-macros) - - [get_url_parameter](#get_url_parameter-source) - - [get_url_host](#get_url_host-source) - - [get_url_path](#get_url_path-source) - -- [Cross-database macros](#cross-database-macros): - - [current_timestamp](#current_timestamp-source) - - [dateadd](#dateadd-source) - - [datediff](#datediff-source) - - [split_part](#split_part-source) - - [last_day](#last_day-source) - - [width_bucket](#width_bucket-source) - -- [Jinja Helpers](#jinja-helpers) - - [pretty_time](#pretty_time-source) - - [pretty_log_format](#pretty_log_format-source) - - [log_info](#log_info-source) - -[Materializations](#materializations): -- [insert_by_period](#insert_by_period-source) + + --- ### Schema Tests From 7695717bea58ff7a783566c24fc4d5102705f3a6 Mon Sep 17 00:00:00 2001 From: joellabes Date: Mon, 31 Jan 
2022 07:18:54 +0000 Subject: [PATCH 03/17] Auto update markdown TOC --- README.md | 64 ++++++++++ gh-md-toc | 361 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 425 insertions(+) create mode 100755 gh-md-toc diff --git a/README.md b/README.md index a17525f6..507a085e 100644 --- a/README.md +++ b/README.md @@ -9,6 +9,70 @@ For compatibility details between versions of dbt-core and dbt-utils, [see this ---- ## Contents + * [Installation Instructions](#installation-instructions) + * [Compatibility matrix](#compatibility-matrix) + * [Contents](#contents) + * [Schema Tests](#schema-tests) + * [equal_rowcount (source)](#equal_rowcount-source) + * [fewer_rows_than (source)](#fewer_rows_than-source) + * [equality (source)](#equality-source) + * [expression_is_true (source)](#expression_is_true-source) + * [recency (source)](#recency-source) + * [at_least_one (source)](#at_least_one-source) + * [not_constant (source)](#not_constant-source) + * [cardinality_equality (source)](#cardinality_equality-source) + * [unique_where (source)](#unique_where-source) + * [not_null_where (source)](#not_null_where-source) + * [not_null_proportion (source)](#not_null_proportion-source) + * [not_accepted_values (source)](#not_accepted_values-source) + * [relationships_where (source)](#relationships_where-source) + * [mutually_exclusive_ranges (source)](#mutually_exclusive_ranges-source) + * [sequential_values (source)](#sequential_values-source) + * [unique_combination_of_columns (source)](#unique_combination_of_columns-source) + * [accepted_range (source)](#accepted_range-source) + * [Macros](#macros) + * [Introspective macros](#introspective-macros) + * [get_column_values (source)](#get_column_values-source) + * [get_relations_by_pattern (source)](#get_relations_by_pattern-source) + * [get_relations_by_prefix (source)](#get_relations_by_prefix-source) + * [get_query_results_as_dict (source)](#get_query_results_as_dict-source) + * [SQL generators](#sql-generators) + 
* [date_spine (source)](#date_spine-source) + * [haversine_distance (source)](#haversine_distance-source) + * [group_by (source)](#group_by-source) + * [star (source)](#star-source) + * [union_relations (source)](#union_relations-source) + * [generate_series (source)](#generate_series-source) + * [surrogate_key (source)](#surrogate_key-source) + * [safe_add (source)](#safe_add-source) + * [pivot (source)](#pivot-source) + * [unpivot (source)](#unpivot-source) + * [Web macros](#web-macros) + * [get_url_parameter (source)](#get_url_parameter-source) + * [get_url_host (source)](#get_url_host-source) + * [get_url_path (source)](#get_url_path-source) + * [Cross-database macros](#cross-database-macros) + * [current_timestamp (source)](#current_timestamp-source) + * [dateadd (source)](#dateadd-source) + * [datediff (source)](#datediff-source) + * [split_part (source)](#split_part-source) + * [date_trunc (source)](#date_trunc-source) + * [last_day (source)](#last_day-source) + * [width_bucket (source)](#width_bucket-source) + * [Jinja Helpers](#jinja-helpers) + * [pretty_time (source)](#pretty_time-source) + * [pretty_log_format (source)](#pretty_log_format-source) + * [log_info (source)](#log_info-source) + * [slugify (source)](#slugify-source) + * [Materializations](#materializations) + * [insert_by_period (source)](#insert_by_period-source) + * [Contributing](#contributing) + * [Dispatch macros](#dispatch-macros) + * [Getting started with dbt](#getting-started-with-dbt) + * [Code of Conduct](#code-of-conduct) + + + --- diff --git a/gh-md-toc b/gh-md-toc new file mode 100755 index 00000000..ac5e183f --- /dev/null +++ b/gh-md-toc @@ -0,0 +1,361 @@ +#!/usr/bin/env bash + +# +# Steps: +# +# 1. Download corresponding html file for some README.md: +# curl -s $1 +# +# 2. Discard rows where no substring 'user-content-' (github's markup): +# awk '/user-content-/ { ... +# +# 3.1 Get last number in each row like ' ... sitemap.js.*<\/h/)+2, RLENGTH-5) +# +# 5. 
Find anchor and insert it inside "(...)": +# substr($0, match($0, "href=\"[^\"]+?\" ")+6, RLENGTH-8) +# + +gh_toc_version="0.8.0" + +gh_user_agent="gh-md-toc v$gh_toc_version" + +# +# Download rendered into html README.md by its url. +# +# +gh_toc_load() { + local gh_url=$1 + + if type curl &>/dev/null; then + curl --user-agent "$gh_user_agent" -s "$gh_url" + elif type wget &>/dev/null; then + wget --user-agent="$gh_user_agent" -qO- "$gh_url" + else + echo "Please, install 'curl' or 'wget' and try again." + exit 1 + fi +} + +# +# Converts local md file into html by GitHub +# +# -> curl -X POST --data '{"text": "Hello world github/linguist#1 **cool**, and #1!"}' https://api.github.com/markdown +#

<p>Hello world github/linguist#1 <strong>cool</strong>, and #1!</p>

'" +gh_toc_md2html() { + local gh_file_md=$1 + URL=https://api.github.com/markdown/raw + + if [ ! -z "$GH_TOC_TOKEN" ]; then + TOKEN=$GH_TOC_TOKEN + else + TOKEN_FILE="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)/token.txt" + if [ -f "$TOKEN_FILE" ]; then + TOKEN="$(cat $TOKEN_FILE)" + fi + fi + if [ ! -z "${TOKEN}" ]; then + AUTHORIZATION="Authorization: token ${TOKEN}" + fi + + # echo $URL 1>&2 + OUTPUT=$(curl -s \ + --user-agent "$gh_user_agent" \ + --data-binary @"$gh_file_md" \ + -H "Content-Type:text/plain" \ + -H "$AUTHORIZATION" \ + "$URL") + + if [ "$?" != "0" ]; then + echo "XXNetworkErrorXX" + fi + if [ "$(echo "${OUTPUT}" | awk '/API rate limit exceeded/')" != "" ]; then + echo "XXRateLimitXX" + else + echo "${OUTPUT}" + fi +} + + +# +# Is passed string url +# +gh_is_url() { + case $1 in + https* | http*) + echo "yes";; + *) + echo "no";; + esac +} + +# +# TOC generator +# +gh_toc(){ + local gh_src=$1 + local gh_src_copy=$1 + local gh_ttl_docs=$2 + local need_replace=$3 + local no_backup=$4 + local no_footer=$5 + + if [ "$gh_src" = "" ]; then + echo "Please, enter URL or local path for a README.md" + exit 1 + fi + + + # Show "TOC" string only if working with one document + if [ "$gh_ttl_docs" = "1" ]; then + + echo "Table of Contents" + echo "=================" + echo "" + gh_src_copy="" + + fi + + if [ "$(gh_is_url "$gh_src")" == "yes" ]; then + gh_toc_load "$gh_src" | gh_toc_grab "$gh_src_copy" + if [ "${PIPESTATUS[0]}" != "0" ]; then + echo "Could not load remote document." + echo "Please check your url or network connectivity" + exit 1 + fi + if [ "$need_replace" = "yes" ]; then + echo + echo "!! '$gh_src' is not a local file" + echo "!! Can't insert the TOC into it." 
+ echo + fi + else + local rawhtml=$(gh_toc_md2html "$gh_src") + if [ "$rawhtml" == "XXNetworkErrorXX" ]; then + echo "Parsing local markdown file requires access to github API" + echo "Please make sure curl is installed and check your network connectivity" + exit 1 + fi + if [ "$rawhtml" == "XXRateLimitXX" ]; then + echo "Parsing local markdown file requires access to github API" + echo "Error: You exceeded the hourly limit. See: https://developer.github.com/v3/#rate-limiting" + TOKEN_FILE="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)/token.txt" + echo "or place GitHub auth token here: ${TOKEN_FILE}" + exit 1 + fi + local toc=`echo "$rawhtml" | gh_toc_grab "$gh_src_copy"` + echo "$toc" + if [ "$need_replace" = "yes" ]; then + if grep -Fxq "" $gh_src && grep -Fxq "" $gh_src; then + echo "Found markers" + else + echo "You don't have or in your file...exiting" + exit 1 + fi + local ts="<\!--ts-->" + local te="<\!--te-->" + local dt=`date +'%F_%H%M%S'` + local ext=".orig.${dt}" + local toc_path="${gh_src}.toc.${dt}" + local toc_footer="" + # http://fahdshariff.blogspot.ru/2012/12/sed-mutli-line-replacement-between-two.html + # clear old TOC + sed -i${ext} "/${ts}/,/${te}/{//!d;}" "$gh_src" + # create toc file + echo "${toc}" > "${toc_path}" + if [ "${no_footer}" != "yes" ]; then + echo -e "\n${toc_footer}\n" >> "$toc_path" + fi + + # insert toc file + if [[ "`uname`" == "Darwin" ]]; then + sed -i "" "/${ts}/r ${toc_path}" "$gh_src" + else + sed -i "/${ts}/r ${toc_path}" "$gh_src" + fi + echo + if [ "${no_backup}" = "yes" ]; then + rm ${toc_path} ${gh_src}${ext} + fi + echo "!! TOC was added into: '$gh_src'" + if [ -z "${no_backup}" ]; then + echo "!! Origin version of the file: '${gh_src}${ext}'" + echo "!! TOC added into a separate file: '${toc_path}'" + fi + echo + fi + fi +} + +# +# Grabber of the TOC from rendered html +# +# $1 - a source url of document. +# It's need if TOC is generated for multiple documents. 
+# +gh_toc_grab() { + common_awk_script=' + modified_href = "" + split(href, chars, "") + for (i=1;i <= length(href); i++) { + c = chars[i] + res = "" + if (c == "+") { + res = " " + } else { + if (c == "%") { + res = "\\x" + } else { + res = c "" + } + } + modified_href = modified_href res + } + print sprintf("%*s", (level-1)*3, "") "* [" text "](" gh_url modified_href ")" + ' + if [ `uname -s` == "OS/390" ]; then + grepcmd="pcregrep -o" + echoargs="" + awkscript='{ + level = substr($0, length($0), 1) + text = substr($0, match($0, /a>.*<\/h/)+2, RLENGTH-5) + href = substr($0, match($0, "href=\"([^\"]+)?\"")+6, RLENGTH-7) + '"$common_awk_script"' + }' + else + grepcmd="grep -Eo" + echoargs="-e" + awkscript='{ + level = substr($0, length($0), 1) + text = substr($0, match($0, /a>.*<\/h/)+2, RLENGTH-5) + href = substr($0, match($0, "href=\"[^\"]+?\"")+6, RLENGTH-7) + '"$common_awk_script"' + }' + fi + href_regex='href=\"[^\"]+?\"' + + # if closed is on the new line, then move it on the prev line + # for example: + # was: The command foo1 + # + # became: The command foo1 + sed -e ':a' -e 'N' -e '$!ba' -e 's/\n<\/h/<\/h/g' | + + # find strings that corresponds to template + $grepcmd '//g' | sed 's/<\/code>//g' | + + # remove g-emoji + sed 's/]*[^<]*<\/g-emoji> //g' | + + # now all rows are like: + # ... / placeholders" + echo " $app_name - Create TOC for markdown from STDIN" + echo " $app_name --help Show help" + echo " $app_name --version Show version" + return + fi + + if [ "$1" = '--version' ]; then + echo "$gh_toc_version" + echo + echo "os: `lsb_release -d | cut -f 2`" + echo "kernel: `cat /proc/version`" + echo "shell: `$SHELL --version`" + echo + for tool in curl wget grep awk sed; do + printf "%-5s: " $tool + echo `$tool --version | head -n 1` + done + return + fi + + if [ "$1" = "-" ]; then + if [ -z "$TMPDIR" ]; then + TMPDIR="/tmp" + elif [ -n "$TMPDIR" -a ! 
-d "$TMPDIR" ]; then + mkdir -p "$TMPDIR" + fi + local gh_tmp_md + if [ `uname -s` == "OS/390" ]; then + local timestamp=$(date +%m%d%Y%H%M%S) + gh_tmp_md="$TMPDIR/tmp.$timestamp" + else + gh_tmp_md=$(mktemp $TMPDIR/tmp.XXXXXX) + fi + while read input; do + echo "$input" >> "$gh_tmp_md" + done + gh_toc_md2html "$gh_tmp_md" | gh_toc_grab "" + return + fi + + if [ "$1" = '--insert' ]; then + need_replace="yes" + shift + fi + + if [ "$1" = '--no-backup' ]; then + need_replace="yes" + no_backup="yes" + shift + fi + + if [ "$1" = '--hide-footer' ]; then + need_replace="yes" + no_footer="yes" + shift + fi + + for md in "$@" + do + echo "" + gh_toc "$md" "$#" "$need_replace" "$no_backup" "$no_footer" + done + + echo "" + echo "Created by [gh-md-toc](https://github.com/ekalinin/github-markdown-toc)" +} + +# +# Entry point +# +gh_toc_app "$@" + From a442ddb1dc61154837d9945bff8a52345ae5fd53 Mon Sep 17 00:00:00 2001 From: Joel Labes Date: Mon, 31 Jan 2022 20:28:04 +1300 Subject: [PATCH 04/17] Fix indentation of schema tests --- README.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index 507a085e..d0ac554e 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,4 @@ -This [dbt](https://github.com/dbt-labs/dbt) package contains macros that can be (re)used across dbt projects. +This [dbt](https://github.com/dbt-labs/dbt-core) package contains macros that can be (re)used across dbt projects. ## Installation Instructions Check [dbt Hub](https://hub.getdbt.com/dbt-labs/dbt_utils/latest/) for the latest installation instructions, or [read the docs](https://docs.getdbt.com/docs/package-management) for more information on installing packages. 
@@ -9,6 +9,7 @@ For compatibility details between versions of dbt-core and dbt-utils, [see this ---- ## Contents + * [Installation Instructions](#installation-instructions) * [Compatibility matrix](#compatibility-matrix) * [Contents](#contents) @@ -76,7 +77,7 @@ For compatibility details between versions of dbt-core and dbt-utils, [see this --- -### Schema Tests +## Schema Tests #### equal_rowcount ([source](macros/schema_tests/equal_rowcount.sql)) This schema test asserts the that two relations have the same number of rows. From d3550772ce4bd52e03463898651b88d7e7216e3e Mon Sep 17 00:00:00 2001 From: joellabes Date: Mon, 31 Jan 2022 07:28:22 +0000 Subject: [PATCH 05/17] Auto update markdown TOC --- README.md | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/README.md b/README.md index d0ac554e..c7584b68 100644 --- a/README.md +++ b/README.md @@ -9,11 +9,10 @@ For compatibility details between versions of dbt-core and dbt-utils, [see this ---- ## Contents - * [Installation Instructions](#installation-instructions) * [Compatibility matrix](#compatibility-matrix) * [Contents](#contents) - * [Schema Tests](#schema-tests) + * [Schema Tests](#schema-tests) * [equal_rowcount (source)](#equal_rowcount-source) * [fewer_rows_than (source)](#fewer_rows_than-source) * [equality (source)](#equality-source) @@ -72,7 +71,7 @@ For compatibility details between versions of dbt-core and dbt-utils, [see this * [Getting started with dbt](#getting-started-with-dbt) * [Code of Conduct](#code-of-conduct) - + From 3b78b7b068019f0cd0fbdff0f7a523f3edebe704 Mon Sep 17 00:00:00 2001 From: Joel Labes Date: Mon, 31 Jan 2022 20:30:11 +1300 Subject: [PATCH 06/17] Outdent all schema tests by one --- README.md | 32 ++++++++++++++++---------------- 1 file changed, 16 insertions(+), 16 deletions(-) diff --git a/README.md b/README.md index d0ac554e..b5705596 100644 --- a/README.md +++ b/README.md @@ -78,7 +78,7 @@ For compatibility details between versions of dbt-core and 
dbt-utils, [see this --- ## Schema Tests -#### equal_rowcount ([source](macros/schema_tests/equal_rowcount.sql)) +### equal_rowcount ([source](macros/schema_tests/equal_rowcount.sql)) This schema test asserts the that two relations have the same number of rows. **Usage:** @@ -93,7 +93,7 @@ models: ``` -#### fewer_rows_than ([source](macros/schema_tests/fewer_rows_than.sql)) +### fewer_rows_than ([source](macros/schema_tests/fewer_rows_than.sql)) This schema test asserts that this model has fewer rows than the referenced model. Usage: @@ -107,7 +107,7 @@ models: compare_model: ref('other_table_name') ``` -#### equality ([source](macros/schema_tests/equality.sql)) +### equality ([source](macros/schema_tests/equality.sql)) This schema test asserts the equality of two relations. Optionally specify a subset of columns to compare. **Usage:** @@ -124,7 +124,7 @@ models: - second_column ``` -#### expression_is_true ([source](macros/schema_tests/expression_is_true.sql)) +### expression_is_true ([source](macros/schema_tests/expression_is_true.sql)) This schema test asserts that a valid sql expression is true for all records. This is useful when checking integrity across columns, for example, that a total is equal to the sum of its parts, or that at least one column is true. **Usage:** @@ -172,7 +172,7 @@ models: condition: col_a = 1 ``` -#### recency ([source](macros/schema_tests/recency.sql)) +### recency ([source](macros/schema_tests/recency.sql)) This schema test asserts that there is data in the referenced model at least as recent as the defined interval prior to the current timestamp. **Usage:** @@ -188,7 +188,7 @@ models: interval: 1 ``` -#### at_least_one ([source](macros/schema_tests/at_least_one.sql)) +### at_least_one ([source](macros/schema_tests/at_least_one.sql)) This schema test asserts if column has at least one value. 
**Usage:** @@ -203,7 +203,7 @@ models: - dbt_utils.at_least_one ``` -#### not_constant ([source](macros/schema_tests/not_constant.sql)) +### not_constant ([source](macros/schema_tests/not_constant.sql)) This schema test asserts if column does not have same value in all rows. **Usage:** @@ -218,7 +218,7 @@ models: - dbt_utils.not_constant ``` -#### cardinality_equality ([source](macros/schema_tests/cardinality_equality.sql)) +### cardinality_equality ([source](macros/schema_tests/cardinality_equality.sql)) This schema test asserts if values in a given column have exactly the same cardinality as values from a different column in a different model. **Usage:** @@ -235,7 +235,7 @@ models: to: ref('other_model_name') ``` -#### unique_where ([source](macros/schema_tests/test_unique_where.sql)) +### unique_where ([source](macros/schema_tests/test_unique_where.sql)) This test validates that there are no duplicate values present in a field for a subset of rows by specifying a `where` clause. *Warning*: This test is no longer supported. Starting in dbt v0.20.0, the built-in `unique` test supports a `where` config. [See the dbt docs for more details](https://docs.getdbt.com/reference/resource-configs/where). @@ -253,7 +253,7 @@ models: where: "_deleted = false" ``` -#### not_null_where ([source](macros/schema_tests/test_not_null_where.sql)) +### not_null_where ([source](macros/schema_tests/test_not_null_where.sql)) This test validates that there are no null values present in a column for a subset of rows by specifying a `where` clause. *Warning*: This test is no longer supported. Starting in dbt v0.20.0, the built-in `not_null` test supports a `where` config. [See the dbt docs for more details](https://docs.getdbt.com/reference/resource-configs/where). 
@@ -271,7 +271,7 @@ models: where: "_deleted = false" ``` -#### not_null_proportion ([source](macros/schema_tests/not_null_proportion.sql)) +### not_null_proportion ([source](macros/schema_tests/not_null_proportion.sql)) This test validates that the proportion of non-null values present in a column is between a specified range [`at_least`, `at_most`] where `at_most` is an optional argument (default: `1.0`). **Usage:** @@ -287,7 +287,7 @@ models: at_least: 0.95 ``` -#### not_accepted_values ([source](macros/schema_tests/not_accepted_values.sql)) +### not_accepted_values ([source](macros/schema_tests/not_accepted_values.sql)) This test validates that there are no rows that match the given values. Usage: @@ -303,7 +303,7 @@ models: values: ['Barcelona', 'New York'] ``` -#### relationships_where ([source](macros/schema_tests/relationships_where.sql)) +### relationships_where ([source](macros/schema_tests/relationships_where.sql)) This test validates the referential integrity between two relations (same as the core relationships schema test) with an added predicate to filter out some rows from the test. This is useful to exclude records such as test entities, rows created in the last X minutes/hours to account for temporary gaps due to ETL limitations, etc. **Usage:** @@ -321,7 +321,7 @@ models: from_condition: id <> '4ca448b8-24bf-4b88-96c6-b1609499c38b' ``` -#### mutually_exclusive_ranges ([source](macros/schema_tests/mutually_exclusive_ranges.sql)) +### mutually_exclusive_ranges ([source](macros/schema_tests/mutually_exclusive_ranges.sql)) This test confirms that for a given lower_bound_column and upper_bound_column, the ranges of between the lower and upper bounds do not overlap with the ranges of another row. 
@@ -434,7 +434,7 @@ Here are a number of examples for each allowed `zero_length_range_allowed` argum | 2 | 2 | | 3 | 4 | -#### sequential_values ([source](macros/schema_tests/sequential_values.sql)) +### sequential_values ([source](macros/schema_tests/sequential_values.sql)) This test confirms that a column contains sequential values. It can be used for both numeric values, and datetime values, as follows: ```yml @@ -498,7 +498,7 @@ An optional `quote_columns` argument (`default=false`) can also be used if a col ``` -#### accepted_range ([source](macros/schema_tests/accepted_range.sql)) +### accepted_range ([source](macros/schema_tests/accepted_range.sql)) This test checks that a column's values fall inside an expected range. Any combination of `min_value` and `max_value` is allowed, and the range can be inclusive or exclusive. Provide a `where` argument to filter to specific records only. In addition to comparisons to a scalar value, you can also compare to another column's values. Any data type that supports the `>` or `<` operators can be compared, so you could also run tests like checking that all order dates are in the past. 
From 5580b76a3c77dbf48cade7a26a315398ec3b485f Mon Sep 17 00:00:00 2001 From: Joel Labes Date: Mon, 31 Jan 2022 20:31:04 +1300 Subject: [PATCH 07/17] tweak workflow file --- .github/workflows/create-table-of-contents.yml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/.github/workflows/create-table-of-contents.yml b/.github/workflows/create-table-of-contents.yml index ef1a44fc..99dbd267 100644 --- a/.github/workflows/create-table-of-contents.yml +++ b/.github/workflows/create-table-of-contents.yml @@ -1,6 +1,6 @@ # This is a basic workflow to help you get started with Actions -name: CI +name: Update table of contents # Controls when the workflow will run on: @@ -20,4 +20,4 @@ jobs: ./gh-md-toc --insert --no-backup README.md - uses: stefanzweifel/git-auto-commit-action@v4 with: - commit_message: Auto update markdown TOC + commit_message: Auto update table of contents From bde892f1a77cd2998b8f29639f7788c61e403f8e Mon Sep 17 00:00:00 2001 From: joellabes Date: Mon, 31 Jan 2022 07:31:18 +0000 Subject: [PATCH 08/17] Auto update table of contents --- README.md | 34 +++++++++++++++++----------------- 1 file changed, 17 insertions(+), 17 deletions(-) diff --git a/README.md b/README.md index 48f0a611..c0fec34c 100644 --- a/README.md +++ b/README.md @@ -13,23 +13,23 @@ For compatibility details between versions of dbt-core and dbt-utils, [see this * [Compatibility matrix](#compatibility-matrix) * [Contents](#contents) * [Schema Tests](#schema-tests) - * [equal_rowcount (source)](#equal_rowcount-source) - * [fewer_rows_than (source)](#fewer_rows_than-source) - * [equality (source)](#equality-source) - * [expression_is_true (source)](#expression_is_true-source) - * [recency (source)](#recency-source) - * [at_least_one (source)](#at_least_one-source) - * [not_constant (source)](#not_constant-source) - * [cardinality_equality (source)](#cardinality_equality-source) - * [unique_where (source)](#unique_where-source) - * [not_null_where 
(source)](#not_null_where-source) - * [not_null_proportion (source)](#not_null_proportion-source) - * [not_accepted_values (source)](#not_accepted_values-source) - * [relationships_where (source)](#relationships_where-source) - * [mutually_exclusive_ranges (source)](#mutually_exclusive_ranges-source) - * [sequential_values (source)](#sequential_values-source) + * [equal_rowcount (source)](#equal_rowcount-source) + * [fewer_rows_than (source)](#fewer_rows_than-source) + * [equality (source)](#equality-source) + * [expression_is_true (source)](#expression_is_true-source) + * [recency (source)](#recency-source) + * [at_least_one (source)](#at_least_one-source) + * [not_constant (source)](#not_constant-source) + * [cardinality_equality (source)](#cardinality_equality-source) + * [unique_where (source)](#unique_where-source) + * [not_null_where (source)](#not_null_where-source) + * [not_null_proportion (source)](#not_null_proportion-source) + * [not_accepted_values (source)](#not_accepted_values-source) + * [relationships_where (source)](#relationships_where-source) + * [mutually_exclusive_ranges (source)](#mutually_exclusive_ranges-source) + * [sequential_values (source)](#sequential_values-source) * [unique_combination_of_columns (source)](#unique_combination_of_columns-source) - * [accepted_range (source)](#accepted_range-source) + * [accepted_range (source)](#accepted_range-source) * [Macros](#macros) * [Introspective macros](#introspective-macros) * [get_column_values (source)](#get_column_values-source) @@ -71,7 +71,7 @@ For compatibility details between versions of dbt-core and dbt-utils, [see this * [Getting started with dbt](#getting-started-with-dbt) * [Code of Conduct](#code-of-conduct) - + From 6f197940127c9479640e91945ac3190e0ca9a33a Mon Sep 17 00:00:00 2001 From: Joel Labes Date: Mon, 31 Jan 2022 20:32:27 +1300 Subject: [PATCH 09/17] missed one --- README.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/README.md b/README.md index 
48f0a611..4e356e91 100644 --- a/README.md +++ b/README.md @@ -7,7 +7,6 @@ Check [dbt Hub](https://hub.getdbt.com/dbt-labs/dbt_utils/latest/) for the lates For compatibility details between versions of dbt-core and dbt-utils, [see this spreadsheet](https://docs.google.com/spreadsheets/d/1RoDdC69auAtrwiqmkRsgcFdZ3MdNpeKcJrWkmEpXVIs/edit#gid=0). ---- -## Contents * [Installation Instructions](#installation-instructions) * [Compatibility matrix](#compatibility-matrix) @@ -461,7 +460,7 @@ seeds: * `interval` (default=1): The gap between two sequential values * `datepart` (default=None): Used when the gaps are a unit of time. If omitted, the test will check for a numeric gap. -#### unique_combination_of_columns ([source](macros/schema_tests/unique_combination_of_columns.sql)) +### unique_combination_of_columns ([source](macros/schema_tests/unique_combination_of_columns.sql)) This test confirms that the combination of columns is unique. For example, the combination of month and product is unique, however neither column is unique in isolation. 
From 0c673408cd4960228ad38e9d4e9ac7a432ef0d18 Mon Sep 17 00:00:00 2001 From: joellabes Date: Mon, 31 Jan 2022 07:32:55 +0000 Subject: [PATCH 10/17] Auto update table of contents --- README.md | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/README.md b/README.md index c4d96173..7f564d4f 100644 --- a/README.md +++ b/README.md @@ -10,7 +10,6 @@ For compatibility details between versions of dbt-core and dbt-utils, [see this * [Installation Instructions](#installation-instructions) * [Compatibility matrix](#compatibility-matrix) - * [Contents](#contents) * [Schema Tests](#schema-tests) * [equal_rowcount (source)](#equal_rowcount-source) * [fewer_rows_than (source)](#fewer_rows_than-source) @@ -27,7 +26,7 @@ For compatibility details between versions of dbt-core and dbt-utils, [see this * [relationships_where (source)](#relationships_where-source) * [mutually_exclusive_ranges (source)](#mutually_exclusive_ranges-source) * [sequential_values (source)](#sequential_values-source) - * [unique_combination_of_columns (source)](#unique_combination_of_columns-source) + * [unique_combination_of_columns (source)](#unique_combination_of_columns-source) * [accepted_range (source)](#accepted_range-source) * [Macros](#macros) * [Introspective macros](#introspective-macros) @@ -70,7 +69,7 @@ For compatibility details between versions of dbt-core and dbt-utils, [see this * [Getting started with dbt](#getting-started-with-dbt) * [Code of Conduct](#code-of-conduct) - + From 0c32bf3234e8e22a6d3360a88cb866041ae5c42c Mon Sep 17 00:00:00 2001 From: Joel Labes Date: Mon, 31 Jan 2022 20:37:33 +1300 Subject: [PATCH 11/17] Move stuff around --- README.md | 165 +++++++++++++++++++++++++++--------------------------- 1 file changed, 83 insertions(+), 82 deletions(-) diff --git a/README.md b/README.md index 7f564d4f..1cc0d22e 100644 --- a/README.md +++ b/README.md @@ -10,6 +10,7 @@ For compatibility details between versions of dbt-core and dbt-utils, [see this * [Installation 
Instructions](#installation-instructions) * [Compatibility matrix](#compatibility-matrix) + * [Contents](#contents) * [Schema Tests](#schema-tests) * [equal_rowcount (source)](#equal_rowcount-source) * [fewer_rows_than (source)](#fewer_rows_than-source) @@ -26,7 +27,7 @@ For compatibility details between versions of dbt-core and dbt-utils, [see this * [relationships_where (source)](#relationships_where-source) * [mutually_exclusive_ranges (source)](#mutually_exclusive_ranges-source) * [sequential_values (source)](#sequential_values-source) - * [unique_combination_of_columns (source)](#unique_combination_of_columns-source) + * [unique_combination_of_columns (source)](#unique_combination_of_columns-source) * [accepted_range (source)](#accepted_range-source) * [Macros](#macros) * [Introspective macros](#introspective-macros) @@ -69,7 +70,7 @@ For compatibility details between versions of dbt-core and dbt-utils, [see this * [Getting started with dbt](#getting-started-with-dbt) * [Code of Conduct](#code-of-conduct) - + @@ -931,12 +932,79 @@ This macro extracts a page path from a column containing a url. ``` {{ dbt_utils.get_url_path(field='page_url') }} ``` + +--- +### Jinja Helpers +#### pretty_time ([source](macros/jinja_helpers/pretty_time.sql)) +This macro returns a string of the current timestamp, optionally taking a datestring format. +```sql +{#- This will return a string like '14:50:34' -#} +{{ dbt_utils.pretty_time() }} + +{#- This will return a string like '2019-05-02 14:50:34' -#} +{{ dbt_utils.pretty_time(format='%Y-%m-%d %H:%M:%S') }} +``` + +#### pretty_log_format ([source](macros/jinja_helpers/pretty_log_format.sql)) +This macro formats the input in a way that will print nicely to the command line when you `log` it. 
+```sql +{#- This will return a string like: +"11:07:31 + my pretty message" +-#} + +{{ dbt_utils.pretty_log_format("my pretty message") }} +``` +#### log_info ([source](macros/jinja_helpers/log_info.sql)) +This macro logs a formatted message (with a timestamp) to the command line. +```sql +{{ dbt_utils.log_info("my pretty message") }} +``` + +``` +11:07:28 | 1 of 1 START table model analytics.fct_orders........................ [RUN] +11:07:31 + my pretty message +``` + +#### slugify ([source](macros/jinja_helpers/slugify.sql)) +This macro is useful for transforming Jinja strings into "slugs", and can be useful when using a Jinja object as a column name, especially when that Jinja object is not hardcoded. + +For this example, let's pretend that we have payment methods in our payments table like `['venmo App', 'ca$h-money']`, which we can't use as a column name due to the spaces and special characters. This macro does its best to strip those out in a sensible way: `['venmo_app', +'cah_money']`. + +```sql +{%- set payment_methods = dbt_utils.get_column_values( + table=ref('raw_payments'), + column='payment_method' +) -%} + +select +order_id, +{%- for payment_method in payment_methods %} +sum(case when payment_method = '{{ payment_method }}' then amount end) + as {{ dbt_utils.slugify(payment_method) }}_amount, + +{% endfor %} +... +``` + +```sql +select +order_id, + +sum(case when payment_method = 'Venmo App' then amount end) + as venmo_app_amount, + +sum(case when payment_method = 'ca$h money' then amount end) + as cah_money_amount, +... +``` + ---- -### Cross-database macros +## Cross-database macros These macros make it easier for package authors (especially those writing modeling packages) to implement cross-database compatibility. 
In general, you should not use these macros in your own dbt project (unless it is a package) -#### current_timestamp ([source](macros/cross_db_utils/current_timestamp.sql)) +### current_timestamp ([source](macros/cross_db_utils/current_timestamp.sql)) This macro returns the current timestamp. **Usage:** @@ -944,7 +1012,7 @@ This macro returns the current timestamp. {{ dbt_utils.current_timestamp() }} ``` -#### dateadd ([source](macros/cross_db_utils/dateadd.sql)) +### dateadd ([source](macros/cross_db_utils/dateadd.sql)) This macro adds a time/day interval to the supplied date/timestamp. Note: The `datepart` argument is database-specific. **Usage:** @@ -952,7 +1020,7 @@ This macro adds a time/day interval to the supplied date/timestamp. Note: The `d {{ dbt_utils.dateadd(datepart='day', interval=1, from_date_or_timestamp="'2017-01-01'") }} ``` -#### datediff ([source](macros/cross_db_utils/datediff.sql)) +### datediff ([source](macros/cross_db_utils/datediff.sql)) This macro calculates the difference between two dates. **Usage:** @@ -960,7 +1028,7 @@ This macro calculates the difference between two dates. {{ dbt_utils.datediff("'2018-01-01'", "'2018-01-20'", 'day') }} ``` -#### split_part ([source](macros/cross_db_utils/split_part.sql)) +### split_part ([source](macros/cross_db_utils/split_part.sql)) This macro splits a string of text using the supplied delimiter and returns the supplied part number (1-indexed). **Usage:** @@ -968,7 +1036,7 @@ This macro splits a string of text using the supplied delimiter and returns the {{ dbt_utils.split_part(string_text='1,2,3', delimiter_text=',', part_number=1) }} ``` -#### date_trunc ([source](macros/cross_db_utils/date_trunc.sql)) +### date_trunc ([source](macros/cross_db_utils/date_trunc.sql)) Truncates a date or timestamp to the specified datepart. Note: The `datepart` argument is database-specific. **Usage:** @@ -976,7 +1044,7 @@ Truncates a date or timestamp to the specified datepart. 
Note: The `datepart` ar {{ dbt_utils.date_trunc(datepart, date) }} ``` -#### last_day ([source](macros/cross_db_utils/last_day.sql)) +### last_day ([source](macros/cross_db_utils/last_day.sql)) Gets the last day for a given date and datepart. Notes: - The `datepart` argument is database-specific. @@ -987,7 +1055,7 @@ Gets the last day for a given date and datepart. Notes: {{ dbt_utils.last_day(date, datepart) }} ``` -#### width_bucket ([source](macros/cross_db_utils/width_bucket.sql)) +### width_bucket ([source](macros/cross_db_utils/width_bucket.sql)) This macro is modeled after the `width_bucket` function natively available in Snowflake. From the original Snowflake [documentation](https://docs.snowflake.net/manuals/sql-reference/functions/width_bucket.html): @@ -1012,75 +1080,8 @@ When an expression falls outside the range, the function returns: {{ dbt_utils.width_bucket(expr, min_value, max_value, num_buckets) }} ``` - ---- -### Jinja Helpers -#### pretty_time ([source](macros/jinja_helpers/pretty_time.sql)) -This macro returns a string of the current timestamp, optionally taking a datestring format. -```sql -{#- This will return a string like '14:50:34' -#} -{{ dbt_utils.pretty_time() }} - -{#- This will return a string like '2019-05-02 14:50:34' -#} -{{ dbt_utils.pretty_time(format='%Y-%m-%d %H:%M:%S') }} -``` - -#### pretty_log_format ([source](macros/jinja_helpers/pretty_log_format.sql)) -This macro formats the input in a way that will print nicely to the command line when you `log` it. -```sql -{#- This will return a string like: -"11:07:31 + my pretty message" --#} - -{{ dbt_utils.pretty_log_format("my pretty message") }} -``` -#### log_info ([source](macros/jinja_helpers/log_info.sql)) -This macro logs a formatted message (with a timestamp) to the command line. -```sql -{{ dbt_utils.log_info("my pretty message") }} -``` - -``` -11:07:28 | 1 of 1 START table model analytics.fct_orders........................ 
[RUN] -11:07:31 + my pretty message -``` - -#### slugify ([source](macros/jinja_helpers/slugify.sql)) -This macro is useful for transforming Jinja strings into "slugs", and can be useful when using a Jinja object as a column name, especially when that Jinja object is not hardcoded. - -For this example, let's pretend that we have payment methods in our payments table like `['venmo App', 'ca$h-money']`, which we can't use as a column name due to the spaces and special characters. This macro does its best to strip those out in a sensible way: `['venmo_app', -'cah_money']`. - -```sql -{%- set payment_methods = dbt_utils.get_column_values( - table=ref('raw_payments'), - column='payment_method' -) -%} - -select -order_id, -{%- for payment_method in payment_methods %} -sum(case when payment_method = '{{ payment_method }}' then amount end) - as {{ dbt_utils.slugify(payment_method) }}_amount, - -{% endfor %} -... -``` - -```sql -select -order_id, - -sum(case when payment_method = 'Venmo App' then amount end) - as venmo_app_amount, - -sum(case when payment_method = 'ca$h money' then amount end) - as cah_money_amount, -... -``` - -### Materializations -#### insert_by_period ([source](macros/materializations/insert_by_period_materialization.sql)) +## Materializations +### insert_by_period ([source](macros/materializations/insert_by_period_materialization.sql)) `insert_by_period` allows dbt to insert records into a table one period (i.e. day, week) at a time. This materialization is appropriate for event data that can be processed in discrete periods. It is similar in concept to the built-in incremental materialization, but has the added benefit of building the model in chunks even during a full-refresh so is particularly useful for models where the initial run can be problematic. @@ -1135,13 +1136,13 @@ A useful workaround is to change the above post-hook to: ---- -### Contributing +## Contributing We welcome contributions to this repo! 
To contribute a new feature or a fix, please open a Pull Request with 1) your changes, 2) updated documentation for the `README.md` file, and 3) a working integration test. See [this page](integration_tests/README.md) for more information. ---- -### Dispatch macros +## Dispatch macros **Note:** This is primarily relevant to: - Users and maintainers of community-supported [adapter plugins](https://docs.getdbt.com/docs/available-adapters) @@ -1177,7 +1178,7 @@ dbt_utils.default__datediff ---- -### Getting started with dbt +## Getting started with dbt - [What is dbt](https://docs.getdbt.com/docs/introduction)? - Read the [dbt viewpoint](https://docs.getdbt.com/docs/about/viewpoint) From 9df3bf15eb358f2859ace92e95514d299b61839e Mon Sep 17 00:00:00 2001 From: joellabes Date: Mon, 31 Jan 2022 07:37:48 +0000 Subject: [PATCH 12/17] Auto update table of contents --- README.md | 31 +++++++++++++++---------------- 1 file changed, 15 insertions(+), 16 deletions(-) diff --git a/README.md b/README.md index 1cc0d22e..7f67a04e 100644 --- a/README.md +++ b/README.md @@ -10,7 +10,6 @@ For compatibility details between versions of dbt-core and dbt-utils, [see this * [Installation Instructions](#installation-instructions) * [Compatibility matrix](#compatibility-matrix) - * [Contents](#contents) * [Schema Tests](#schema-tests) * [equal_rowcount (source)](#equal_rowcount-source) * [fewer_rows_than (source)](#fewer_rows_than-source) @@ -27,7 +26,7 @@ For compatibility details between versions of dbt-core and dbt-utils, [see this * [relationships_where (source)](#relationships_where-source) * [mutually_exclusive_ranges (source)](#mutually_exclusive_ranges-source) * [sequential_values (source)](#sequential_values-source) - * [unique_combination_of_columns (source)](#unique_combination_of_columns-source) + * [unique_combination_of_columns (source)](#unique_combination_of_columns-source) * [accepted_range (source)](#accepted_range-source) * [Macros](#macros) * [Introspective 
macros](#introspective-macros) @@ -50,27 +49,27 @@ For compatibility details between versions of dbt-core and dbt-utils, [see this * [get_url_parameter (source)](#get_url_parameter-source) * [get_url_host (source)](#get_url_host-source) * [get_url_path (source)](#get_url_path-source) - * [Cross-database macros](#cross-database-macros) - * [current_timestamp (source)](#current_timestamp-source) - * [dateadd (source)](#dateadd-source) - * [datediff (source)](#datediff-source) - * [split_part (source)](#split_part-source) - * [date_trunc (source)](#date_trunc-source) - * [last_day (source)](#last_day-source) - * [width_bucket (source)](#width_bucket-source) * [Jinja Helpers](#jinja-helpers) * [pretty_time (source)](#pretty_time-source) * [pretty_log_format (source)](#pretty_log_format-source) * [log_info (source)](#log_info-source) * [slugify (source)](#slugify-source) - * [Materializations](#materializations) - * [insert_by_period (source)](#insert_by_period-source) - * [Contributing](#contributing) - * [Dispatch macros](#dispatch-macros) - * [Getting started with dbt](#getting-started-with-dbt) + * [Cross-database macros](#cross-database-macros) + * [current_timestamp (source)](#current_timestamp-source) + * [dateadd (source)](#dateadd-source) + * [datediff (source)](#datediff-source) + * [split_part (source)](#split_part-source) + * [date_trunc (source)](#date_trunc-source) + * [last_day (source)](#last_day-source) + * [width_bucket (source)](#width_bucket-source) + * [Materializations](#materializations) + * [insert_by_period (source)](#insert_by_period-source) + * [Contributing](#contributing) + * [Dispatch macros](#dispatch-macros) + * [Getting started with dbt](#getting-started-with-dbt) * [Code of Conduct](#code-of-conduct) - + From 6ae44d633c30583a7a2acfda269100b7ef478968 Mon Sep 17 00:00:00 2001 From: Joel Labes Date: Mon, 31 Jan 2022 20:45:28 +1300 Subject: [PATCH 13/17] Cleanup after self --- .github/workflows/create-table-of-contents.yml | 1 + README.md | 
2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/.github/workflows/create-table-of-contents.yml b/.github/workflows/create-table-of-contents.yml index 99dbd267..08d8f5fd 100644 --- a/.github/workflows/create-table-of-contents.yml +++ b/.github/workflows/create-table-of-contents.yml @@ -18,6 +18,7 @@ jobs: curl https://raw.githubusercontent.com/ekalinin/github-markdown-toc/master/gh-md-toc -o gh-md-toc chmod a+x gh-md-toc ./gh-md-toc --insert --no-backup README.md + rm ./gh-md-toc - uses: stefanzweifel/git-auto-commit-action@v4 with: commit_message: Auto update table of contents diff --git a/README.md b/README.md index 1cc0d22e..7225e58d 100644 --- a/README.md +++ b/README.md @@ -1073,7 +1073,7 @@ Notes: When an expression falls outside the range, the function returns: - `0` if the expression is less than min_value. - `num_buckets + 1` if the expression is greater than or equal to max_value. - + **Usage:** ``` From 4f28014b00f5ea8d746685e94b9415e4ba4023db Mon Sep 17 00:00:00 2001 From: joellabes Date: Mon, 31 Jan 2022 07:45:58 +0000 Subject: [PATCH 14/17] Auto update table of contents --- README.md | 2 +- gh-md-toc | 361 ------------------------------------------------------ 2 files changed, 1 insertion(+), 362 deletions(-) delete mode 100755 gh-md-toc diff --git a/README.md b/README.md index 432b9218..bcab82aa 100644 --- a/README.md +++ b/README.md @@ -69,7 +69,7 @@ For compatibility details between versions of dbt-core and dbt-utils, [see this * [Getting started with dbt](#getting-started-with-dbt) * [Code of Conduct](#code-of-conduct) - + diff --git a/gh-md-toc b/gh-md-toc deleted file mode 100755 index ac5e183f..00000000 --- a/gh-md-toc +++ /dev/null @@ -1,361 +0,0 @@ -#!/usr/bin/env bash - -# -# Steps: -# -# 1. Download corresponding html file for some README.md: -# curl -s $1 -# -# 2. Discard rows where no substring 'user-content-' (github's markup): -# awk '/user-content-/ { ... -# -# 3.1 Get last number in each row like ' ... 
sitemap.js.*<\/h/)+2, RLENGTH-5) -# -# 5. Find anchor and insert it inside "(...)": -# substr($0, match($0, "href=\"[^\"]+?\" ")+6, RLENGTH-8) -# - -gh_toc_version="0.8.0" - -gh_user_agent="gh-md-toc v$gh_toc_version" - -# -# Download rendered into html README.md by its url. -# -# -gh_toc_load() { - local gh_url=$1 - - if type curl &>/dev/null; then - curl --user-agent "$gh_user_agent" -s "$gh_url" - elif type wget &>/dev/null; then - wget --user-agent="$gh_user_agent" -qO- "$gh_url" - else - echo "Please, install 'curl' or 'wget' and try again." - exit 1 - fi -} - -# -# Converts local md file into html by GitHub -# -# -> curl -X POST --data '{"text": "Hello world github/linguist#1 **cool**, and #1!"}' https://api.github.com/markdown -#

Hello world github/linguist#1 cool, and #1!

'" -gh_toc_md2html() { - local gh_file_md=$1 - URL=https://api.github.com/markdown/raw - - if [ ! -z "$GH_TOC_TOKEN" ]; then - TOKEN=$GH_TOC_TOKEN - else - TOKEN_FILE="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)/token.txt" - if [ -f "$TOKEN_FILE" ]; then - TOKEN="$(cat $TOKEN_FILE)" - fi - fi - if [ ! -z "${TOKEN}" ]; then - AUTHORIZATION="Authorization: token ${TOKEN}" - fi - - # echo $URL 1>&2 - OUTPUT=$(curl -s \ - --user-agent "$gh_user_agent" \ - --data-binary @"$gh_file_md" \ - -H "Content-Type:text/plain" \ - -H "$AUTHORIZATION" \ - "$URL") - - if [ "$?" != "0" ]; then - echo "XXNetworkErrorXX" - fi - if [ "$(echo "${OUTPUT}" | awk '/API rate limit exceeded/')" != "" ]; then - echo "XXRateLimitXX" - else - echo "${OUTPUT}" - fi -} - - -# -# Is passed string url -# -gh_is_url() { - case $1 in - https* | http*) - echo "yes";; - *) - echo "no";; - esac -} - -# -# TOC generator -# -gh_toc(){ - local gh_src=$1 - local gh_src_copy=$1 - local gh_ttl_docs=$2 - local need_replace=$3 - local no_backup=$4 - local no_footer=$5 - - if [ "$gh_src" = "" ]; then - echo "Please, enter URL or local path for a README.md" - exit 1 - fi - - - # Show "TOC" string only if working with one document - if [ "$gh_ttl_docs" = "1" ]; then - - echo "Table of Contents" - echo "=================" - echo "" - gh_src_copy="" - - fi - - if [ "$(gh_is_url "$gh_src")" == "yes" ]; then - gh_toc_load "$gh_src" | gh_toc_grab "$gh_src_copy" - if [ "${PIPESTATUS[0]}" != "0" ]; then - echo "Could not load remote document." - echo "Please check your url or network connectivity" - exit 1 - fi - if [ "$need_replace" = "yes" ]; then - echo - echo "!! '$gh_src' is not a local file" - echo "!! Can't insert the TOC into it." 
- echo - fi - else - local rawhtml=$(gh_toc_md2html "$gh_src") - if [ "$rawhtml" == "XXNetworkErrorXX" ]; then - echo "Parsing local markdown file requires access to github API" - echo "Please make sure curl is installed and check your network connectivity" - exit 1 - fi - if [ "$rawhtml" == "XXRateLimitXX" ]; then - echo "Parsing local markdown file requires access to github API" - echo "Error: You exceeded the hourly limit. See: https://developer.github.com/v3/#rate-limiting" - TOKEN_FILE="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)/token.txt" - echo "or place GitHub auth token here: ${TOKEN_FILE}" - exit 1 - fi - local toc=`echo "$rawhtml" | gh_toc_grab "$gh_src_copy"` - echo "$toc" - if [ "$need_replace" = "yes" ]; then - if grep -Fxq "" $gh_src && grep -Fxq "" $gh_src; then - echo "Found markers" - else - echo "You don't have or in your file...exiting" - exit 1 - fi - local ts="<\!--ts-->" - local te="<\!--te-->" - local dt=`date +'%F_%H%M%S'` - local ext=".orig.${dt}" - local toc_path="${gh_src}.toc.${dt}" - local toc_footer="" - # http://fahdshariff.blogspot.ru/2012/12/sed-mutli-line-replacement-between-two.html - # clear old TOC - sed -i${ext} "/${ts}/,/${te}/{//!d;}" "$gh_src" - # create toc file - echo "${toc}" > "${toc_path}" - if [ "${no_footer}" != "yes" ]; then - echo -e "\n${toc_footer}\n" >> "$toc_path" - fi - - # insert toc file - if [[ "`uname`" == "Darwin" ]]; then - sed -i "" "/${ts}/r ${toc_path}" "$gh_src" - else - sed -i "/${ts}/r ${toc_path}" "$gh_src" - fi - echo - if [ "${no_backup}" = "yes" ]; then - rm ${toc_path} ${gh_src}${ext} - fi - echo "!! TOC was added into: '$gh_src'" - if [ -z "${no_backup}" ]; then - echo "!! Origin version of the file: '${gh_src}${ext}'" - echo "!! TOC added into a separate file: '${toc_path}'" - fi - echo - fi - fi -} - -# -# Grabber of the TOC from rendered html -# -# $1 - a source url of document. -# It's need if TOC is generated for multiple documents. 
-# -gh_toc_grab() { - common_awk_script=' - modified_href = "" - split(href, chars, "") - for (i=1;i <= length(href); i++) { - c = chars[i] - res = "" - if (c == "+") { - res = " " - } else { - if (c == "%") { - res = "\\x" - } else { - res = c "" - } - } - modified_href = modified_href res - } - print sprintf("%*s", (level-1)*3, "") "* [" text "](" gh_url modified_href ")" - ' - if [ `uname -s` == "OS/390" ]; then - grepcmd="pcregrep -o" - echoargs="" - awkscript='{ - level = substr($0, length($0), 1) - text = substr($0, match($0, /a>.*<\/h/)+2, RLENGTH-5) - href = substr($0, match($0, "href=\"([^\"]+)?\"")+6, RLENGTH-7) - '"$common_awk_script"' - }' - else - grepcmd="grep -Eo" - echoargs="-e" - awkscript='{ - level = substr($0, length($0), 1) - text = substr($0, match($0, /a>.*<\/h/)+2, RLENGTH-5) - href = substr($0, match($0, "href=\"[^\"]+?\"")+6, RLENGTH-7) - '"$common_awk_script"' - }' - fi - href_regex='href=\"[^\"]+?\"' - - # if closed is on the new line, then move it on the prev line - # for example: - # was: The command foo1 - # - # became: The command foo1 - sed -e ':a' -e 'N' -e '$!ba' -e 's/\n<\/h/<\/h/g' | - - # find strings that corresponds to template - $grepcmd '//g' | sed 's/<\/code>//g' | - - # remove g-emoji - sed 's/]*[^<]*<\/g-emoji> //g' | - - # now all rows are like: - # ... / placeholders" - echo " $app_name - Create TOC for markdown from STDIN" - echo " $app_name --help Show help" - echo " $app_name --version Show version" - return - fi - - if [ "$1" = '--version' ]; then - echo "$gh_toc_version" - echo - echo "os: `lsb_release -d | cut -f 2`" - echo "kernel: `cat /proc/version`" - echo "shell: `$SHELL --version`" - echo - for tool in curl wget grep awk sed; do - printf "%-5s: " $tool - echo `$tool --version | head -n 1` - done - return - fi - - if [ "$1" = "-" ]; then - if [ -z "$TMPDIR" ]; then - TMPDIR="/tmp" - elif [ -n "$TMPDIR" -a ! 
-d "$TMPDIR" ]; then - mkdir -p "$TMPDIR" - fi - local gh_tmp_md - if [ `uname -s` == "OS/390" ]; then - local timestamp=$(date +%m%d%Y%H%M%S) - gh_tmp_md="$TMPDIR/tmp.$timestamp" - else - gh_tmp_md=$(mktemp $TMPDIR/tmp.XXXXXX) - fi - while read input; do - echo "$input" >> "$gh_tmp_md" - done - gh_toc_md2html "$gh_tmp_md" | gh_toc_grab "" - return - fi - - if [ "$1" = '--insert' ]; then - need_replace="yes" - shift - fi - - if [ "$1" = '--no-backup' ]; then - need_replace="yes" - no_backup="yes" - shift - fi - - if [ "$1" = '--hide-footer' ]; then - need_replace="yes" - no_footer="yes" - shift - fi - - for md in "$@" - do - echo "" - gh_toc "$md" "$#" "$need_replace" "$no_backup" "$no_footer" - done - - echo "" - echo "Created by [gh-md-toc](https://github.com/ekalinin/github-markdown-toc)" -} - -# -# Entry point -# -gh_toc_app "$@" - From 0de7a14079e21022b5971da8e020086eed556c85 Mon Sep 17 00:00:00 2001 From: Joel Labes Date: Mon, 31 Jan 2022 21:21:38 +1300 Subject: [PATCH 15/17] Update README.md --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index bcab82aa..19ffe579 100644 --- a/README.md +++ b/README.md @@ -7,6 +7,7 @@ Check [dbt Hub](https://hub.getdbt.com/dbt-labs/dbt_utils/latest/) for the lates For compatibility details between versions of dbt-core and dbt-utils, [see this spreadsheet](https://docs.google.com/spreadsheets/d/1RoDdC69auAtrwiqmkRsgcFdZ3MdNpeKcJrWkmEpXVIs/edit#gid=0). 
---- + * [Installation Instructions](#installation-instructions) * [Compatibility matrix](#compatibility-matrix) From ece950e7f6399108771fdcc0adeea014dc5688f3 Mon Sep 17 00:00:00 2001 From: joellabes Date: Mon, 31 Jan 2022 08:21:54 +0000 Subject: [PATCH 16/17] Auto update table of contents --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 19ffe579..eea03b1a 100644 --- a/README.md +++ b/README.md @@ -70,7 +70,7 @@ For compatibility details between versions of dbt-core and dbt-utils, [see this * [Getting started with dbt](#getting-started-with-dbt) * [Code of Conduct](#code-of-conduct) - + From 80bb1885324795fe0435e419f92486cd1d1169df Mon Sep 17 00:00:00 2001 From: joellabes Date: Thu, 9 Feb 2023 00:20:44 +0000 Subject: [PATCH 17/17] Auto update table of contents --- README.md | 61 +++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 61 insertions(+) diff --git a/README.md b/README.md index 142c4d4a..e9d6e7e9 100644 --- a/README.md +++ b/README.md @@ -8,6 +8,67 @@ Check [dbt Hub](https://hub.getdbt.com/dbt-labs/dbt_utils/latest/) for the lates + * [Installation Instructions](#installation-instructions) + * [Generic Tests](#generic-tests) + * [equal_rowcount (source)](#equal_rowcount-source) + * [fewer_rows_than (source)](#fewer_rows_than-source) + * [equality (source)](#equality-source) + * [expression_is_true (source)](#expression_is_true-source) + * [recency (source)](#recency-source) + * [at_least_one (source)](#at_least_one-source) + * [not_constant (source)](#not_constant-source) + * [not_empty_string (source)](#not_empty_string-source) + * [cardinality_equality (source)](#cardinality_equality-source) + * [not_null_proportion (source)](#not_null_proportion-source) + * [not_accepted_values (source)](#not_accepted_values-source) + * [relationships_where (source)](#relationships_where-source) + * [mutually_exclusive_ranges (source)](#mutually_exclusive_ranges-source) + * [sequential_values 
(source)](#sequential_values-source) + * [unique_combination_of_columns (source)](#unique_combination_of_columns-source) + * [accepted_range (source)](#accepted_range-source) + * [Grouping in tests](#grouping-in-tests) + * [Macros](#macros) + * [Introspective macros](#introspective-macros) + * [get_column_values (source)](#get_column_values-source) + * [get_filtered_columns_in_relation (source)](#get_filtered_columns_in_relation-source) + * [get_relations_by_pattern (source)](#get_relations_by_pattern-source) + * [get_relations_by_prefix (source)](#get_relations_by_prefix-source) + * [get_query_results_as_dict (source)](#get_query_results_as_dict-source) + * [get_single_value (source)](#get_single_value-source) + * [SQL generators](#sql-generators) + * [date_spine (source)](#date_spine-source) + * [deduplicate (source)](#deduplicate-source) + * [haversine_distance (source)](#haversine_distance-source) + * [group_by (source)](#group_by-source) + * [star (source)](#star-source) + * [union_relations (source)](#union_relations-source) + * [generate_series (source)](#generate_series-source) + * [generate_surrogate_key (source)](#generate_surrogate_key-source) + * [safe_add (source)](#safe_add-source) + * [safe_divide (source)](#safe_divide-source) + * [safe_subtract (source)](#safe_subtract-source) + * [pivot (source)](#pivot-source) + * [unpivot (source)](#unpivot-source) + * [width_bucket (source)](#width_bucket-source) + * [Web macros](#web-macros) + * [get_url_parameter (source)](#get_url_parameter-source) + * [get_url_host (source)](#get_url_host-source) + * [get_url_path (source)](#get_url_path-source) + * [Cross-database macros](#cross-database-macros) + * [Jinja Helpers](#jinja-helpers) + * [pretty_time (source)](#pretty_time-source) + * [pretty_log_format (source)](#pretty_log_format-source) + * [log_info (source)](#log_info-source) + * [slugify (source)](#slugify-source) + * [Materializations](#materializations) + * [insert_by_period](#insert_by_period) + * 
[Reporting bugs and contributing code](#reporting-bugs-and-contributing-code) + * [Dispatch macros](#dispatch-macros) + * [Getting started with dbt](#getting-started-with-dbt) + * [Code of Conduct](#code-of-conduct) + + + ----