Releases: aliyun/dbt-maxcompute

dbt-maxcompute v1.9.0-a1

15 Jan 09:25

New Features

Raw Materialization Enhancements

  • Added new configuration options:

    • schema: Specifies the default schema for the current query
    • sql_hints: Allows users to specify SQL hints for the current query

    Note: For other materialization methods, the schema can only be specified in the dbt.yaml file.

Example Usage:

{{ config(
    materialized="raw",
    schema="new_schema",
    sql_hints={"odps.sql.submit.mode": "script"}
) }}

create schema new_schema;
create table table_in_new_schema(c1 bigint);

Bug Fixes

  • Fixed logic errors in the bq_insert_overwrite incremental materialization strategy and improved its execution efficiency.

v1.9.0-a0

03 Jan 03:41
60faed5

dbt-maxcompute 1.9.0a0

New Features

  • Compatibility with dbt-core 1.9.x: This version has been adapted to work seamlessly with dbt-core version 1.9.x.

  • New Incremental Materialization Strategies:

    • microbatch
    • bq_insert_overwrite
    • bq_microbatch
  • New Materialization Method:

    • raw: A new materialization method that supports manual SQL execution by users.

Note

Because the incremental materialization methods offered by dbt-bigquery are distinct, the macro provided in dbt-adapters has been deeply customized to reduce the cost of using incremental materialization on partitioned tables. As a result, bq_insert_overwrite and bq_microbatch have been added to mimic the materialization methods found in dbt-bigquery.

Key Differences

  • insert_overwrite: This method removes duplicates based on specified unique_keys.
  • bq_insert_overwrite: This method removes duplicates based on the partition_by condition and allows for specifying which partitions to work with.
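
The difference between the two strategies can be illustrated with a small in-memory sketch. This is not the adapter's implementation: the function names, the row layout, and the choice to insert every incoming row in the partition-based variant are assumptions made for illustration only.

```python
from datetime import date


def insert_overwrite_by_key(target, new_rows, unique_key):
    """Keep one row per unique_key value; rows from new_rows win."""
    merged = {row[unique_key]: row for row in target}
    merged.update({row[unique_key]: row for row in new_rows})
    return list(merged.values())


def insert_overwrite_by_partition(target, new_rows, partition_col, partitions=None):
    """Replace whole partitions (hypothetical model of bq_insert_overwrite).

    Static mode: `partitions` lists the partitions to replace.
    Dynamic mode: the partitions are taken from the incoming rows.
    """
    if partitions is None:
        partitions = {row[partition_col] for row in new_rows}
    else:
        partitions = set(partitions)
    # Drop every existing row in the affected partitions, then add the new rows.
    kept = [row for row in target if row[partition_col] not in partitions]
    return kept + list(new_rows)


target = [
    {"id": 1, "some_date": date(2024, 10, 10), "v": "old"},
    {"id": 2, "some_date": date(2024, 10, 10), "v": "old"},
    {"id": 3, "some_date": date(2024, 10, 11), "v": "old"},
]
new_rows = [{"id": 1, "some_date": date(2024, 10, 10), "v": "new"}]

by_key = insert_overwrite_by_key(target, new_rows, "id")
by_partition = insert_overwrite_by_partition(target, new_rows, "some_date")

print(sorted(r["id"] for r in by_key))        # ids 1, 2, 3 remain; id 1 is updated
print(sorted(r["id"] for r in by_partition))  # id 2 is gone: its partition was replaced
```

In this sketch the key-based variant touches only the rows whose key reappears, while the partition-based variant discards everything in the 2024-10-10 partition, including the row with id 2 that the new data never mentions.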

Usage Examples

bq_insert_overwrite (Static)

{{ config(
    materialized='incremental',
    partition_by={"fields": "some_date", "data_types": "timestamp"},
    partitions=["TIMESTAMP'2024-10-10 00:00:00'", "TIMESTAMP'2025-01-01 00:00:00'"],
    incremental_strategy='bq_insert_overwrite'
) }}
select * from {{ source('raw', 'seed') }}
{% if is_incremental() %}
   where {{ dbt.datediff("some_date", "TIMESTAMP'2000-01-01 00:00:00'", 'day') }} > 0
{% endif %}

bq_insert_overwrite (Dynamic)

{{ config(
    materialized='incremental',
    partition_by={"fields": "some_date", "data_types": "timestamp"},
    incremental_strategy='bq_insert_overwrite'
) }}
select * from {{ source('raw', 'seed') }}
{% if is_incremental() %}
   where {{ dbt.datediff("some_date", "TIMESTAMP'2000-01-01 00:00:00'", 'day') }} > 0
{% endif %}

Raw Materialization Example

{{ config(
    materialized='raw'
) }}
create table test(c1 bigint) lifecycle 1;

This release extends dbt-maxcompute with new incremental strategies and a raw materialization method, giving users more control over how their data transformations are executed.

dbt-maxcompute v1.8.0-a13

25 Dec 07:54

New Features

1. Support for Creating Ordinary Partitioned Tables via partition_by

Users can now use partition_by to create ordinary partitioned tables for better data management and querying.

2. Support for Creating Materialized Views

Support for materialized views has been introduced, enabling users to query and manage data more efficiently.

Usage

Creating an Ordinary Partition Table

To create an ordinary partition table, you can use the following configuration:

{{ config(
    materialized='table',
    partition_by={"fields": "name,some_date", "data_types": "string,string"}
) }}
select id, name, some_date from {{ source('raw', 'seed') }}

The materialized table has a schema like the following:

create table model(id bigint) partitioned by(name string, some_date string)

Creating an Auto-Partitioned Table

Here is an example configuration for creating an auto-partitioned table:

{{ config(
  materialized='table',
  partition_by={"fields": "some_date", "data_types": "timestamp", "granularity": "day"}
) }}
select id, name, some_date from {{ source('raw', 'seed') }}

The materialized table has a schema like the following:

create table model(id bigint, name string, some_date timestamp) auto partitioned by trunc_time(some_date, "day");
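
To make the effect of "granularity": "day" concrete, the sketch below groups rows by their timestamp truncated to the day, which is the idea behind partitioning on trunc_time(some_date, "day"). It only models the grouping in plain Python; the helper names are hypothetical and nothing here reflects how MaxCompute stores partitions internally.

```python
from collections import defaultdict
from datetime import datetime


def trunc_to_day(ts):
    """Truncate a timestamp to midnight of the same day."""
    return ts.replace(hour=0, minute=0, second=0, microsecond=0)


def group_by_day(rows, column="some_date"):
    """Group rows by the day-truncated value of `column`."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[trunc_to_day(row[column])].append(row)
    return dict(partitions)


rows = [
    {"id": 1, "some_date": datetime(2024, 12, 25, 7, 54)},
    {"id": 2, "some_date": datetime(2024, 12, 25, 23, 59)},
    {"id": 3, "some_date": datetime(2024, 12, 26, 0, 0)},
]

partitions = group_by_day(rows)
for day, members in sorted(partitions.items()):
    print(day.date(), [r["id"] for r in members])
# 2024-12-25 [1, 2]
# 2024-12-26 [3]
```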

Creating a Materialized View

The configuration for creating a materialized view is as follows:

{{ config(
    materialized='materialized_view',
    lifecycle=1,
    build_deferred=True,
    columns=["id", "name", "some_date"],
    column_comment={"id": "this is id.", "name": "this is name."},
    disable_rewrite=True,
    table_comment="this is a materialized view.",
    tblProperties={"compressionstrategy":"normal"}
) }}
select * from {{ source('raw', 'seed') }}

For the meanings of each field, please refer to the Alibaba Cloud MaxCompute User Guide.

Limitations

  1. The current version does not support specifying partitions when refreshing materialized views.
  2. Materialized views do not support the rename operation.
  3. It is not possible to switch to another materialization method without dropping the materialized table.

dbt-maxcompute v1.8.0-a12

20 Dec 05:46

New Features

  • Support for specifying partition_by during table materialization to automatically create auto-partitioned tables.
  • Introduced a new date_spine macro for generating date dimension tables.
  • Modified the dateadd macro to add support for hour and week partitions.
  • Refactored the datediff macro to simplify it and support more date intervals.
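
As an illustration of what a date spine is, the following sketch generates one date per interval between two bounds. It models the concept only: the function name, the day-based step, and the exclusive end bound are assumptions, and the macro's actual arguments are not described in this release note.

```python
from datetime import date, timedelta


def date_spine(start, end, step_days=1):
    """Yield dates from start (inclusive) to end (exclusive), step_days apart."""
    if step_days <= 0:
        raise ValueError("step_days must be positive")
    current = start
    while current < end:
        yield current
        current += timedelta(days=step_days)


daily = list(date_spine(date(2024, 12, 1), date(2024, 12, 8)))
weekly = list(date_spine(date(2024, 12, 1), date(2024, 12, 29), step_days=7))

print(len(daily), daily[0], daily[-1])  # 7 2024-12-01 2024-12-07
print(weekly)                           # the four Sundays of December 2024
```

A table built this way has one row per day (or week) with no gaps, which is what makes it usable as a date dimension to join sparse fact data against.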

Full Changelog: v1.8.0-a11...v1.8.0-a12

dbt-maxcompute v1.8.0-a11

10 Dec 10:40
Pre-release

What's Changed

  • feat: add support for the insert_overwrite strategy and for creating auto-partitioned tables by @dingxin-tech in #4

Full Changelog: v1.8.0-a10...v1.8.0-a11

dbt-maxcompute v1.8.0-a10

04 Dec 11:06
Pre-release

Version: v1.8.0-a10
Release Date: 2024.12.04

New Features

  • Macros Related to apply_grants.sql:

    • Added macros that support MaxCompute permission operations, enabling flexible management of data permissions through SQL statements that grant and revoke permissions for specific users or groups.
  • create_or_replace_clone:

    • Implemented a new macro that supports the dbt clone operation, allowing users to quickly create modified or variant models from existing ones.
  • persist_doc:

    • Introduced a new macro that supports persisting descriptions into a comments table in the database, so that the relevant documentation is stored permanently for future reference.
  • query_header:

    • Added support for query_header, enabling users to include custom header information during SQL query execution for additional context and improved debugging.
  • Various Bug Fixes:

    • Addressed multiple known issues to enhance system stability and reliability, including error handling improvements and performance optimizations.

dbt-maxcompute v1.8.0-a9

28 Nov 03:02
Pre-release

Version: v1.8.0-a9
Release Date: 2024.11.28

New Features

  • Enhanced Type System:

    • Added several MaxCompute type aliases:
      • "INTEGER" -> "INT"
      • "BOOL" -> "BOOLEAN"
      • "NUMERIC" -> "DECIMAL"
      • "REAL" -> "FLOAT"
    • The NUMERIC type is now converted to the DECIMAL type in MaxCompute.
    • Fixed implementations of dbt's is_string, is_integer, and is_numeric methods.
  • Updated Constraint Support Matrix:

    • check: Not supported
    • unique: Not supported
    • primary_key: Not supported
    • foreign_key: Not supported
    • not_null: Supported and enforced
  • New Incremental Strategy Support: Now supports the following three incremental strategies:

    • append
    • merge (requires unique key)
    • delete + insert (requires unique key)
    • Note: When a unique key is specified, the materialized table will be created as a transactional table.
  • Using DataFrame API for dbt seed Table Operations: Added a multiple retry mechanism.

  • Enhanced create_table_as Macro: Supports constraint handling and can create Delta Tables when primary_keys and delta_table_bucket_num are specified in the configuration.

  • Fixed Catalog Creation Accuracy:

    • Column indexing changed from starting at 0 to starting at 1.
    • Ability to recognize view and table types.
  • Fixed Logic Issue in validate_sql.
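
The three incremental strategies listed above differ in how incoming rows are combined with the existing table. The sketch below models them on in-memory lists of dicts. It is a conceptual illustration with hypothetical function names, not the SQL the adapter generates.

```python
def append(target, new_rows):
    """append: add the new rows without looking at existing ones."""
    return target + new_rows


def merge(target, new_rows, unique_key):
    """merge: update rows whose key already exists, insert the rest."""
    incoming = {row[unique_key]: row for row in new_rows}
    existing_keys = {row[unique_key] for row in target}
    updated = [incoming.get(row[unique_key], row) for row in target]
    inserted = [row for key, row in incoming.items() if key not in existing_keys]
    return updated + inserted


def delete_insert(target, new_rows, unique_key):
    """delete + insert: delete rows whose key reappears, then insert all new rows."""
    keys = {row[unique_key] for row in new_rows}
    kept = [row for row in target if row[unique_key] not in keys]
    return kept + new_rows


target = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}]
new_rows = [{"id": 2, "v": "B"}, {"id": 3, "v": "c"}]

print(append(target, new_rows))               # four rows, id 2 appears twice
print(merge(target, new_rows, "id"))          # three rows, id 2 updated in place
print(delete_insert(target, new_rows, "id"))  # three rows, id 2 replaced
```

With unique keys in the incoming data, merge and delete + insert end up with the same rows. Both need to modify existing records, which is consistent with the note that a transactional table is created when a unique key is specified.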

This version has been validated and passed a total of 58 tests.
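
For the type aliases introduced in this release, a minimal sketch of the mapping is shown below. Only the four alias pairs come from the release note. The normalize_type helper, the upper-casing of the result, and the handling of a precision/scale suffix are assumptions for illustration.

```python
# Alias pairs as listed in the v1.8.0-a9 release note.
TYPE_ALIASES = {
    "INTEGER": "INT",
    "BOOL": "BOOLEAN",
    "NUMERIC": "DECIMAL",
    "REAL": "FLOAT",
}


def normalize_type(type_name):
    """Map an alias to its MaxCompute type, keeping any "(precision, scale)" suffix."""
    base, paren, params = type_name.strip().partition("(")
    base = base.strip().upper()
    return TYPE_ALIASES.get(base, base) + paren + params


print(normalize_type("integer"))         # INT
print(normalize_type("numeric(10, 2)"))  # DECIMAL(10, 2)
print(normalize_type("string"))          # STRING (not an alias, passed through)
```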

dbt-maxcompute v1.8.0-a8

21 Nov 09:04
Pre-release

Version: v1.8.0-a8
Release Date: 2024.11.21

New Features

  • Added support for the following SQL syntax, along with 37 corresponding tests:
    • validate_sql
    • any_value
    • array_append
    • array_concat
    • array_construct
    • bool_or
    • cast
    • cast_bool_to_text
    • concat
    • date
    • get_intervals_between
    • date_spine
    • date_trunc
    • dateadd
    • datediff
    • escape_single_quotes
    • hash
    • last_day
    • listagg
    • position
    • right
    • split_part

Changes

  • All string types are now standardized to lowercase.
  • The timezone for date types is set to UTC.
  • The way dbt seed loads CSV files has changed from composing SQL statements to using the PyODPS DataFrame API, significantly improving the efficiency of loading tables from files.

We encourage users to download and try the new version, and we welcome feedback to help us continue improving!

dbt-maxcompute v1.8.0-a7

21 Nov 08:47
Pre-release

Release Version: v1.8.0-a7
Release Date: 2024.11.14

Summary:
We are pleased to announce the release of dbt-maxcompute version 1.8.0-a7, which represents the first planned version of the MC dbt connector. This release successfully passes 10 basic tests as defined by dbt, covering key user scenarios. Please note that the current version does not support specific features of MaxCompute, such as partitioning, clustering, or tiered storage capabilities.

Detailed Description

The Basic Tests included in this release encompass the following key scenarios:

  • Validation of dbt’s ability to create and transform data under simple materialization settings while ensuring consistency.
  • Verification of dbt's handling of simple SQL tests (both successful and failed) to confirm expected outcomes.
  • Testing of ephemeral models along with their related SQL testing files to ensure correct creation, execution, and validation.
  • Assessment of dbt's behavior when running with the --empty flag.
  • Validation of outputs from running ephemeral models, including table data, documentation directory, and manifest file results.
  • Confirmation of dbt's expected behavior when modifying data within incremental model tables.
  • Utilization of pytest's fixture functionality in dbt to validate operations such as seed, run, and test.
  • Testing of dbt snapshot functionality, verifying successful creation of snapshots using run_dbt(["snapshot"]), and checking the row counts against expectations using check_relation_rows.
  • Ensuring the correctness of snapshot functions under various data change scenarios.
  • Dynamic generation and execution of SQL files, along with validation of consistency in the actual data within MaxCompute.

Usage Limitations

Please be aware of the following limitations in this version:

  • Does not support the creation and use of MaxCompute partitioned tables, clustered tables, external tables, or Delta tables.
  • For data updates that modify existing records within a table, the tables involved must be transactional tables.
  • dbt-maxcompute introduces a new configuration option, transactional, for creating transactional tables, as shown below:
{{ config(
    materialized='table',
    transactional=true
) }}

select c_custkey as customer_id,
       c_name as customer_name,
       c_phone as customer_phone
from BIGDATA_PUBLIC_DATASET.tpch_10g.customer

We encourage you to try out the new features and provide feedback to help us improve the connector further!