Adding clarity to location_root
matthewshaver committed Mar 3, 2025
1 parent 56794f7 commit 33191f6
Showing 1 changed file with 3 additions and 1 deletion.
website/docs/reference/resource-configs/spark-configs.md (4 changes: 3 additions & 1 deletion)
@@ -22,11 +22,13 @@ When materializing a model as `table`, you may include several optional configs
| Option | Description | Required? | Example |
|---------|------------------------------------------------------------------------------------------------------------------------------------|-------------------------|--------------------------|
| file_format | The file format to use when creating tables (`parquet`, `delta`, `iceberg`, `hudi`, `csv`, `json`, `text`, `jdbc`, `orc`, `hive` or `libsvm`). | Optional | `parquet`|
-| location_root | The created table uses the specified directory to store its data. The table alias is appended to it. | Optional | `/mnt/root` |
+| location_root [^1] | The created table uses the specified directory to store its data. The table alias is appended to it. | Optional | `/mnt/root` |
| partition_by | Partition the created table by the specified columns. A directory is created for each partition. | Optional | `date_day` |
| clustered_by | Each partition in the created table will be split into a fixed number of buckets by the specified columns. | Optional | `country_code` |
| buckets | The number of buckets to create while clustering | Required if `clustered_by` is specified | `8` |

+[^1]: If `location_root` is configured, dbt specifies a location path in the `create table` statement. The table created changes from being "managed" to being "external" in Spark/Databricks.
+
## Incremental models

dbt seeks to offer useful, intuitive modeling abstractions by means of its built-in configurations and <Term id="materialization">materializations</Term>. Because there is so much variance between Apache Spark clusters out in the world—not to mention the powerful features offered to Databricks users by the Delta file format and custom runtime—making sense of all the available options is an undertaking in its own right.
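To make the footnote concrete, here is a minimal sketch of a model that combines the configs from the table above. The model body, the `stg_events` ref, and the column names are illustrative; `/mnt/root` follows the example in the table:

```sql
{{
    config(
        materialized='table',
        file_format='parquet',
        location_root='/mnt/root',
        partition_by='date_day',
        clustered_by='country_code',
        buckets=8
    )
}}

-- `stg_events` is a hypothetical upstream model
select
    date_day,
    country_code,
    count(*) as event_count
from {{ ref('stg_events') }}
group by date_day, country_code
```

Because `location_root` is set, dbt appends the model's alias to the configured directory and includes a location path in the `create table` statement, so Spark/Databricks treats the resulting table as external rather than managed, as the new footnote describes.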
