Commit
add break change & bump version
jczhong84 committed Nov 28, 2023
1 parent db4ae1e commit 62f7cc4
Showing 4 changed files with 32 additions and 21 deletions.
9 changes: 9 additions & 0 deletions docs_website/docs/changelog/breaking_change.md
@@ -7,12 +7,21 @@ slug: /changelog

Here is the list of breaking changes you should be aware of when updating Querybook:

## v3.29.0

The following changes were made to `S3BaseExporter` (the CSV table uploader feature):

- Both `s3_path` and `use_schema_location` are now optional
- If neither is provided, or `use_schema_location=False`, the table will be created as a managed table, whose location will be determined by the query engine.
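Under the new semantics, the upload mode can be resolved from the exporter config roughly as follows (a minimal sketch; the function name and return strings are illustrative, not Querybook internals):

```python
def resolve_upload_mode(exporter_config: dict) -> str:
    """Illustrate the three cases described in the changelog entry above."""
    if "s3_path" in exporter_config:
        return "external: rooted at the configured s3_path"
    if exporter_config.get("use_schema_location"):
        return "external: root inferred from the HMS schema locationUri"
    # neither key provided, or use_schema_location=False
    return "managed: location chosen by the query engine"

print(resolve_upload_mode({}))  # prints the managed-table case
```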

## v3.27.0

Updated the properties of the `QueryValidationResult` object: `line` and `ch` were replaced with `start_line` and `start_ch`, respectively.
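For code consuming validation results, the rename is mechanical. A hedged before/after sketch (the shim class and handler below are hypothetical stand-ins, not Querybook's actual classes):

```python
class ValidationResultShim:
    """Hypothetical stand-in for QueryValidationResult."""

    def __init__(self, start_line: int, start_ch: int):
        self.start_line = start_line  # was `line` before v3.27.0
        self.start_ch = start_ch      # was `ch` before v3.27.0


def format_position(result) -> str:
    # before v3.27.0 this would have read result.line / result.ch
    return f"line {result.start_line}, ch {result.start_ch}"


print(format_position(ValidationResultShim(3, 14)))  # line 3, ch 14
```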

## v3.22.0

Updated the charset of the `data_element` table to `utf8mb4`. If your MySQL database's default charset is not utf8, run the SQL below to update it if needed.

```sql
ALTER TABLE data_element CONVERT TO CHARACTER SET utf8mb4
```
8 changes: 4 additions & 4 deletions docs_website/docs/integrations/add_table_upload.md
@@ -42,17 +42,17 @@ Included by default: No

Available options:

Either s3_path or use_schema_location must be supplied.

- s3_path (str): if supplied, will use it as the root path for upload. Must be the full s3 path like s3://bucket/key, the trailing / is optional.
- use_schema_location (boolean):
- s3_path (str, optional): if supplied, will use it as the root path for upload. Must be the full s3 path like s3://bucket/key, the trailing / is optional.
- use_schema_location (boolean, optional):
if true, the upload root path is inferred from the locationUri specified by the schema/database in HMS. To use this option, the engine must be connected to a metastore that uses
HMSMetastoreLoader (or a derived class).
if false, the table will be created as a managed table, whose location is determined automatically by the query engine.
- table_properties (List[str]): list of table properties passed; these must be query-engine specific.
Check out the SparkSQL examples here: https://spark.apache.org/docs/latest/sql-ref-syntax-ddl-create-table-hiveformat.html#examples
For Trino/Presto, it would be the WITH clause: https://trino.io/docs/current/sql/create-table.html

If neither s3_path nor use_schema_location is supplied, it is treated the same as `use_schema_location=False`, and the table will be created as a managed table.
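Putting the options together, the three configurations might look like this (bucket names and the table property value are placeholders for illustration, not recommendations):

```python
# External table rooted at an explicit path (trailing "/" optional):
explicit_root = {"s3_path": "s3://my-bucket/table_uploads"}

# External table rooted at the schema's locationUri from HMS:
from_schema = {"use_schema_location": True}

# Managed table; the query engine decides the location:
managed = {}  # equivalent to {"use_schema_location": False}

# table_properties is engine specific, e.g. a SparkSQL TBLPROPERTIES entry:
with_props = {
    "s3_path": "s3://my-bucket/table_uploads",
    "table_properties": ["'classification'='csv'"],
}
```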

### S3 Parquet exporter

This would upload a Parquet file instead of a CSV file. In addition to dependencies such as boto3, `pyarrow` must also be installed.
30 changes: 14 additions & 16 deletions querybook/server/lib/table_upload/exporter/s3_exporter.py
@@ -36,17 +36,13 @@
- table_properties (List[str]): list of table properties passed; these must be query-engine specific.
Check out the SparkSQL examples here: https://spark.apache.org/docs/latest/sql-ref-syntax-ddl-create-table-hiveformat.html#examples
For Trino/Presto, it would be the WITH clause: https://trino.io/docs/current/sql/create-table.html
If neither s3_path nor use_schema_location is provided, it is treated the same as `use_schema_location=False`,
and the table will be created as a managed table.
"""


class S3BaseExporter(BaseTableUploadExporter):
def __init__(self, exporter_config: dict = {}):
if ("s3_path" not in exporter_config) and (
"use_schema_location" not in exporter_config
):
raise Exception("Either s3_path or use_schema_location must be specified")
super().__init__(exporter_config)

@abstractmethod
def UPLOAD_FILE_TYPE(cls) -> str:
"""Override this to specify what kind of file is getting uploaded
@@ -85,7 +81,7 @@ def destination_s3_folder(self, session=None) -> str:
metastore = get_metastore_loader(query_engine.metastore_id, session=session)

if metastore is None:
raise Exception("Invalid metastore")
raise Exception("Invalid metastore for table upload")

if self._exporter_config.get("use_schema_location", False):
schema_location_uri = metastore.get_schema_location(schema_name)
@@ -96,12 +92,12 @@

# Use its actual location for managed tables
table_location = metastore.get_table_location(schema_name, table_name)
if table_location:
return sanitize_s3_url(table_location)

raise Exception(
"Can't get the table location from the metastore. Please make sure the query engine supports managed tables with a default location."
)
if not table_location:
raise Exception(
"Can't get the table location from the metastore. Please make sure the query engine supports managed tables with a default location."
)
return sanitize_s3_url(table_location)

@with_session
def _handle_if_table_exists(self, session=None):
@@ -125,14 +121,16 @@ def _handle_if_table_exists(self, session=None):
def _get_table_create_query(self, session=None) -> str:
query_engine = get_query_engine_by_id(self._engine_id, session=session)
schema_name, table_name = self._fq_table_name
is_managed = self._exporter_config.get("use_schema_location") is False
is_external = "s3_path" in self._exporter_config or self._exporter_config.get(
"use_schema_location"
)
return get_create_table_statement(
language=query_engine.language,
table_name=table_name,
schema_name=schema_name,
column_name_types=self._table_config["column_name_types"],
# table location is not needed for managed (non external) table creation
file_location=None if is_managed else self.destination_s3_folder(),
# table location is only needed for external (non managed) table creation
file_location=self.destination_s3_folder() if is_external else None,
file_format=self.UPLOAD_FILE_TYPE(),
table_properties=self._exporter_config.get("table_properties", []),
)
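The new `is_external` condition can be isolated and checked on its own. This is a sketch mirroring the committed logic, not the actual class (it wraps the result in `bool` for a clean predicate):

```python
def is_external(exporter_config: dict) -> bool:
    # A table is external if an explicit s3_path is given or
    # use_schema_location is truthy; otherwise it is managed and the
    # engine chooses the location (file_location=None in the create query).
    return "s3_path" in exporter_config or bool(
        exporter_config.get("use_schema_location")
    )

assert is_external({"s3_path": "s3://bucket/key"})
assert is_external({"use_schema_location": True})
assert not is_external({})
assert not is_external({"use_schema_location": False})
```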
@@ -78,9 +78,13 @@ export const TableUploaderForm: React.FC<ITableUploaderFormProps> = ({
});

// sometimes there will be sync delay between the metastore and querybook
// skip the redirect if the table has not been synced over.
// skip the redirection if the table has not been synced over.
if (tableId) {
navigateWithinEnv(`/table/${tableId}`);
} else {
toast(
'Waiting for the table to be synced over from the metastore.'
);
}
onHide();
},
