Release v0.44.0 #2957

nfx · 2024-10-14T16:50:11Z

Added imbalanced-learn to known list (#2943). A new open-source library, "imbalanced-learn," has been added to the project's known list of libraries, providing various functionalities for handling imbalanced datasets. The addition includes modules such as "imblearn", "imblearn._config", "imblearn._min_dependencies", "imblearn._version", "imblearn.base", and many others, enabling features such as over-sampling, under-sampling, combining sampling techniques, and creating ensembles. This change partially resolves issue #1931, which may have been related to the handling of imbalanced datasets, thereby enhancing the project's ability to manage such datasets.
Added importlib_resources to known list (#2944). In this update, we've added the importlib_resources package to the known list in the known.json file. This package offers a consistent and straightforward interface for accessing resources such as data files and directories in Python packages. It includes several modules, including importlib_resources, importlib_resources._adapters, importlib_resources._common, importlib_resources._functional, importlib_resources._itertools, importlib_resources.abc, importlib_resources.compat, importlib_resources.compat.py38, importlib_resources.compat.py39, importlib_resources.future, importlib_resources.future.adapters, importlib_resources.readers, and importlib_resources.simple. These modules provide various functionalities for handling resources within a Python package. By adding this package to the known list, we enable its usage and integration with the project's codebase. This change partially addresses issue #1931, improving the management and accessibility of resources within our Python packages.
Dependency update: ensure we install with at least version 0.9.1 of databricks-labs-blueprint (#2950). In the updated pyproject.toml file, the version constraint for the databricks-labs-blueprint dependency has been revised to range between 0.9.1 and 0.10, specifically targeting 0.9.1 or higher. This modification ensures the incorporation of a fixed upstream issue (databrickslabs/blueprint#157), which was integrated in the 0.9.1 release. This adjustment was triggered by a preceding change (#2920) that standardized notebook paths, thereby addressing issue #2882, which was dependent on this upstream correction. By embracing this upgrade, users can engage the most recent dependency version, thereby ensuring the remediation of the aforementioned issue.
Fixed an issue with source table deleted after migration (#2927). In this release, we have addressed an issue where a source table was marked as migrated even after it was deleted following migration. An exception handling mechanism has been added to the is_migrated method to return True and log a warning message if the source table does not exist, indicating that it has been migrated. A new test function, test_migration_index_deleted_source, has also been included to verify the migration index behavior when the source table no longer exists. This function creates a source and destination table, sets the destination table's upgraded_from property to the source table, drops the source table, and checks if the migration index contains the source table and if an error message was recorded, indicating that the source table no longer exists. The get_seen_tables method remains unchanged in this diff.
Improve robustness of sqlglot failure handling (#2952). This PR introduces changes to improve the robustness of error handling in the sqlglot library, specifically targeting issues with inadequate parsing quality. The collect_table_infos method has been updated and renamed to collect_used_tables to accurately gather information about tables used in a SQL expression. The lint_expression and collect_tables methods have also been updated to use the new collect_used_tables method for better accuracy. Additionally, methods such as find_all, walk_expressions, and the test suite for the SQL parser have been enhanced to handle potential failures and unsupported SQL syntax more gracefully, by returning empty lists or logging warning messages instead of raising errors. These changes aim to improve the reliability and robustness of the sqlglot library, enabling it to handle unexpected input more effectively.
Log warnings when mounts are discovered on incorrect cluster type (#2929). The migrate-tables command in the ucx project's CLI now includes a verification step to ensure the successful completion of a prerequisite assessment workflow before execution. If this workflow has not been completed, a warning message is logged and the command is not executed. A new exception handling mechanism has been implemented for the dbutils.fs.mounts() method, which logs a warning and skips mount point discovery if an exception is raised. A new unit test has been added to verify that a warning is logged when attempting to discover mounts on an incompatible cluster type. The diff also includes a new method VerifyProgressTracking for verifying progress tracking and updates to existing test methods to include verification of successful runs and error handling before assessment. These changes improve the handling of edge cases in the mount point discovery process, add warnings for mounts on incorrect cluster types, and increase test coverage with progress tracking verification.
create-uber-principal fixes and improvements (#2941). This change introduces fixes and improvements to the create-uber-principal functionality within the databricks-sdk-py project, specifically targeting the Azure access module. The main enhancements include addressing an issue with the Databricks warehouses API by adding the set_workspace_warehouse_config_wrapper function, modifying the command to request the uber principal name only when necessary, improving storage account crawl logic, and introducing new methods to manage workspace-level configurations. Error handling mechanisms have been fortified through added and modified try-except blocks. Additionally, several unit and integration tests have been implemented and verified to ensure the functionality is correct and running smoothly. These changes improve the overall robustness and versatility of the create-uber-principal command, directly addressing issues #2764, #2771, and progressing on #2949.

* Added `imbalanced-learn` to known list ([#2943](#2943)). A new open-source library, "imbalanced-learn," has been added to the project's known list of libraries, providing various functionalities for handling imbalanced datasets. The addition includes modules such as "imblearn", "imblearn._config", "imblearn._min_dependencies", "imblearn._version", "imblearn.base", and many others, enabling features such as over-sampling, under-sampling, combining sampling techniques, and creating ensembles. This change partially resolves issue [#1931](#1931), which may have been related to the handling of imbalanced datasets, thereby enhancing the project's ability to manage such datasets. * Added `importlib_resources` to known list ([#2944](#2944)). In this update, we've added the `importlib_resources` package to the known list in the `known.json` file. This package offers a consistent and straightforward interface for accessing resources such as data files and directories in Python packages. It includes several modules, including `importlib_resources`, `importlib_resources._adapters`, `importlib_resources._common`, `importlib_resources._functional`, `importlib_resources._itertools`, `importlib_resources.abc`, `importlib_resources.compat`, `importlib_resources.compat.py38`, `importlib_resources.compat.py39`, `importlib_resources.future`, `importlib_resources.future.adapters`, `importlib_resources.readers`, and `importlib_resources.simple`. These modules provide various functionalities for handling resources within a Python package. By adding this package to the known list, we enable its usage and integration with the project's codebase. This change partially addresses issue [#1931](#1931), improving the management and accessibility of resources within our Python packages. * Dependency update: ensure we install with at least version 0.9.1 of `databricks-labs-blueprint` ([#2950](#2950)). In the updated `pyproject.toml` file, the version constraint for the `databricks-labs-blueprint` dependency has been revised to range between 0.9.1 and 0.10, specifically targeting 0.9.1 or higher. This modification ensures the incorporation of a fixed upstream issue (databrickslabs/blueprint[#157](#157)), which was integrated in the 0.9.1 release. This adjustment was triggered by a preceding change ([#2920](#2920)) that standardized notebook paths, thereby addressing issue [#2882](#2882), which was dependent on this upstream correction. By embracing this upgrade, users can engage the most recent dependency version, thereby ensuring the remediation of the aforementioned issue. * Fixed an issue with source table deleted after migration ([#2927](#2927)). In this release, we have addressed an issue where a source table was marked as migrated even after it was deleted following migration. An exception handling mechanism has been added to the `is_migrated` method to return `True` and log a warning message if the source table does not exist, indicating that it has been migrated. A new test function, `test_migration_index_deleted_source`, has also been included to verify the migration index behavior when the source table no longer exists. This function creates a source and destination table, sets the destination table's `upgraded_from` property to the source table, drops the source table, and checks if the migration index contains the source table and if an error message was recorded, indicating that the source table no longer exists. The `get_seen_tables` method remains unchanged in this diff. * Improve robustness of `sqlglot` failure handling ([#2952](#2952)). This PR introduces changes to improve the robustness of error handling in the `sqlglot` library, specifically targeting issues with inadequate parsing quality. The `collect_table_infos` method has been updated and renamed to `collect_used_tables` to accurately gather information about tables used in a SQL expression. The `lint_expression` and `collect_tables` methods have also been updated to use the new `collect_used_tables` method for better accuracy. Additionally, methods such as `find_all`, `walk_expressions`, and the test suite for the SQL parser have been enhanced to handle potential failures and unsupported SQL syntax more gracefully, by returning empty lists or logging warning messages instead of raising errors. These changes aim to improve the reliability and robustness of the `sqlglot` library, enabling it to handle unexpected input more effectively. * Log warnings when mounts are discovered on incorrect cluster type ([#2929](#2929)). The `migrate-tables` command in the ucx project's CLI now includes a verification step to ensure the successful completion of a prerequisite assessment workflow before execution. If this workflow has not been completed, a warning message is logged and the command is not executed. A new exception handling mechanism has been implemented for the `dbutils.fs.mounts()` method, which logs a warning and skips mount point discovery if an exception is raised. A new unit test has been added to verify that a warning is logged when attempting to discover mounts on an incompatible cluster type. The diff also includes a new method `VerifyProgressTracking` for verifying progress tracking and updates to existing test methods to include verification of successful runs and error handling before assessment. These changes improve the handling of edge cases in the mount point discovery process, add warnings for mounts on incorrect cluster types, and increase test coverage with progress tracking verification. * `create-uber-principal` fixes and improvements ([#2941](#2941)). This change introduces fixes and improvements to the `create-uber-principal` functionality within the `databricks-sdk-py` project, specifically targeting the Azure access module. The main enhancements include addressing an issue with the Databricks warehouses API by adding the `set_workspace_warehouse_config_wrapper` function, modifying the command to request the uber principal name only when necessary, improving storage account crawl logic, and introducing new methods to manage workspace-level configurations. Error handling mechanisms have been fortified through added and modified try-except blocks. Additionally, several unit and integration tests have been implemented and verified to ensure the functionality is correct and running smoothly. These changes improve the overall robustness and versatility of the `create-uber-principal` command, directly addressing issues [#2764](#2764), [#2771](#2771), and progressing on [#2949](#2949).

nfx requested a review from a team as a code owner October 14, 2024 16:50

nfx had a problem deploying to account-admin October 14, 2024 16:50 — with GitHub Actions Failure

nfx merged commit 9c1da79 into main Oct 14, 2024
6 of 7 checks passed

nfx deleted the prepare/0.44.0 branch October 14, 2024 16:50

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Release v0.44.0 #2957

Release v0.44.0 #2957

nfx commented Oct 14, 2024

Release v0.44.0 #2957

Release v0.44.0 #2957

Conversation

nfx commented Oct 14, 2024