[BUG]: UCX Assessment workflow fails with error -- DBFS library installations are not supported on DBR 15 or above. #1096

Closed
1 task done
rkkalluri opened this issue Mar 25, 2024 · 4 comments · Fixed by #1097

Comments

@rkkalluri

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

Describe the bug

UCX Assessment workflow fails with error

run failed with error message
Library installation failed for library due to user error for whl: "dbfs:/Applications/ucx/wheels/databricks_labs_ucx-0.18.0-py3-none-any.whl"
Error messages:
Library installation attempted on the driver node of cluster 0325-034406-7snc4yap and failed. Please refer to the following error message to fix the library or contact Databricks support. Error Code: DRIVER_LIBRARY_INSTALLATION_FAILURE. Error Message: com.databricks.api.base.DatabricksServiceException: BAD_REQUEST: DBFS library installations are not supported on DBR 15 or above.

I updated the compute profile generated by the installer to downgrade to 14.3, but I am unable to switch the workflow to 14.3: it keeps complaining that 14.3 is not in the allowed list, which contains only 15.0.

Steps to Reproduce

Install UCX with databricks labs install ucx --profile dbx-profile
Run Assessment workflow

Setup and versions

UCX - 0.18
Databricks cli - 0.215

Expected Behavior

No response

Steps To Reproduce

No response

Cloud

Azure

Operating System

Windows

Version

latest via Databricks CLI

Relevant log output

No response

Collaborator

nfx commented Mar 25, 2024

Thanks for the report, we’ll look into it shortly

@nfx
Collaborator

nfx commented Mar 25, 2024

@rkkalluri as a stopgap, can you change the UCX cluster policy and use the LTS runtime instead?
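A minimal sketch of what that stopgap could look like, assuming the policy definition is edited as JSON and that `14.3.x-scala2.12` is the runtime key for 14.3 LTS in the workspace. The `pin_runtime` helper and the sample policy are hypothetical, not part of UCX; whether the workflow then accepts the new policy is a separate question.

```python
import json

# Assumption: this is the runtime key for 14.3 LTS in your workspace.
LTS_RUNTIME = "14.3.x-scala2.12"


def pin_runtime(policy_definition: str, runtime: str = LTS_RUNTIME) -> str:
    """Return a copy of a policy definition (JSON text) with spark_version fixed to `runtime`."""
    definition = json.loads(policy_definition)
    # Overwrite only the spark_version rule; every other rule is left untouched.
    definition["spark_version"] = {"type": "fixed", "value": runtime}
    return json.dumps(definition, indent=2)


# Hypothetical policy as an installer might have generated it against DBR 15.
current = '{"spark_version": {"type": "fixed", "value": "15.0.x-scala2.12"}}'
print(pin_runtime(current))
```

The printed JSON can be pasted into the policy's definition in the workspace UI. Job clusters that reference the policy may need to be updated as well.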

nfx added a commit that referenced this issue Mar 25, 2024
@nfx nfx closed this as completed in #1097 Mar 25, 2024
nfx added a commit that referenced this issue Mar 25, 2024
Fixes #1096

This is a quickfix. The long-term solution is tracked in #1098
nfx added a commit that referenced this issue Mar 25, 2024
* Added instance pool id to WorkspaceConfig ([#1087](#1087)). In this release, the `create` method of the `_policy_installer` object has been updated to return an additional value, `instance_pool_id`, which is then assigned and passed as an argument to the `WorkspaceConfig` object in the `_configure_new_installation` method. The `ClusterPolicyInstaller` class in the `v0.15.0_added_cluster_policy.py` file has also been updated to return a fourth value, `instance_pool_id`, from the `create` method, allowing for more flexibility in future enhancements. Additionally, the test function `test_table_migration_job` in the `test_installation.py` file has been updated to skip when the script is not being run as part of a nightly test job or in debug mode, and the test functions in the `test_policy.py` file have been updated to reflect the new return value in the `create` method. These changes enable better management and scaling of resources through instance pools, provide more granular control in the WorkspaceConfig, and improve testing efficiency.
* Added more cross-linking between CLI commands ([#1091](#1091)). In this release, we have introduced several enhancements to our open-source library's Command Line Interface (CLI) and documentation. Specifically, we have added more cross-linking between CLI commands to improve navigation and usability. The documentation has been updated to include a new step in the UCX installation process, where users are required to run the assessment workflow after installing UCX. This workflow is the first step in the migration process and checks the compatibility of the user's workspace with Unity Catalog. Additionally, we have added new commands for principal-prefix-access, migrate-credentials, and migrate-locations, which are part of the table migration process. These new commands require the assessment workflow and group migration workflow to be completed before they can be executed. Overall, these changes aim to provide a more streamlined and detailed installation and migration process, improving the user experience for software engineers.
* Fixed command references in README.md ([#1093](#1093)). In this release, we have corrected the command references in the README.md file to improve the readability and usability of the documentation. Specifically, we have updated the links for the `migrate-locations` and `validate_external_locations` commands to use the correct syntax, enclosing them in backticks to denote code. This change ensures that the links are correctly interpreted as commands and addresses any issues that may have arisen with their previous formatting. No new methods have been added in this release, and the existing behavior of the commands is unchanged.
* Fixing the issue in workspace id flag in create-account-group command ([#1094](#1094)). In this update, we have improved the `create_account_group` command related to the `workspace_ids` flag in our open-source library. The `workspace_ids` flag's type has been changed from `list[int] | None` to `str | None`, allowing for easier input of multiple workspace IDs as a string of comma-separated integers. The `create_account_level_groups` function in the `AccountWorkspaces` class has been updated to accept this string and convert it to a list of integers before proceeding. To ensure proper functioning, we added a new test case `test_create_account_groups_with_id()` to check if the command handles the case when no workspace IDs are provided in the configuration. The `create_account_groups()` method now checks for this condition and raises a `ValueError`. Furthermore, the `manual_workspace_info()` method has been updated to handle workspace name input by the user, receiving the `ws` object, along with prompts that contain the user input for the workspace name and the next workspace ID.
* Rely UCX on the latest 14.3 LTS DBR instead of 15.x ([#1097](#1097)). In this release, we have implemented a quick fix to rely on the Long Term Support (LTS) version 14.3 of the Databricks Runtime (DBR) instead of 15.x for UCX, addressing issue [#1096](#1096). This change affects the `_definition` function, which has been modified to use the latest LTS DBR instead of the latest Spark version. The `latest_lts_dbr` variable is now assigned the value returned by the `select_spark_version` method with the `latest=True` and `long_term_support=True` parameters. The `spark_version` key in the `policy_definition` dictionary is set to the value returned by the `_policy_config` method with `latest_lts_dbr` as the argument. Additionally, in the `tests/unit/installer/test_policy.py` file, the `select_spark_version` method of the `clusters` object has been updated to accept any number of arguments and consistently return the string "14.2.x-scala2.12", allowing for greater flexibility. This is a temporary solution, with a more comprehensive fix being tracked in issue [#1098](#1098). Developers should be aware of how the `clusters` object is used in the codebase when adopting this project.
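The runtime selection described in the last entry can be sketched roughly as follows. This is an illustration under stated assumptions, not the UCX source: `StubClusters` stands in for the workspace clusters API so the example runs offline, and the `{"type": "fixed", ...}` shape returned by `policy_config` is assumed. Only the `select_spark_version(latest=True, long_term_support=True)` call and the `spark_version` key come from the release note.

```python
class StubClusters:
    """Offline stand-in for the workspace clusters API (hypothetical)."""

    def select_spark_version(self, *args, **kwargs) -> str:
        # Accepts any arguments and always returns the same runtime key,
        # similar in spirit to the unit-test mock described above.
        return "14.3.x-scala2.12"


def policy_config(value: str) -> dict:
    # Assumed shape of a fixed-value policy rule.
    return {"type": "fixed", "value": value}


def definition(clusters) -> dict:
    # Ask for the latest LTS runtime rather than the latest runtime overall.
    latest_lts_dbr = clusters.select_spark_version(latest=True, long_term_support=True)
    return {"spark_version": policy_config(latest_lts_dbr)}


policy_definition = definition(StubClusters())
print(policy_definition)
```

With a real workspace client, the stub would be replaced by the client's `clusters` object, so the pinned runtime follows whichever LTS release is current.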
@nfx nfx mentioned this issue Mar 25, 2024
@rkkalluri
Author

I tried changing the cluster policy and switching the cluster on the workflow to use the new policy. It would not accept 14.3 and failed validation. That may be another bug.

@gvidaspr

gvidaspr commented Apr 8, 2024

Hello,
Have you been able to solve this issue? I am still running into it. Message: com.databricks.api.base.DatabricksServiceException: BAD_REQUEST: DBFS library installations are not supported on DBR 15 or above.

Running on Azure Databricks
