[DAB] Add support for requirements libraries in Job Tasks #1543

Merged
merged 5 commits into from
Aug 21, 2024

Conversation

Contributor

@witi83 witi83 commented Jul 1, 2024

Changes

While experimenting with DAB I discovered that requirements libraries are being ignored.

One thing worth mentioning is that bundle validate runs successfully, but bundle deploy fails. This PR only covers the second part.

Tests

Added a unit test

Contributor

@andrewnester andrewnester left a comment

Because requirements.txt can be referenced from the local file system, we need to correctly rewrite the local path to the remote path, similar to what we do for wheels
https://github.com/databricks/cli/blob/main/bundle/artifacts/artifacts.go#L181-L184

or for any jobs paths https://github.com/databricks/cli/blob/main/bundle/config/mutator/translate_paths_jobs.go#L21-L54
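
The local-to-remote rewrite described in this comment can be illustrated with a minimal sketch. This is not the code from the linked files: `isLocalPath` and `toRemotePath` are hypothetical helpers, the remote root in `main` is made up, and the rule that absolute paths and scheme-prefixed paths (such as `dbfs:/`) count as remote is an assumption made for the example.

```go
package main

import (
	"fmt"
	"path"
	"path/filepath"
	"strings"
)

// isLocalPath is a hypothetical helper. It reports whether a library path
// points at the local file system. Assumption for this sketch: absolute
// paths and scheme-prefixed paths (dbfs:/, s3://, ...) are already remote.
func isLocalPath(p string) bool {
	if strings.HasPrefix(p, "/") || strings.Contains(p, ":/") {
		return false
	}
	return true
}

// toRemotePath is a hypothetical helper. It maps a local relative path to
// its location under remoteRoot and leaves remote paths untouched.
func toRemotePath(remoteRoot, localPath string) string {
	if !isLocalPath(localPath) {
		return localPath
	}
	// path.Join also cleans the result, so a leading "./" disappears.
	return path.Join(remoteRoot, filepath.ToSlash(localPath))
}

func main() {
	// The remote root below is an illustrative value, not a real default.
	root := "/Workspace/Users/someone@example.com/.bundle/demo/files"
	fmt.Println(toRemotePath(root, "./local/path/to/requirements.txt"))
	fmt.Println(toRemotePath(root, "dbfs:/libs/requirements.txt"))
}
```

The key design point is that the rewrite is conditional: paths that already refer to a remote location pass through unchanged.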

Contributor Author

witi83 commented Jul 26, 2024

@andrewnester Thanks for the feedback. I'll try to work on this in the coming days.

Contributor Author

witi83 commented Jul 29, 2024

Hey @andrewnester, I updated the PR (and rebased it onto the latest main). Please check whether this is going in the right direction.

I'm not very familiar with the codebase, so I used the existing source and test code as orientation. However, I'm not sure how to write a unit test for translate_paths_jobs.

bundle/artifacts/artifacts.go (outdated, resolved)
bundle/artifacts/artifacts_test.go (outdated, resolved)
bundle/config/mutator/translate_paths_jobs.go (outdated, resolved)
Contributor Author

witi83 commented Jul 30, 2024

@andrewnester Thanks for the extensive feedback. I adapted the PR accordingly. I was also able to add a test in translate_paths_test.

Contributor

@andrewnester andrewnester left a comment

Thanks for the contribution! I think there's a bit of confusion about the approach we need to take, so it's better if I explain the expectations a bit more.

The way this should work is customers specify the following section in their bundle config

libraries:
- requirements: ./local/path/to/requirements.txt

and nothing else.

What DABs should do is

  1. Upload requirements.txt to the workspace file system. This happens automatically, the same way it does for notebooks, for example.
  2. Rewrite the local path to the remote path (where requirements.txt was uploaded) in the bundle config, similar to how this already works for notebooks. This is what needs to be implemented in this PR.

You don't need to use artifacts. Artifacts are things that DABs build, package, and then upload, and none of that is needed for requirements.txt.

I hope this clarifies what needs to be done
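
The two steps above can be simulated end to end in a small sketch. Everything here is an assumption made for illustration: `Library` is a simplified stand-in for a job task library entry, `uploadFiles` and `rewriteRequirements` are hypothetical helpers, and the workspace is modeled as an in-memory map rather than a real API call.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// Library is a simplified, hypothetical stand-in for a job task library entry.
type Library struct {
	Requirements string // path to a requirements.txt file, if set
	Whl          string // path to a wheel, if set
}

// uploadFiles simulates step 1: every local file is copied under remoteRoot.
// Keys are file paths and values are file contents.
func uploadFiles(local map[string]string, remoteRoot string) map[string]string {
	remote := make(map[string]string, len(local))
	for p, content := range local {
		remote[path.Join(remoteRoot, p)] = content
	}
	return remote
}

// rewriteRequirements simulates step 2: relative requirements paths are
// replaced in place with their location under remoteRoot.
func rewriteRequirements(libs []Library, remoteRoot string) {
	for i := range libs {
		req := libs[i].Requirements
		// Skip entries without requirements and paths that already look remote
		// (an assumption of this sketch: absolute or scheme-prefixed paths).
		if req == "" || strings.HasPrefix(req, "/") || strings.Contains(req, ":/") {
			continue
		}
		libs[i].Requirements = path.Join(remoteRoot, req)
	}
}

func main() {
	root := "/Workspace/Users/someone@example.com/.bundle/demo/files" // illustrative value
	local := map[string]string{"local/path/to/requirements.txt": "requests\n"}
	libs := []Library{{Requirements: "./local/path/to/requirements.txt"}}

	remote := uploadFiles(local, root)
	rewriteRequirements(libs, root)

	// After both steps, the rewritten path must point at an uploaded file.
	_, ok := remote[libs[0].Requirements]
	fmt.Println(libs[0].Requirements, ok)
}
```

The final check in `main` captures the invariant the comment is after: once the config is rewritten, the path in the job definition refers to a file that actually exists in the workspace.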

bundle/config/artifact.go (outdated, resolved)
bundle/artifacts/artifacts_test.go (outdated, resolved)
Contributor Author

witi83 commented Jul 30, 2024

Hey @andrewnester

Thank you very much! As mentioned, I'm new to this codebase, so I really appreciate your guidance! It's definitely quite an interesting task for getting familiar with DAB. 😉

I reverted the changes regarding artifacts (as a requirements.txt is not seen as an artifact).

Please check if it now makes sense to you

@andrewnester
Contributor

Confirmed manually that requirements.txt is uploaded and used correctly on DBR 15.1+

@andrewnester andrewnester added this pull request to the merge queue Aug 21, 2024
@andrewnester
Contributor

Thanks for the contribution, @witi83!

Merged via the queue into databricks:main with commit 192f33b Aug 21, 2024
5 checks passed
@witi83 witi83 deleted the support-req-library-in-tasks branch August 21, 2024 10:13
andrewnester added a commit that referenced this pull request Aug 21, 2024
CLI:
 * Added filtering flags for cluster list commands ([#1703](#1703)).

Bundles:
 * Remove reference to "dbt" in the default-sql template ([#1696](#1696)).
 * Pause continuous pipelines when 'mode: development' is used ([#1590](#1590)).
 * Add configurable presets for name prefixes, tags, etc. ([#1490](#1490)).
 * Report all empty resources present in error diagnostic ([#1685](#1685)).
 * Improves detection of PyPI package names in environment dependencies ([#1699](#1699)).
 * [DAB] Add support for requirements libraries in Job Tasks ([#1543](#1543)).
 * Add paths field to bundle sync configuration ([#1694](#1694)).

Internal:
 * Add `import` option for PyDABs ([#1693](#1693)).
 * Make fileset take optional list of paths to list ([#1684](#1684)).
 * Pass through paths argument to libs/sync ([#1689](#1689)).
 * Correctly mark package names with versions as remote libraries ([#1697](#1697)).
 * Share test initializer in common helper function ([#1695](#1695)).
 * Make `pydabs/venv_path` optional ([#1687](#1687)).
 * Use API mocks for duplicate path errors in workspace files extensions client ([#1690](#1690)).
 * Fix prefix preset used for UC schemas ([#1704](#1704)).
github-merge-queue bot pushed a commit that referenced this pull request Aug 22, 2024
4 participants