Databricks: stop including user names in list_jobs #40178

Merged
merged 3 commits into from
Jun 14, 2024


stephenpurcell-db
Contributor

@stephenpurcell-db stephenpurcell-db commented Jun 11, 2024

When querying list_jobs, by default the owning user's name is resolved. This field is not used on the Airflow side, and the lookup is (relatively) expensive.

With this new argument, the lookup is skipped. This will make requests faster and more reliable.


boring-cyborg bot commented Jun 11, 2024

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst)
Here are some useful points:

  • Pay attention to the quality of your code (ruff, mypy and type annotations). Our pre-commits will help you with that.
  • In case of a new feature, add useful documentation (in docstrings or in the docs/ directory). Adding a new operator? Check this short guide. Consider adding an example DAG that shows how users should use it.
  • Consider using Breeze environment for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
  • Always keep your Pull Requests rebased, otherwise your build might fail due to changes not related to your commits.

Apache Airflow is a community-driven project and together we are making it better 🚀.
In case of doubts contact the developers at:
Mailing List: [email protected]
Slack: https://s.apache.org/airflow-slack

@stephenpurcell-db stephenpurcell-db marked this pull request as ready for review June 11, 2024 14:55
@stephenpurcell-db stephenpurcell-db force-pushed the databricks-list-jobs-user-names branch from 0ee91c7 to df3bc11 Compare June 12, 2024 08:10
@potiuk
Member

potiuk commented Jun 12, 2024

Isn't "false" default doing the same?

@stephenpurcell-db
Contributor Author

stephenpurcell-db commented Jun 12, 2024

Isn't "false" default doing the same?

@potiuk include_user_names=false will skip name resolution. Omitting this param defaults it to true in the API.
I provide a default value in list_jobs to allow other usages to explicitly set it to true, following the pattern already used by expand_tasks.
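
A minimal sketch of the pattern described in this reply, not the provider's actual code. The helper name `build_list_jobs_params` and its `limit` and `page_token` handling are hypothetical and exist only for illustration; `include_user_names` and `expand_tasks` are the parameter names mentioned above. The idea is that the flag is always sent explicitly, because omitting it would let the API fall back to its own default of true.

```python
from typing import Any, Dict, Optional


def build_list_jobs_params(
    limit: int = 25,
    expand_tasks: bool = False,
    include_user_names: bool = False,
    page_token: Optional[str] = None,
) -> Dict[str, Any]:
    """Build query parameters for a jobs-list request (hypothetical helper)."""
    params: Dict[str, Any] = {
        "limit": limit,
        "expand_tasks": expand_tasks,
        # Sent explicitly on every request: per the discussion, leaving this
        # key out makes the API default to true and resolve user names.
        "include_user_names": include_user_names,
    }
    # Only include the pagination token when one was supplied.
    if page_token is not None:
        params["page_token"] = page_token
    return params


# Default call: name resolution is skipped.
default_params = build_list_jobs_params()

# A caller that does need user names opts in explicitly.
with_names = build_list_jobs_params(include_user_names=True)
```

Keeping the flag as a keyword argument with a `False` default means existing callers get the faster request without changes, while any usage that needs the names can still pass `include_user_names=True`.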

@potiuk
Member

potiuk commented Jun 12, 2024

Right :D

@potiuk
Member

potiuk commented Jun 12, 2024

tests need fixing though

The user's name is not used on the Airflow side, and this argument saves the lookup, which makes the request faster.
@stephenpurcell-db stephenpurcell-db force-pushed the databricks-list-jobs-user-names branch from df3bc11 to 7ef3178 Compare June 13, 2024 10:51
@stephenpurcell-db
Contributor Author

@potiuk thanks! I think the tests should pass now.

@potiuk
Member

potiuk commented Jun 13, 2024

Almost. BTW, installing pre-commit and running the hooks on the change should auto-fix all static errors.

@stephenpurcell-db stephenpurcell-db force-pushed the databricks-list-jobs-user-names branch from 7ef3178 to d21acac Compare June 14, 2024 12:34
@stephenpurcell-db
Contributor Author

stephenpurcell-db commented Jun 14, 2024

@potiuk rectified now by running the pre-commit hooks, thank you for the guidance.

@potiuk potiuk merged commit a1f9b7d into apache:main Jun 14, 2024
50 checks passed

boring-cyborg bot commented Jun 14, 2024

Awesome work, congrats on your first merged pull request! You are invited to check our Issue Tracker for additional contributions.

@stephenpurcell-db stephenpurcell-db deleted the databricks-list-jobs-user-names branch June 17, 2024 11:38
romsharon98 pushed a commit to romsharon98/airflow that referenced this pull request Jul 26, 2024
* Databricks: stop including user names in `list_jobs`

The user's name is not used on the Airflow side, and this argument saves the lookup, which makes the request faster.