-
Notifications
You must be signed in to change notification settings - Fork 123
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
REFACTOR: Unite Your Requirements! #1068
Conversation
P.S. I've updated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there! Thanks for your PR. We definitly started to be opinionated to suggest uv in our contribution guide. However I have mixed feelings about this one. Let me try to mention a few reason (sorted by dev experience):
- If someone's workflow is not on uv, then how can he/she/they pick up dev-dependencies?
uv sync
is too strict for my (but I guess many) development settings and to avoid that it doesn't uninstall the rest of the libraries in the env, I need to remember to runuv sync --inexact
instead- For a few set up based on docker (e.g. devcontainers and codespaces), having a file allows to easily copy and install the dependencies before even copying the python package. This leads to easier caching (at least in my experience).
pyproject.toml
Outdated
modin = ["modin"] | ||
pandas = ["pandas>=0.25.3"] | ||
polars = ["polars>=0.20.3"] | ||
pyarrow = ["pyarrow>=11.0.0"] | ||
dask = ["dask[dataframe]>=2024.7"] | ||
dask = ["dask[dataframe]>=2023.5.0"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe Marco recalls why that was the min version
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why this was resolved, but 2024.7 is indeed the min version we can support, as we need dask_expr.
Buongiorno Francesco (@FBruzzesi), Embracing change can indeed be challenging, and that's perfectly normal.
It's understandable that this might seem like a hurdle. However, I see it as an exciting chance to introduce the efficiency of
I understand that might be inconvenient, but this rigidity also promotes consistency in workflows.
From what I understand, Docker caching should remain intact whether using
Please let me know if there's anything specific you find challenging, and I’d be happy to discuss further. |
Definitely, and we would like to let people use whatever workflow works for them (not only pip) as well. Thanks for adjusting the pyproject section.
The above snippet would run from ...
COPY ./requirements.txt /requirements.txt
RUN uv pip install -r requirements.txt # this will be cached until something changes in requirements.txt
COPY <rest of codebase> <rest of codebase>
RUN uv pip install .
... There should be a way to link requirements.txt dynamically into pyproject, so personally that's how I would do it but happy to discuss |
thanks - you'll need to update the instructions for this to work (e.g. ci, contributing) fancy making a PR first, so that github automatically runs ci for you, and then coming back to this? |
Ciao @FBruzzesi,
When trying on the local machine it felt like there was a caching of
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @akmalsoliev thanks again for the effort. The workflow you are proposing seems to work quite well and being able to cache layers properly in docker.
Given that is sorted out, I left some comments that probably require attention and, as @MarcoGorelli mentioned, we need this to be incorporated in the ci/cd workflows properly.
If possible including nox and the contributing documentation as well!
Let me know if I can help in any way forward 😉
pyproject.toml
Outdated
modin = ["modin"] | ||
pandas = ["pandas>=0.25.3"] | ||
polars = ["polars>=0.20.3"] | ||
pyarrow = ["pyarrow>=11.0.0"] | ||
dask = ["dask[dataframe]>=2024.7"] | ||
dask = ["dask[dataframe]>=2023.5.0"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why this was resolved, but 2024.7 is indeed the min version we can support, as we need dask_expr.
pyproject.toml
Outdated
dev = [ | ||
"covdefaults>=2.3.0", | ||
"duckdb>=1.1.1", | ||
"hypothesis>=6.112.1", | ||
"pandas>=2.0.3", | ||
"polars>=1.8.1", | ||
"pre-commit>=3.5.0", | ||
"pyarrow>=17.0.0", | ||
"pytest>=8.3.3", | ||
"pytest-cov>=5.0.0", | ||
"pytest-env>=1.1.5", | ||
"pytest-randomly>=3.15.0", | ||
"scikit-learn>=1.3.2", | ||
"tqdm>=4.66.5", | ||
"typing-extensions>=4.12.2", | ||
"dask[dataframe]>=2023.5.0" | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For running in ci/cd we should not pin any of these, as we want to have some old and random versions as well. Also dask requires python 3.9 or above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not an issue, was more of demonstration, will be fixed
pyproject.toml
Outdated
"mkdocstrings[python]>=0.26.1", | ||
"polars>=1.0.0", | ||
] | ||
#cudf = ["cudf>=23.08.00"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason for commenting this one out? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, sorry for the delayed response. If I recall correctly, @MarcoGorelli brought up this issue regarding installing cuDF with conda. The installation process is quite complex, and I'm not sure how to resolve it.
With the arrival of a lightning-fast package manager, mighty UV, the requirements.txt is like yesterday's news. Instead of navigating through a jungle of dependencies, like a mismatched sock drawer, everything can now unite in one glorious spot—just like Italy in 1861!
This marvelous change even makes installing dependencies blazingly fast (except for cudf—polars is better anyways). Now it's as easy as pie with just one command: uv sync --all-extras. No more typing three commands when you can do it in one! 1 > 3 — gives you more time to enjoy that delicious Italian caffè.
What type of PR is this? (check all applicable)
Checklist
If you have comments or can explain your changes, please do so below.