chore: increase PySpark min version to 3.5.0 #1744
Conversation
```diff
@@ -99,7 +99,7 @@ jobs:
           cache-suffix: ${{ matrix.python-version }}
           cache-dependency-glob: "pyproject.toml"
       - name: install-not-so-old-versions
-        run: uv pip install tox virtualenv setuptools pandas==2.0.3 polars==0.20.8 numpy==1.24.4 pyarrow==15.0.0 "pyarrow-stubs<17" pyspark==3.4.0 scipy==1.8.0 scikit-learn==1.3.0 dask[dataframe]==2024.10 tzdata --system
+        run: uv pip install tox virtualenv setuptools pandas==2.0.3 polars==0.20.8 numpy==1.24.4 pyarrow==15.0.0 "pyarrow-stubs<17" pyspark==3.5.0 scipy==1.8.0 scikit-learn==1.3.0 dask[dataframe]==2024.10 tzdata --system
```
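As a hedged aside, a version floor like this is often also enforced at runtime; the snippet below is a minimal sketch of such a guard (plain tuple comparison on the version string), not narwhals' actual code:

```python
# Minimal sketch of a runtime guard for the new floor; not narwhals' actual
# code. Assumes a plain "X.Y.Z" version string (pre-release suffixes such as
# "4.0.0.dev1" would need a real parser).
import pyspark

MIN_PYSPARK = (3, 5, 0)
installed = tuple(int(p) for p in pyspark.__version__.split(".")[:3])
if installed < MIN_PYSPARK:
    raise ImportError(
        f"pyspark>={'.'.join(map(str, MIN_PYSPARK))} is required, "
        f"found {pyspark.__version__}"
    )
```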
As a bonus we can also test pyspark 3.5.0 with numpy <2.0 without adding a new step (useful for `std` and `var`).
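For context, a minimal sketch of the kind of std/var computation that CI job now exercises against pyspark==3.5.0 and numpy<2.0; the session setup and sample data are illustrative, not the project's actual tests:

```python
# Illustrative only: sample standard deviation and variance via PySpark's
# long-standing stddev_samp/var_samp aggregates (both compute with ddof=1).
from pyspark.sql import SparkSession
import pyspark.sql.functions as F

spark = SparkSession.builder.master("local[1]").getOrCreate()
df = spark.createDataFrame([(1.0,), (2.0,), (4.0,)], ["a"])
df.select(
    F.stddev_samp("a").alias("std"),  # sample standard deviation
    F.var_samp("a").alias("var"),     # sample variance
).show()
```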
Thanks @EdAbati! Separately, we should also add this minimum to MINIMUM_VERSIONS in narwhals/utils.py.
Quite satisfying diff! Thanks @EdAbati! Could you also address the MIN_VERSIONS dictionary in narwhals/utils.py?
Edit: just read @MarcoGorelli's comment.
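A hedged sketch of the requested change; only the pyspark entry is grounded in this thread, and the exact name and shape of the mapping in narwhals/utils.py are assumptions:

```python
# Hypothetical excerpt of narwhals/utils.py. The other entries are elided;
# the pyspark bump is the only value grounded in this PR.
MIN_VERSIONS: dict[str, tuple[int, ...]] = {
    # ... other backends' minimum versions ...
    "pyspark": (3, 5),  # raised from (3, 4) to match this PR
}
```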
What type of PR is this? (check all applicable)
Related issues

- `all`, `any` and `null_count` Spark Expressions #1724 (comment)

Checklist
If you have comments or can explain your changes, please do so below
Pros: makes it easier to implement expressions that come out of the box with PySpark 3.5.
Cons: may hurt adoption, since we would only support the latest PySpark version (though PySpark 3.5.0 is already over a year old).
I think it is fine to focus on API coverage first and worry about supporting older versions later.
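To make the "pros" concrete, here is a hedged sketch of the kind of version gate a 3.5.0 floor removes. The helper name `std_expr` is hypothetical, and the assumption that `F.std` only exists from PySpark 3.5 onward comes from this PR's rationale rather than verified release notes:

```python
# Hedged sketch of a compatibility shim that a pyspark>=3.5 minimum makes
# unnecessary. std_expr is a hypothetical helper; the availability claim
# about F.std is an assumption based on this PR's rationale.
import pyspark
from pyspark.sql import functions as F

_PYSPARK = tuple(int(p) for p in pyspark.__version__.split(".")[:2])

def std_expr(col: str):
    if _PYSPARK >= (3, 5):
        return F.std(col)      # assumed available out of the box in 3.5
    return F.stddev_samp(col)  # fallback for older PySpark (same ddof=1 result)
```

With the minimum raised, only the first branch would remain, which is what "easier to implement expressions" means in practice.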