Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

REF/DEPR: Deprecate AbstractMethodError #54408

Closed
wants to merge 2 commits into from
Closed

REF/DEPR: Deprecate AbstractMethodError #54408

wants to merge 2 commits into from

Conversation

twoertwein
Copy link
Member

@twoertwein twoertwein commented Aug 4, 2023

Squashed+rebased version of #48913

closes #48909

@twoertwein twoertwein added Refactor Internal refactoring of code Error Reporting Incorrect or improved errors from pandas labels Aug 4, 2023
@twoertwein twoertwein added this to the 3.0 milestone Aug 4, 2023
@twoertwein twoertwein requested a review from rhshadrach as a code owner August 4, 2023 13:58
@lithomas1 lithomas1 modified the milestones: 3.0, 2.1 Aug 5, 2023
@twoertwein
Copy link
Member Author

ASV is failing on github - how do I know which part(s) significantly slowed down? There is no summary and there seems to be no keyword to grep for

@mroeschke mroeschke modified the milestones: 2.1, 3.0 Aug 5, 2023
@mroeschke
Copy link
Member

ASV is failing on github - how do I know which part(s) significantly slowed down? There is no summary and there seems to be no keyword to grep for

The CI just runs the ASVs to see that they have to work, you'll have to run asv continuous to see the comparison.

You can search for failed to see the failed benchmarks

@rhshadrach
Copy link
Member

Running ASV with --show-stderr will help identify issues. Unfortunately it also outputs anytime we raise NotImplementedError to indicate a combination of the parameters should not be run. Still, I'm thinking we should add this flag to the CI.

@twoertwein
Copy link
Member Author

I tried to locally run ASV and set the python environment up as documented. When running ASV I get errors that there is no module build:

python3 -m venv ~/virtualenvs/pandas-dev
. ~/virtualenvs/pandas-dev/bin/activate
python -m pip install -r requirements-dev.txt

cd asv_bench
# change 3.10 to 3.11 in asv.conf.json
 asv continuous -f 1.1 -E virtualenv upstream/main abstract2
Couldn't load asv.plugins.mamba because
No module named 'mamba'
· Creating environments
· Discovering benchmarks
·· Uninstalling from virtualenv-py3.11-Cython0.29.33-jinja2-matplotlib-meson-meson-python-numba-numexpr-odfpy-openpyxl-pyarrow-python-build-scipy-sqlalchemy-tables-xlrd-xlsxwriter
·· Building a9cffe36 <abstract2> for virtualenv-py3.11-Cython0.29.33-jinja2-matplotlib-meson-meson-python-numba-numexpr-odfpy-openpyxl-pyarrow-python-build-scipy-sqlalchemy-tables-
xlrd-xlsxwriter
·· Error running /var/home/twoertwein/git/pandas/asv_bench/env/5d81e97ed2503fcaf484d1053815874d/bin/python -m build -Cbuilddir=builddir --wheel --outdir /var/home/twoertwein/git/pa
ndas/asv_bench/env/5d81e97ed2503fcaf484d1053815874d/asv-build-cache/a9cffe36a4fd08ef4ff7bc6c1caf34831f459090 /var/home/twoertwein/git/pandas/asv_bench/env/5d81e97ed2503fcaf484d1053
815874d/project (exit status 1)
   STDOUT -------->
   
   STDERR -------->
   /var/home/twoertwein/git/pandas/asv_bench/env/5d81e97ed2503fcaf484d1053815874d/bin/python: No module named build

·· Failed: trying different commit/environment
·· Uninstalling from virtualenv-py3.11-Cython0.29.33-jinja2-matplotlib-meson-meson-python-numba-numexpr-odfpy-openpyxl-pyarrow-python-build-scipy-sqlalchemy-tables-xlrd-xlsxwriter
·· Building 766e2fc7 <main> for virtualenv-py3.11-Cython0.29.33-jinja2-matplotlib-meson-meson-python-numba-numexpr-odfpy-openpyxl-pyarrow-python-build-scipy-sqlalchemy-tables-xlrd-
xlsxwriter
·· Error running /var/home/twoertwein/git/pandas/asv_bench/env/5d81e97ed2503fcaf484d1053815874d/bin/python -m build -Cbuilddir=builddir --wheel --outdir /var/home/twoertwein/git/pa
ndas/asv_bench/env/5d81e97ed2503fcaf484d1053815874d/asv-build-cache/766e2fc7ceb99f7bb582f96ae8f3c3edc8b5b457 /var/home/twoertwein/git/pandas/asv_bench/env/5d81e97ed2503fcaf484d1053
815874d/project (exit status 1)
   STDOUT -------->
   
   STDERR -------->
   /var/home/twoertwein/git/pandas/asv_bench/env/5d81e97ed2503fcaf484d1053815874d/bin/python: No module named build

·· Failed: trying different commit/environment
·· Uninstalling from virtualenv-py3.11-Cython0.29.33-jinja2-matplotlib-meson-meson-python-numba-numexpr-odfpy-openpyxl-pyarrow-python-build-scipy-sqlalchemy-tables-xlrd-xlsxwriter
·· Building ac8d34dd <abstract2~1> for virtualenv-py3.11-Cython0.29.33-jinja2-matplotlib-meson-meson-python-numba-numexpr-odfpy-openpyxl-pyarrow-python-build-scipy-sqlalchemy-table
s-xlrd-xlsxwriter
·· Error running /var/home/twoertwein/git/pandas/asv_bench/env/5d81e97ed2503fcaf484d1053815874d/bin/python -m build -Cbuilddir=builddir --wheel --outdir /var/home/twoertwein/git/pa
ndas/asv_bench/env/5d81e97ed2503fcaf484d1053815874d/asv-build-cache/ac8d34dd9a3a05bf0c5108e808d3ab0a4e80a93b /var/home/twoertwein/git/pandas/asv_bench/env/5d81e97ed2503fcaf484d1053
815874d/project (exit status 1)
   STDOUT -------->

   STDERR -------->
   /var/home/twoertwein/git/pandas/asv_bench/env/5d81e97ed2503fcaf484d1053815874d/bin/python: No module named build

·· Failed to build the project and import the benchmark suite.

@rhshadrach
Copy link
Member

rhshadrach commented Sep 6, 2023

I was able to pull this branch and run as-is. Maybe delete the ASV cache and see if that resolves it?

Also - this is the actual removal, not the deprecation, yea? Is this just to see perf impact to decide whether we want to deprecate?

@twoertwein
Copy link
Member Author

I was able to pull this branch and run as-is. Maybe delete the ASV cache and see if that resolves it?

Thank you, I will try that (but I think it looked more like an issue with the conda-less environment).

Also - this is the actual removal, not the deprecation, yea? Is this just to see perf impact to decide whether we want to deprecate?

It is for deprecation but already removing the internal usage of AbstractMethodError: either replaced by making astractmethod (inheriting from ABC can have a perf impact but IMHO this is more explicit that subclasses have to implement a method) or raising NotImplementedError (no perf impact)

@twoertwein
Copy link
Member Author

Using mamba for the development environment seems to fix the ASV issues for me. I will run the complete ASV benchmark overnight.

@twoertwein
Copy link
Member Author

twoertwein commented Sep 21, 2023

I ran asv continuous -f 1.1 upstream/main abstract2 > log and it seems that four read_csv tested failed:

$ grep failed log
[62.20%] ··· ...CSVDatePyarrowEngine.time_read_csv_index_col             failed
[62.31%] ··· io.csv.ReadCSVIndexCol.time_read_csv_index_col              failed
[87.20%] ··· ...CSVDatePyarrowEngine.time_read_csv_index_col             failed
[87.31%] ··· io.csv.ReadCSVIndexCol.time_read_csv_index_col              failed
$ asv continuous -f 1.1 upstream/main abstract2 -b time_read_csv_index_col
· Creating environments
· Discovering benchmarks
·· Uninstalling from conda-py3.10-Cython0.29.33-jinja2-matplotlib-meson-meson-python-numba-numexpr-odfpy-openpyxl-pyarrow-pytables-python-build-scipy-sqlalchemy-xlrd-xlsxwriter.
·· Installing 598c4f5c <abstract2> into conda-py3.10-Cython0.29.33-jinja2-matplotlib-meson-meson-python-numba-numexpr-odfpy-openpyxl-pyarrow-pytables-python-build-scipy-sqlalchemy-xlrd-xlsxwriter.
· Running 4 total benchmarks (2 commits * 1 environments * 2 benchmarks)
[ 0.00%] · For pandas commit 2a08b052 <abstract2~2> (round 1/2):
[ 0.00%] ·· Building for conda-py3.10-Cython0.29.33-jinja2-matplotlib-meson-meson-python-numba-numexpr-odfpy-openpyxl-pyarrow-pytables-python-build-scipy-sqlalchemy-xlrd-xlsxwriter...
[ 0.00%] ·· Benchmarking conda-py3.10-Cython0.29.33-jinja2-matplotlib-meson-meson-python-numba-numexpr-odfpy-openpyxl-pyarrow-pytables-python-build-scipy-sqlalchemy-xlrd-xlsxwriter
[12.50%] ··· Running (io.csv.ReadCSVDatePyarrowEngine.time_read_csv_index_col--)..
[25.00%] · For pandas commit 598c4f5c <abstract2> (round 1/2):
[25.00%] ·· Building for conda-py3.10-Cython0.29.33-jinja2-matplotlib-meson-meson-python-numba-numexpr-odfpy-openpyxl-pyarrow-pytables-python-build-scipy-sqlalchemy-xlrd-xlsxwriter..
[25.00%] ·· Benchmarking conda-py3.10-Cython0.29.33-jinja2-matplotlib-meson-meson-python-numba-numexpr-odfpy-openpyxl-pyarrow-pytables-python-build-scipy-sqlalchemy-xlrd-xlsxwriter
[37.50%] ··· Running (io.csv.ReadCSVDatePyarrowEngine.time_read_csv_index_col--)..
[50.00%] · For pandas commit 598c4f5c <abstract2> (round 2/2):
[50.00%] ·· Benchmarking conda-py3.10-Cython0.29.33-jinja2-matplotlib-meson-meson-python-numba-numexpr-odfpy-openpyxl-pyarrow-pytables-python-build-scipy-sqlalchemy-xlrd-xlsxwriter
[62.50%] ··· io.csv.ReadCSVDatePyarrowEngine.time_read_csv_index_col                                                                                                                                                                   failed
[75.00%] ··· io.csv.ReadCSVIndexCol.time_read_csv_index_col                                                                                                                                                                            failed
[75.00%] · For pandas commit 2a08b052 <abstract2~2> (round 2/2):
[75.00%] ·· Building for conda-py3.10-Cython0.29.33-jinja2-matplotlib-meson-meson-python-numba-numexpr-odfpy-openpyxl-pyarrow-pytables-python-build-scipy-sqlalchemy-xlrd-xlsxwriter..
[75.00%] ·· Benchmarking conda-py3.10-Cython0.29.33-jinja2-matplotlib-meson-meson-python-numba-numexpr-odfpy-openpyxl-pyarrow-pytables-python-build-scipy-sqlalchemy-xlrd-xlsxwriter
[87.50%] ··· io.csv.ReadCSVDatePyarrowEngine.time_read_csv_index_col                                                                                                                                                                   failed
[100.00%] ··· io.csv.ReadCSVIndexCol.time_read_csv_index_col                                                                                                                                                                            failed

Those failures look unrelated to this PR:

[100.00%] ··· io.csv.ReadCSVIndexCol.time_read_csv_index_col                                                                                                                                                                            failed
[100.00%] ···· Traceback (most recent call last):
                 File "/var/home/twoertwein/git/pandas/asv_bench/env/36436ace7d7eead1c76ef118fd27f1fa/lib/python3.10/site-packages/asv_runner/server.py", line 179, in _run_server
                   _run(run_args)
                 File "/var/home/twoertwein/git/pandas/asv_bench/env/36436ace7d7eead1c76ef118fd27f1fa/lib/python3.10/site-packages/asv_runner/run.py", line 72, in _run
                   result = benchmark.do_run()
                 File "/var/home/twoertwein/git/pandas/asv_bench/env/36436ace7d7eead1c76ef118fd27f1fa/lib/python3.10/site-packages/asv_runner/benchmarks/_base.py", line 661, in do_run
                   return self.run(*self._current_params)
                 File "/var/home/twoertwein/git/pandas/asv_bench/env/36436ace7d7eead1c76ef118fd27f1fa/lib/python3.10/site-packages/asv_runner/benchmarks/time.py", line 165, in run
                   samples, number = self.benchmark_timing(
                 File "/var/home/twoertwein/git/pandas/asv_bench/env/36436ace7d7eead1c76ef118fd27f1fa/lib/python3.10/site-packages/asv_runner/benchmarks/time.py", line 258, in benchmark_timing
                   timing = timer.timeit(number)
                 File "/var/home/twoertwein/git/pandas/asv_bench/env/36436ace7d7eead1c76ef118fd27f1fa/lib/python3.10/timeit.py", line 178, in timeit
                   timing = self.inner(it, self.timer)
                 File "<timeit-src>", line 6, in inner
                 File "/var/home/twoertwein/git/pandas/asv_bench/benchmarks/io/csv.py", line 606, in time_read_csv_index_col
                   read_csv(self.StringIO_input, index_col="a")
                 File "/var/home/twoertwein/git/pandas/asv_bench/env/36436ace7d7eead1c76ef118fd27f1fa/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 973, in read_csv
                   return _read(filepath_or_buffer, kwds)
                 File "/var/home/twoertwein/git/pandas/asv_bench/env/36436ace7d7eead1c76ef118fd27f1fa/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 622, in _read
                   parser = TextFileReader(filepath_or_buffer, **kwds)
                 File "/var/home/twoertwein/git/pandas/asv_bench/env/36436ace7d7eead1c76ef118fd27f1fa/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 1522, in __init__
                   self._engine = self._make_engine(f, self.engine) 
                 File "/var/home/twoertwein/git/pandas/asv_bench/env/36436ace7d7eead1c76ef118fd27f1fa/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 1797, in _make_engine
                   return mapping[engine](f, **self.options)
                 File "/var/home/twoertwein/git/pandas/asv_bench/env/36436ace7d7eead1c76ef118fd27f1fa/lib/python3.10/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 93, in __init__
                   self._reader = parsers.TextReader(src, **kwds)   
                 File "parsers.pyx", line 585, in pandas._libs.parsers.TextReader.__cinit__
               pandas.errors.EmptyDataError: No columns to parse from file
               asv: benchmark failed (exit status 1)

Does ASV still highlight regressions when some tests fail, or do I have to wait until the benchmarks work again to test whether regressions exists? The log file I created just ends with

$ tail log
[99.82%] ··· timedelta.DatetimeAccessor.time_dt_accessor                196±3ns
[99.85%] ··· timedelta.DatetimeAccessor.time_timedelta_days         1.23±0.01ms
[99.87%] ··· ...DatetimeAccessor.time_timedelta_microseconds        1.15±0.05ms
[99.89%] ··· ....DatetimeAccessor.time_timedelta_nanoseconds        1.13±0.02ms
[99.91%] ··· ...elta.DatetimeAccessor.time_timedelta_seconds        1.15±0.09ms
[99.93%] ··· Setting up tslibs.timedelta:55                                  ok
[99.93%] ··· ...elta.TimedeltaProperties.time_timedelta_days            204±6ns
[99.96%] ··· ...edeltaProperties.time_timedelta_microseconds            200±2ns
[99.98%] ··· ...medeltaProperties.time_timedelta_nanoseconds            195±7ns
[100.00%] ··· ...a.TimedeltaProperties.time_timedelta_seconds            201±2ns

@rhshadrach
Copy link
Member

or do I have to wait until the benchmarks work again to test whether regressions exists?

Benchmarks are run on every commit to main and it is currently green. I don't think these should be failing. I should have some time to try over the weekend.

@github-actions
Copy link
Contributor

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

@github-actions github-actions bot added the Stale label Oct 23, 2023
@twoertwein twoertwein closed this Dec 26, 2023
@twoertwein twoertwein deleted the abstract2 branch January 17, 2024 02:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Error Reporting Incorrect or improved errors from pandas Refactor Internal refactoring of code Stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Mark methods raising AbstractMethodError as abstractmethods?
4 participants