QST: Is there a work-around for repeated FutureWarnings? #50603

Open
2 tasks done
stev-0 opened this issue Jan 6, 2023 · 8 comments
Labels
Usage Question Warnings Warnings that appear or should be added to pandas

Comments

@stev-0

stev-0 commented Jan 6, 2023

Research

  • I have searched the [pandas] tag on StackOverflow for similar questions.

  • I have asked my usage related question on StackOverflow.

Link to question on StackOverflow

https://stackoverflow.com/questions/75029437/pandas-get-single-warning-message-per-line

Question about pandas

Are repeated warnings expected behaviour when calling the same code multiple times? This is probably more likely to be a Python bug, but so far I haven't discovered how pandas triggers it or whether there is any work-around.

pandas 1.5.2, Python 3.10

import pandas as pd
import warnings

warnings.simplefilter(action='default', category=FutureWarning)

d = {'col1': [1, 2], 'col2': [3, 4], 'col3': [5, 6]}
df = pd.DataFrame(data=d)
a = ['col1','col2']
df[a]

b = set(a)
for i in range(3):
  df[b]

Output of python3 -Wd::FutureWarning test.py

/home/stephen/test.py:16: FutureWarning: Passing a set as an indexer is deprecated and will raise in a future version. Use a list instead.
df[b]
/home/stephen/test.py:16: FutureWarning: Passing a set as an indexer is deprecated and will raise in a future version. Use a list instead.
df[b]
/home/stephen/test.py:16: FutureWarning: Passing a set as an indexer is deprecated and will raise in a future version. Use a list instead.
df[b]

@stev-0 stev-0 added Needs Triage Issue that has not been reviewed by a pandas team member Usage Question labels Jan 6, 2023
@phofl
Member

phofl commented Jan 6, 2023

Hi, thanks for your report. This is expected. I would suggest trying to fix the deprecated usage. We will release 2.0 soon, and this will raise then.

@stev-0
Author

stev-0 commented Jan 6, 2023

Hi, yes, I understand the warning needs to be heeded. My question was more about why there are 3 copies of it, as the warnings module is supposed to suppress multiple warnings emanating from the same line of code by default.

Compare the output of:

import warnings

warnings.simplefilter(action='default')

for i in range(3):
  warnings.warn('Warn once')

and

import warnings

warnings.simplefilter(action='always')

for i in range(3):
  warnings.warn('Warn once')

and you'll see what I mean
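The comparison above can also be run as one script by recording the warnings instead of printing them. This is only a sketch of the standard-library behaviour; the helper name `count_emitted` is hypothetical and not part of pandas or the `warnings` module.

```python
import warnings


def count_emitted(action, repeats=3):
    """Return how many warnings get through the filter for the given action."""
    # record=True collects emitted warnings in a list instead of printing them.
    with warnings.catch_warnings(record=True) as record:
        warnings.simplefilter(action)
        for _ in range(repeats):
            # Same message, same module, same line number on every iteration.
            warnings.warn("Warn once")
    return len(record)


default_count = count_emitted("default")  # expected: one warning per location
always_count = count_emitted("always")    # expected: one warning per call
print(default_count, always_count)
```

With the "default" action the repeated calls from one line should collapse into a single recorded warning, while "always" should record every call.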

@rhshadrach
Member

rhshadrach commented Jan 7, 2023

You could always surround the code with the following (or, better, use a context manager):

warnings.simplefilter(action='ignore', category=FutureWarning)
...
warnings.simplefilter(action='default', category=FutureWarning)

Do you still see performance degradation doing this?
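A minimal sketch of the context-manager variant mentioned above, using only the standard library. The function `deprecated_call` is a hypothetical stand-in for the pandas call that warns, and the outer recording block exists only to show what does and does not get through.

```python
import warnings


def deprecated_call():
    # Hypothetical stand-in for code that raises a FutureWarning.
    warnings.warn("Passing a set as an indexer is deprecated", FutureWarning)
    return "result"


# The outer block records warnings so the effect of the inner block is visible.
with warnings.catch_warnings(record=True) as seen:
    warnings.simplefilter("always")

    # Filters changed inside this block are restored when it exits.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=FutureWarning)
        value = deprecated_call()
    suppressed = len(seen)

    # Outside the inner block the warning is visible again.
    deprecated_call()
    after = len(seen)
```

Compared with toggling `simplefilter` by hand, the context manager restores the previous filters even if the wrapped code raises.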

@stev-0
Author

stev-0 commented Jan 9, 2023

I could, but that would give me no warnings. I want exactly one per call in my code that calls the relevant bits of pandas with context that needs a warning.

@rhshadrach
Member

Understood, but I'm more interested in whether the performance degradation is fixed by this. Then we can improve upon it, e.g.

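import re
import warnings

# Record every warning raised by the wrapped code instead of printing it.
with warnings.catch_warnings(record=True) as record:
    [your code]
if len(record) > 0:
    saw_warning = False
    match = re.compile("string to match warning")
    for warning in record:
        if re.match(match, str(warning.message)):
            # Remember that the matching warning occurred; it is re-emitted once below.
            saw_warning = True
        else:
            # Pass unrelated warnings through unchanged.
            warnings.warn_explicit(
                message=warning.message,
                category=warning.category,
                filename=warning.filename,
                lineno=warning.lineno,
            )
    if saw_warning:
        # Re-emit the matching warning a single time.
        warnings.warn_explicit(
            ...
        )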
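The record-and-re-emit idea above could be packaged as a reusable context manager. The following is a sketch under assumptions, not something pandas provides: the name `dedupe_warnings` is hypothetical, duplicates are identified by message text, category, filename and line number, and `noisy` stands in for the code that warns repeatedly.

```python
import warnings
from contextlib import contextmanager


@contextmanager
def dedupe_warnings():
    """Record warnings raised in the block, then re-emit each distinct one once."""
    with warnings.catch_warnings(record=True) as record:
        warnings.simplefilter("always")  # capture every occurrence
        yield
    seen = set()
    for w in record:
        key = (str(w.message), w.category, w.filename, w.lineno)
        if key in seen:
            continue  # drop repeats of the same warning from the same location
        seen.add(key)
        warnings.warn_explicit(
            message=w.message,
            category=w.category,
            filename=w.filename,
            lineno=w.lineno,
        )


def noisy():
    # Hypothetical stand-in for a loop that triggers the same warning repeatedly.
    for _ in range(3):
        warnings.warn("Passing a set as an indexer is deprecated", FutureWarning)


# The outer recorder is only here to count what reaches the caller.
with warnings.catch_warnings(record=True) as outer:
    warnings.simplefilter("always")
    with dedupe_warnings():
        noisy()
emitted = len(outer)
```

This sketch re-emits only after the block finishes and does nothing special if the block raises, so it trades immediacy for a single warning per location.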

@stev-0
Author

stev-0 commented Jan 11, 2023

Yes, in my case the performance degradation would be fixed by emitting 0 (or 1) warnings per called line.

I thought that Python's warnings module already had this functionality (it can cache what has already been warned about and reject warnings that match the same message and code location), but it doesn't seem to be working in this case.

From https://docs.python.org/3/library/warnings.html:

There are two stages in warning control: first, each time a warning is issued, a determination is made whether a message should be issued or not; next, if a message is to be issued, it is formatted and printed using a user-settable hook.

and

action is one of the following strings:
"default": print the first occurrence of matching warnings for each location (module + line number) where the warning is issued

Just wondering whether it's the fact that you haven't given a registry argument in your example above or in the call to warnings.warn_explicit(...). The docs state that if you use warn_explicit you need to provide the registry:
https://docs.python.org/3/library/warnings.html#warnings.warn_explicit

if no registry is passed, the warning is never suppressed
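As a minimal sketch of the registry behaviour described above (standard library only, and not a claim about how pandas calls `warn_explicit`), the same warning can be issued with and without a shared registry dictionary. The helper name `emit_explicit` and the filename `example.py` are hypothetical.

```python
import warnings


def emit_explicit(registry, repeats=3):
    """Issue the same explicit warning several times and count what is emitted."""
    with warnings.catch_warnings(record=True) as record:
        warnings.simplefilter("default")
        for _ in range(repeats):
            warnings.warn_explicit(
                message="Passing a set as an indexer is deprecated",
                category=FutureWarning,
                filename="example.py",  # hypothetical location
                lineno=1,
                registry=registry,
            )
    return len(record)


# Without a registry there is nowhere to remember earlier occurrences.
without_registry = emit_explicit(None)
# A shared dict lets the "default" action suppress repeats from one location.
with_registry = emit_explicit({})
print(without_registry, with_registry)
```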

@rhshadrach
Member

rhshadrach commented Jan 12, 2023

@stev-0 I tried supplying registry=__warningregistry__ in all cases where we use warn_explicit but the issue persists. It appears to me you found the root cause with the Python bug (linked in the OP) already.

Still - I'll put up a PR to add this to the places where we use warn_explicit; thanks for finding this!

@rhshadrach
Member

Actually - I'm finding that __warningregistry__ doesn't exist in e.g. _exceptions.py, but it does exist inside main.py when I run a script with python main.py. I'm finding almost no documentation on this, so I will have to dig deeper.

But - because the OP example with this doesn't fail outright, it's not being hit anyways.

@lithomas1 lithomas1 added Warnings Warnings that appear or should be added to pandas and removed Needs Triage Issue that has not been reviewed by a pandas team member labels May 20, 2023