Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove error check in ConfigParser #1009

Merged
merged 1 commit into from
Oct 3, 2024

Conversation

ricardoV94
Copy link
Member

I can't quite tell why this check was added, but it is causing pymc_experimental tests to fail here: https://github.com/pymc-devs/pymc-experimental/actions/runs/11140715111/job/30960085934?pr=300#step:5:1176

which as best as I could track happens when pytest import pymc (multiple times) and this function is executed: https://github.com/pymc-devs/pymc/blob/45069a966d3e123bf98efbc4f28008e31b243df8/pymc/__init__.py#L27-L46

I don't know why PyMC has to add those specific flags, but it seems like the behavior should be supported? I also don't know what pytest is doing to trigger this error.

@ricardoV94
Copy link
Member Author

CI is failing due to micromamba hanging, also seen in other PRs

@ricardoV94 ricardoV94 force-pushed the pytensor_config_bug branch from 9c68207 to aea88ec Compare October 2, 2024 11:43
@michaelosthege
Copy link
Member

This is indeed quite strange. My best guess is this:

add_basic_configvars() executes twice, which should not happen if the module has already been imported.

But there's something about _gcd_import in the logs, and it like garbage-collected import?

Maybe (for some reason) the import pytensor got GCed and then executed again alredy during test discovery?

I'm not 100 % on this, but my feeling is that the error check did it's job and we should instead make sure that this entire block only executes exactly once, even if the import got GCed.

@ricardoV94
Copy link
Member Author

ricardoV94 commented Oct 2, 2024

Two questions:

  1. How do we handle GC?
  2. Is it really a problem to try and re-define a variable that already exists in another instance? Why?

@michaelosthege
Copy link
Member

How do we handle GC?

We never cared about GC, and under normal circumstances the import executes just once. No idea what about that pmx PR made a difference.

Is it really a problem to try and re-define a variable that already exists in another instance? Why?

We attach config settings as properties on the PyTensorConfigParser:

# the ConfigParam implements __get__/__set__, enabling us to create a property:
setattr(self.__class__, name, configparam)

This can happen at any point, and from any module. By overwriting an existing parameter property, one will likely break other pieces of code that rely on the first version.

You could argue that PyTensorConfigParser has too many features, but for the advertised functionality the check makes a lot of sense.

@ricardoV94
Copy link
Member Author

ricardoV94 commented Oct 2, 2024

I don't think it's related to the PR, but how pytest (perhaps newer version) runs. I could reproduce locally by just running pytest on the tests folder.

I'll double check tomorrow

@ricardoV94
Copy link
Member Author

one will likely break other pieces of code that rely on the first version.

Isn't this just a glorified global state? There shouldn't be multiple versions other than within the temporary context manager of change flags

@michaelosthege
Copy link
Member

Isn't this just a glorified global state?

Absolutely

There shouldn't be multiple versions other than within the temporary context manager of change flags

What about all those command line flags then? They are the entry point to the global state aka the currently running process.

@ricardoV94
Copy link
Member Author

Which command flags are you thinking about? I've only ever used env variables or pytensorrc

@michaelosthege
Copy link
Member

Yes I mean those. FLOATX, COMPILE_MODE and so on

@ricardoV94
Copy link
Member Author

I still don't get what problem this is trying to avoid. Can you give an example?

@michaelosthege
Copy link
Member

config.add(
    "timeout",
    """Timeout in seconds for compilation""",
    IntParam(500, _is_gt_0),
    in_c_key=False,
)

# somewhere else, a config parameter with the same name is added
# the check will catch this and raise the error that the parameter already exists
config.add(
    "timeout",
    """Whether graph rewrites can hit a timeout.""",
    BoolParam(False),
    in_c_key=False,
)

@ricardoV94
Copy link
Member Author

ricardoV94 commented Oct 3, 2024

config.add(
    "timeout",
    """Timeout in seconds for compilation""",
    IntParam(500, _is_gt_0),
    in_c_key=False,
)

# somewhere else, a config parameter with the same name is added
# the check will catch this and raise the error that the parameter already exists
config.add(
    "timeout",
    """Whether graph rewrites can hit a timeout.""",
    BoolParam(False),
    in_c_key=False,
)

I'm not worried about that, nobody is adding config keys besides us. Since theano didn't need this check in over a decade either, can we conclude it's a mere theoretical concern not a practical/likely one?

Copy link
Member

@michaelosthege michaelosthege left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's hope for the best then

@ricardoV94 ricardoV94 force-pushed the pytensor_config_bug branch from aea88ec to b7beb77 Compare October 3, 2024 09:39
@ricardoV94 ricardoV94 merged commit 222cd4a into pymc-devs:main Oct 3, 2024
58 checks passed
Copy link

codecov bot commented Oct 3, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 81.74%. Comparing base (8a6e407) to head (b7beb77).
Report is 115 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1009      +/-   ##
==========================================
- Coverage   81.74%   81.74%   -0.01%     
==========================================
  Files         183      183              
  Lines       47733    47731       -2     
  Branches    11616    11615       -1     
==========================================
- Hits        39020    39018       -2     
- Misses       6518     6520       +2     
+ Partials     2195     2193       -2     
Files with missing lines Coverage Δ
pytensor/configparser.py 92.53% <ø> (-0.05%) ⬇️

... and 2 files with indirect coverage changes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants