-
Notifications
You must be signed in to change notification settings - Fork 95
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor: decythonize configuration space #321
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there. I just started having a look into this. Thanks a lot for cleaning the code! I was just wondering whether we should touch networkx at all since these are verbatim copies from the original code, and we don't intend to change anything in these files any more.
The only thing touched was formatting. I don't think it matters much except not formatting it just means we need to add some linting rules to ignore it. I don't really mind either way. Let me know your decision :) |
Then let's format them to make our workflow files easier. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very partial review number one.
@@ -153,7 +155,8 @@ def Float( | |||
log=log, | |||
meta=meta, | |||
) | |||
elif isinstance(distribution, Normal): | |||
|
|||
if isinstance(distribution, Normal): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this considered more readable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's certainly a possible point of debate, I just went with a default of ruff
which say it's unnecessary. There are cases where it is more readable in my opinion but here I have no real strong say.
Unreviewable - Just the summary to read
This PR essentially converts
configuration_space.pyx
andutil.pyx
into their equivalent.py
files. This was done at no cost according to our benchmark (see bottom). This lets Language Servers, the "smarts" behind most editors actually be able to auto-complete, jump to definition and show docs etc...A benefit of making the code as much python as possible is for an easier time for contributions, either ourselves or others. Tooling is getting smarter and better but it does not interop with cython.
Part of this also ended up just factoring and cleaning up
configuration_space.py
into two seperate files, namelyconfiguration_space.py
andconfiguration.py
. I went through and clean up a lot of old school string formatting, turned the errors into reusable rather than copied strings and just some small non-functional changes cleanup.While going through this code, I noticed a lot of redundancy with respect to methods available and behaviours. For example, take a look at #265.
As both
ConfigurationSpace
andConfiguration
act as mappings fromstr -> Hyperparameter
andstr -> Value
respectively, interfacing with them as mappings should be the prefered way to do so.Mapping
s are by their interface, non-mutable and so this should remove unintended side effects while keeping behaviour similar to objects that people are familiar with.Beyond this point was then fixing up errors due to
mypy
andflake8
. I updated the stack to my preffered trioblack
,ruff
,mypy
, which revealed some subtle bugs in the code (e.g. non-existing variables with edge-case loops) and some slight mis-typings. Since users can now utilize the types, I went through and fixed these.Lastly, I enabled dependabot and @Neonkraft, you've been selected as tribute to hit merge on it's miny PR's to update the github workflows :)
Moving forward, I'm really not sure these benchmarks are enough to tackle any work on the actual de-cythonizing of the hyperparameters. In an ideal world, we simply move the expensive hot-loop of sampling to cython and have the rest of the class be normal ass python, but I'm not sure if we'll notice any degradation doing so.
Loose Benchmark dump