Add variables setup #388
Conversation
I'm not in a hurry, but do you have any comments on this, especially on what I mention in the "What is known to be wrong or missing" section?
Definitely looks like what we're after. Some comments below. On this part of your PR message:
At the moment, we use the models required_init_vars to decide which variable is initialised by each model, but that is not what required_init_vars represents, but rather what variable needs to be in the data object to be able to initialise the model. So, in reality, they are more like vars_used, a new attribute that needs to be populated for each model. Likewise, the vars_updated might actually be more closely related to what the model is initialising, although maybe not in all cases.
Assuming we're only dealing with the init and update methods:
- required_init_vars are variables that must be either provided through the run configuration or created by a previously initialised model. (I am vaguely wondering here whether there's mileage in having a Variable.initialise(self, data: Data) method that allows a variable initialisation to be importable across modules. That would save code duplication.)
- We probably need required_update_vars - variables that are required for the update method.
- We have vars_updated but need vars_initialised.
virtual_rainforest/core/config.py
Outdated
import virtual_rainforest.core.variables as variables
from virtual_rainforest.core.exceptions import ConfigurationError
You're using this rather than from virtual_rainforest.core.variables import setup_variables to keep the use of the variables namespace clear in the flow? It's functionally identical (right?) - I'm only asking because the style differs from how we've imported other functionality.
It is not identical. The form I used does not need to look up setup_variables inside the module until it is actually called deep in the code. This avoids circular import errors, since by the time anything from variables is really needed, all the dependencies are already loaded.
And it keeps the namespace clean 😃
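A minimal sketch of the difference between the two import styles, under stated assumptions: the two throwaway modules and the try_import helper below are hypothetical and are not part of virtual_rainforest. As far as Python's import system goes, the module body still runs at the import statement in both cases; what the module-style import defers is the lookup of setup_variables, which is what lets it survive a circular import.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def try_import(style: str) -> str:
    """Create two circularly dependent throwaway modules and import them.

    style="module" uses ``import x as variables``; any other value uses
    ``from x import setup_variables``. All module names are hypothetical.
    """
    cfg, var = f"demo_config_{style}", f"demo_variables_{style}"
    if style == "module":
        import_line, call = f"import {var} as variables", "variables.setup_variables()"
    else:
        import_line, call = f"from {var} import setup_variables", "setup_variables()"

    with tempfile.TemporaryDirectory() as tmp:
        # The "config" module imports the "variables" module ...
        Path(tmp, f"{cfg}.py").write_text(
            f"{import_line}\n\ndef run():\n    return {call}\n"
        )
        # ... and the "variables" module imports "config" back before it has
        # defined setup_variables, which closes the cycle.
        Path(tmp, f"{var}.py").write_text(
            f"import {cfg}\n\ndef setup_variables():\n    return 'variables set up'\n"
        )
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            importlib.import_module(var)
            return sys.modules[cfg].run()
        except ImportError as err:
            return f"ImportError: {err}"
        finally:
            # Leave the interpreter as we found it.
            sys.path.remove(tmp)
            sys.modules.pop(cfg, None)
            sys.modules.pop(var, None)


print(try_import("module"))
print(try_import("name"))
```

In the second call the name import fails because setup_variables does not exist yet on the partially initialised module, whereas the first call only needs the module object to exist at import time.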
virtual_rainforest/core/variables.py
Outdated
list(getmembers(variables_submodule))
This looks like a return value but it isn't. Is the process of listing the members triggering the __post_init__ for each Variable class?
Yeah, I could not figure out a more elegant way of loading the contents of a module so the variables are registered. Any suggestion is most welcome.
This has now been ditched.
virtual_rainforest/core/variables.py
Outdated
RUN_VARIABLES_REGISTRY: dict[str, Variable] = {}
"""The global registry of variables used in a run."""

KNOWN_VARIABLES: dict[str, Variable] = {}
"""The global known variable registry."""
I'm not sure I fully get the difference between these two registry objects. KNOWN_VARIABLES is only populated when a module is registered - and that only happens when the configuration of a run requests that module, so the keys of these should be identical within a run?
On the flip-side of that, we need a mechanism that allows the docs and data_variables.toml to be populated and those should contain all known models, not just the models in a specific run. We could just include model.variables.py in the autodoc for each model documentation, but what we really want is a single variables page, so we should have a function in variables that explicitly populates data_variables.toml from everything in the models submodule. We can then use that in sphinx.
Ok, I get it. It should have contained all possible variables, but clearly I put the registration in the wrong place.
I will look into this.
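For readers following the two-registry discussion, here is a rough sketch of how the split could work. Only the registry names come from the snippet under review; the Variable fields, the registration in __post_init__ and the select_run_variables helper are assumptions made for illustration, not the PR's actual code.

```python
from dataclasses import dataclass

# Registry names mirror the snippet under review; everything else is assumed.
KNOWN_VARIABLES: dict[str, "Variable"] = {}
RUN_VARIABLES_REGISTRY: dict[str, "Variable"] = {}


@dataclass
class Variable:
    name: str
    unit: str

    def __post_init__(self) -> None:
        # Registering on creation means that importing a module full of
        # Variable(...) definitions is enough to populate KNOWN_VARIABLES.
        if self.name in KNOWN_VARIABLES:
            raise ValueError(f"Variable {self.name} is already registered")
        KNOWN_VARIABLES[self.name] = self


def select_run_variables(requested: list[str]) -> None:
    """Copy the requested variables from the known registry to the run registry."""
    for name in requested:
        if name not in KNOWN_VARIABLES:
            raise ValueError(f"Variable {name} is not a known variable")
        RUN_VARIABLES_REGISTRY[name] = KNOWN_VARIABLES[name]


# Hypothetical variables: all three are known, only two are used in this run.
Variable("air_temperature", "C")
Variable("soil_moisture", "mm")
Variable("leaf_area_index", "-")

select_run_variables(["air_temperature", "soil_moisture"])
```

With this layout the known registry can feed the documentation for every model, while the run registry only holds what the configured models touch.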
It's really late in the day to be moaning about this, but I find I constantly stumble over the meanings of the model variable attributes. Does anyone else find this - if so could we switch up the names?
…lated by first update
…ented out more 'experimental' axis definitions.
OK - this has basically been me smacking it repeatedly with a stick and swearing at it. I think we should get a logo for the VE that is just a particularly grouchy looking donkey. I don't think I've done anything mad but @vgro and @TaranRallings please have a look at the changes to
I haven't got my head around why |
Good morning, you were busy last night!
This conversation definitely highlights the value of your contribution @dalonsoa, thank you :-)
No problem changing the names - just tell me what you want, and I'll change it :)
That's right. That's the point of the validation process done in the variables system: to avoid having two models initialising the same variables. It was mentioned in the comments that the adiabatic models were also initialising those, but there was no suggestion of a solution (at least not one I can implement without knowing the science). The adiabatic models and the hydrology models are not parallel models (like
A solution would be to make sure that the It is in the update step where it makes more sense to have the hydrology first to get the current soil moisture |
Could we indicate in the config file that
ah, this is what I was looking for. I think we want init in this order: which should look like this:
Also, I just noticed that the infamous can you confirm this @davidorme ? |
Actually, we don't need to worry about the order. If we indicate correctly what variables are initialised by what model and when, then we can use the information stored in the variables registry to come up with the right sequence for the init and the update. We don't need to indicate that in the config file - and we can completely ditch that part. At least when it comes to the initialisation.
Yeah - this is one of the main advantages of the variables system, I think. It can automatically establish if there is a feasible variable setup sequence for any given suite of models and give meaningful errors about why no sequence can be established.
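The idea of deriving the initialisation sequence from the variable declarations can be sketched as a topological sort. This is only an illustration under stated assumptions: the MODELS table, its model and variable names, and the init_order function are hypothetical, and the standard-library graphlib module stands in for whatever the variables system actually uses.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical declarations: the variables each model needs before it can be
# initialised, and the variables its initialisation creates.
MODELS = {
    "plants": {"requires": {"soil_moisture"}, "initialises": {"leaf_area_index"}},
    "hydrology": {"requires": {"air_temperature"}, "initialises": {"soil_moisture"}},
    "climate": {"requires": {"air_temperature_ref"}, "initialises": {"air_temperature"}},
}


def init_order(models: dict, provided: set) -> list:
    """Return a feasible initialisation order or raise a descriptive error."""
    # Map each variable to the single model that initialises it.
    initialiser = {}
    for name, spec in models.items():
        for var in spec["initialises"]:
            if var in initialiser:
                raise ValueError(
                    f"{var} is initialised by both {initialiser[var]} and {name}"
                )
            initialiser[var] = name

    # A model depends on whichever models create the variables it requires;
    # variables supplied as input data impose no ordering constraint.
    graph = {}
    for name, spec in models.items():
        deps = set()
        for var in spec["requires"]:
            if var in provided:
                continue
            if var not in initialiser:
                raise ValueError(f"{name} requires {var}, which nothing provides")
            deps.add(initialiser[var])
        graph[name] = deps

    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as err:
        raise ValueError(f"No feasible initialisation order: {err.args[1]}") from err


print(init_order(MODELS, provided={"air_temperature_ref"}))
```

The same approach would apply to the update step if each model also declared the variables its update requires and produces.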
I think the same should be true for the update? Going into |
I'm not sure there is a right and wrong way here - you could calculate these variables within either model? We can only initialise or populate them in a single model, so we just have to pick one and do it, and that comes down to whichever is the cleanest or most logical code or where the calculations fit better within the broader scientific theory. There are always going to be things on the boundaries between models 😄
@jacobcook1995 gave this the thumbs up, so unless anyone else is wildly bothered, I say we go for those switches. It probably makes sense to update the |
I've a nasty feeling that we'll need to add to the variables system then to track spinup variables and the order of execution of
Do we want to do this in this issue? It's becoming pretty large and I feel it would be best to consolidate what we have before starting to use the new functionality to improve the workflow.
I'm going to make this change in this PR, right now. Let's see if I don't break anything else...
No. Nope. Nix. Hard pass.
☝️ 👆 This. |
Looks good to me! Time to bank?
Yes, please!!! 🙏
yes, go for it 👍
oh you already did, perfect :-) Happy weekend!
Description
Supersedes #371
This PR adds a more fully featured version of the variables infrastructure. It adds two registries: one containing all the known variables based on the modules that are being used in the simulation, and another one for runtime variables that are actually being used by the requested models.
Each variable can be initialised by a single model only, but can be updated (with a warning) and used by more than one model. Which models initialise, update and use each variable is recorded in the relevant attributes of that variable.
The availability of the axes each variable requires is also validated.
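The rule described above (one initialiser, several updaters with a warning, any number of users) could look roughly like the following. The attribute and function names are assumptions chosen for illustration and may not match the PR's code.

```python
import warnings
from dataclasses import dataclass, field


@dataclass
class Variable:
    name: str
    initialised_by: str = ""  # hypothetical attribute names throughout
    updated_by: list[str] = field(default_factory=list)
    used_by: list[str] = field(default_factory=list)


def record_initialiser(var: Variable, model: str) -> None:
    # Only one model may create a given variable.
    if var.initialised_by:
        raise ValueError(
            f"{var.name} is already initialised by {var.initialised_by}, "
            f"so {model} cannot initialise it too"
        )
    var.initialised_by = model


def record_updater(var: Variable, model: str) -> None:
    # Several updaters are allowed, but flagged so the overlap is visible.
    if var.updated_by:
        warnings.warn(f"{var.name} is updated by {var.updated_by} and by {model}")
    var.updated_by.append(model)


def record_user(var: Variable, model: str) -> None:
    # Any number of models may read a variable.
    var.used_by.append(model)
```

A second call to record_initialiser for the same variable raises an error, while a second call to record_updater only emits a warning.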
What is known to be wrong or missing
- At the moment, we use the models' required_init_vars to decide which variable is initialised by each model, but that is not what required_init_vars represents, but rather what variable needs to be in the data object to be able to initialise the model. So, in reality, they are more like vars_used, a new attribute that needs to be populated for each model. Likewise, the vars_updated might actually be more closely related to what the model is initialising, although maybe not in all cases.
- Keeping this confusion aside, all models (including the core ones) will need to define their own variable.py module with the relevant variables. An example for one variable used by the plants model is included to show how that needs to be done. To implement that, it might be easier if each of you create such a variable module for each of your models and open a PR against this one.
- Needless to say, tests and docs need to be implemented. That will be done once we are happy with the implementation. Until all the above is sorted, tests will fail, naturally.
Fixes # (issue) - partly addresses #371
Type of change
Key checklist
pre-commit checks: $ pre-commit run -a
$ poetry run pytest
Further checks