Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/update example data code #339

Merged
merged 40 commits into from
Jan 10, 2024
Merged

Conversation

vgro
Copy link
Collaborator

@vgro vgro commented Oct 25, 2023

This PR moves the recipes for creating dummy data from the documentation to the example_data folder as code that can be run directly to reproduce the data.
I also added a page that describes the example data; at the moment this is a bit cryptic because I wasn't sure what we want this to look like. I feel like there should be some info about the assumptions available in the user documentation. Please give feedback and suggestions on what this could look like.
Finally, I updated the climate data related documentation as a more general resource with lots of links to the code.

Type of change

  • New feature (non-breaking change which adds functionality)
  • Optimization (back-end change that speeds up the code)
  • Bug fix (non-breaking change which fixes an issue)

Key checklist

  • Make sure you've run the pre-commit checks: $ pre-commit run -a
  • All tests pass: $ poetry run pytest

Further checks

  • Code is commented, particularly in hard-to-understand areas
  • Tests added that prove fix is effective or that feature works

@vgro vgro marked this pull request as draft October 25, 2023 07:57
@vgro vgro requested a review from TaranRallings October 25, 2023 07:58
Copy link
Collaborator

@jacobcook1995 jacobcook1995 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks sensible to me! In terms of the soil and litter dummy data I don't think it makes sense to talk about them much in the user documentation as they really are just random values to test that the model actually runs sensibly. Obviously, this would change if I e.g. started taking data from soil maps for the example data.

@vgro
Copy link
Collaborator Author

vgro commented Oct 26, 2023

This looks sensible to me! In terms of the soil and litter dummy data I don't think it makes sense to talk about them much in the user documentation as they really are just random values to test that the model actually runs sensibly. Obviously, this would change if I e.g. started taking data from soil maps for the example data.

Ok, that makes sense. Maybe I just make it a bit more general and easy to read :-)

@codecov-commenter
Copy link

codecov-commenter commented Dec 1, 2023

Codecov Report

Attention: 62 lines in your changes are missing coverage. Please review.

Comparison is base (d02b93b) 95.94% compared to head (db62621) 93.50%.

Files Patch % Lines
...le_data/generation_scripts/climate_example_data.py 0.00% 19 Missing ⚠️
...ple_data/generation_scripts/litter_example_data.py 0.00% 12 Missing ⚠️
...inforest/example_data/generation_scripts/common.py 0.00% 11 Missing ⚠️
..._data/generation_scripts/elevation_example_data.py 0.00% 7 Missing ⚠️
...ple_data/generation_scripts/runoff_example_data.py 0.00% 7 Missing ⚠️
...ample_data/generation_scripts/soil_example_data.py 0.00% 4 Missing ⚠️
...mple_data/generation_scripts/plant_example_data.py 0.00% 2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #339      +/-   ##
===========================================
- Coverage    95.94%   93.50%   -2.45%     
===========================================
  Files           52       59       +7     
  Lines         2614     2954     +340     
===========================================
+ Hits          2508     2762     +254     
- Misses         106      192      +86     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Collaborator

@TaranRallings TaranRallings left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. The animal "data" is sort of an oddball right now. It'll make more sense to put a lot of the detail in docs when things are further along.

@davidorme
Copy link
Collaborator

I've pushed a bunch of changes here to add some structure to the directory and to synchronize some common elements. It fails because I haven't updated the test structure yet. I'm waiting on plants being merged first but then will sort out fixing the tests and documenting the example use.

@davidorme davidorme marked this pull request as ready for review December 13, 2023 11:03
@davidorme
Copy link
Collaborator

I think this is now ready for review and in a state to merge if everyone is happy. It's probably useful to start by reading
docs/source/virtual_rainforest/example_data.md since that describes the changes in some detail.

  • I've moved from dummy to example throughout - none of the inputs are dummy for missing models anymore.
  • I've organised the files within the folder into generation_scripts, data and config subdirectories to make it clearer what all the components are.
  • I've updated the example description to include tables of variables for each component and added expandable links to the generation scripts and config TOML.

I've also expanded the config file path resolution a bit to cope with different path combinations. There is some minor Config method renaming here: we now collect config TOML files (not resolve them) and we resolve config file paths (not fix up). @alexdewar - these are changes to the system you put in place. I think it all makes sense and resolves as I expect, but it is broken on Windows + Python 3.9 for reasons I can't fathom.

@vgro - I can't seem to add you to review the PR. I don't know how GH handles multi-author PRs!

Copy link
Collaborator

@jacobcook1995 jacobcook1995 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just had a few minor comments.


This code creates a set of plausible values for which the
{mod}`~virtual_rainforest.models.soil.soil_model` absolutely has to function sensibly
for.Descriptions of the soil pools can be found [here](./soil/soil_details.md).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

space needed between "for." and "Descriptions

# {"file": "file.txt", "var_name": "my_path"},
# {"file": "file.txt", "var_name": "my_path"},
# id="variable_not_list",
# ),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess either comment these back in or delete them

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This reminds me of the problem we had with multiple config files (one from each model) has this been resolved?

Copy link
Collaborator

@jacobcook1995 jacobcook1995 Dec 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope that issue (#234) hasn't been resolved, multiple config files are completely fine, but all of the instructions of where to load data from have to be contained in a single config file

from virtual_rainforest.example_data.generation_scripts.common import cell_displacements

# # Load DEM in 30m resolution
# original_data = requests.get(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happened here, why did you outcomment this part?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seemed like a lot of work to download and preprocess a fairly large file to generate 81 numbers, particularly when those numbers aren't currently meaningfully aligned to reality 😄

At some point, this will become relevant again, but to make it reproducible and easy to re-run here and now, it seemed easier to retire the recipe and just provide the values.

@vgro
Copy link
Collaborator Author

vgro commented Dec 14, 2023

@vgro - I can't seem to add you to review the PR. I don't know how GH handles multi-author PRs!

This might be because I create the PR. I added a few minor comments but it looks overall good to me

@vgro vgro merged commit dd41f74 into develop Jan 10, 2024
16 checks passed
@vgro vgro deleted the feature/update_example_data_code branch January 10, 2024 14:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants