-
Notifications
You must be signed in to change notification settings - Fork 367
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initialize data version control for managing test images #5888
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will let others do the initial approval on this...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the long delay! I only have a few minor comments for now, the dvc setup and instructions seem mostly ok for me, but I'll need to run a proper git clone
of this branch and do some proper tests in the coming days. Disk space on my 10 year old laptop isn't great for big repos, so I'll need to use some university resources 🙂
Co-authored-by: Wei Ji <[email protected]>
No worries on the timing - I really appreciate your willingness to give this a review! |
Would anyone be willing to give this a review shortly? I think it would be great to make a decision about whether to go with dvc before we update the baseline images for the ~25 failing tests. Here are some generally instructions for testing out the changes:
Afterwards, you would need to do some cleanup (this wouldn't be necessary when switching branches if this PR gets merged in):
One note: migrating all the tests files to dvc would initially add ~100 MB to the repository size for people who run GMT tests locally (see #5724 for details) if we don't also delete git history (which I would prefer avoiding). For anyone who doesn't run |
I will give it a go following the instructions - will slack if running into troubles! |
Thanks Paul! |
Was able to successfully do all 7 steps. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haven't found the time to run the full tests, but the documented dvc
steps look ok to me, so won't hold this back any longer. 🚀
@PaulWessel do you approve of merging this in now? If so, I could use dvc for the postscript files in #6199 and I could update the reference images for the failing tests using dvc. |
Yes, I think that would be great. Go ahead. |
Description of proposed changes
This PR implements the solution for storing baseline test images proposed in #5724. Only a file describing the contents of a directory with baseline images are stored in the git repository; the actual PS file will be stored in the DAGsHub repository (e.g., https://dagshub.com/GenericMappingTools/gmt/src/manage-tests-dvc-refactor/test/baseline/api). While by default DAGsHub will store all versions of the images that are pushed to the remote storage, these can be easily garbage collected from either the user's local cache and the remote storage (if we reach the 10 GB limit) without breaking the git history.
These are the specific changes in this PR:
Addresses #5724
Reminders