Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix caching of gt_cache directories on CircleCI #327

Merged
merged 6 commits into from
Sep 16, 2022
Merged

Conversation

mcgibbon
Copy link
Collaborator

@mcgibbon mcgibbon commented Sep 15, 2022

Purpose

This PR fixes an issue where gt4py cache directories were not cached between executions on CircleCI.

Infrastructure changes:

  • Savepoint make targets now run python3 -m gt4py.gt_src_manager install before running tests inside docker, to avoid issues arising from git cloning these sources in a MPI-parallel context

Checklist

Before submitting this PR, please make sure:

  • You have followed the coding standards guidelines established at Code Review Checklist.
  • Docstrings and type hints are added to new and updated routines, as appropriate
  • All relevant documentation has been updated or added (e.g. README, CONTRIBUTING docs)
  • For each public change and fix in pace-util, HISTORY has been updated
  • Unit tests are added or updated for non-stencil code changes

@mcgibbon mcgibbon marked this pull request as ready for review September 15, 2022 23:11
Dockerfile Outdated
@@ -10,9 +10,11 @@ RUN apt-get update && apt-get install -y make \
netcdf-bin \
libnetcdf-dev \
python3 \
python3-pip
python3-pip \
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will revert the changes to this file, they didn't help.

@mcgibbon
Copy link
Collaborator Author

cpu mpi savepoint tests now pass in 5 minutes when fully cached, instead of 1.5h.

@mcgibbon mcgibbon changed the title Run tests in DEV=y mode to save gt cache Fix caching of gt_cache directories on CircleCI Sep 15, 2022
@mcgibbon mcgibbon requested a review from elynnwu September 16, 2022 00:10
@mcgibbon mcgibbon enabled auto-merge (squash) September 16, 2022 00:10
paths:
- .gt_cache
- .gt_cache_000000
Copy link
Collaborator

@elynnwu elynnwu Sep 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought GT_CACHE_ROOT would specify the cache directory?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't set GT_CACHE_ROOT for this plan, though?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, looks like only for running 54 ranks mpi test

@@ -24,6 +24,7 @@ PULL ?=True
DEV ?=y
CHECK_CHANGED_SCRIPT=$(CWD)/changed_from_main.py
CONTAINER_CMD?=docker
SAVEPOINT_SETUP=pip3 list && python3 -m gt4py.gt_src_manager install
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think at some point this was part of the dockerfile? Also, I see that this is called during set up environment in circleci, does that not work?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The complicated case is that on the CircleCI test, we:

  • Cannot run this outside of the container, because the environment is not set up / installed (I could add it just to run this command, but that feels like overkill)
  • Are running the MPI-parallel tests, so if the process itself tries to clone these sources it happens 6 times in parallel, which leads to errors
  • Need to bind-mount our directory into the container so that we can retain the caches, but doing so over-writes the gt4py directory (including gridtools sources) with our directory that does not have these

I could turn this into a ?= and modify it only for the CircleCI tests, if that would be better? Normally this isn't an issue because we have/can get the gridtools sources on our local filesystem copy of gt4py.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, just to double check, if we already had gridtools installed, this wouldn't do anything right?

@mcgibbon
Copy link
Collaborator Author

jenkins fail run at https://jenkins.ginko.ch/job/pace_PR/2852/ relaunching to check if it's persistent or random

@mcgibbon
Copy link
Collaborator Author

launch jenkins

@mcgibbon mcgibbon merged commit 4ac0377 into main Sep 16, 2022
@mcgibbon mcgibbon deleted the fix/circleci_caching branch September 16, 2022 18:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants