ERA5T Support - Cloud Run Job Migration #95

Open
wants to merge 5 commits into
base: main

Collaborator

@DarshanSP19 DarshanSP19 commented Feb 14, 2025

In the current ARCO-ERA5, files are updated from ECMWF on a monthly cadence (on roughly the 9th of each month) with a 3-month delay. This PR adds support for ERA5T, in which files are updated on a daily cadence with a 6-day delay.

The new ARCO-ERA5 workflow is split into three cron jobs:

  1. A daily cron job downloads ERA5T data for AR and CO (model level), fetching the day that is 6 days behind the current day.
  2. A monthly cron job (6th day of every month) downloads ERA5T data for CO (single level), fetching the previous month.
  3. A monthly cron job (9th day of every month) downloads ERA5 data for all datasets, fetching the third previous month.
    • Download the raw data to temp_location.
    • Compare the ERA5 data with the ERA5T data and update it where they differ.
    • Replace the raw ERA5T data downloaded in steps 1 and 2 with this ERA5 data.
  • Move the jobs from Google Compute Engine to Cloud Run jobs.
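For reviewers, the download, compare, and update sub-steps of job 3 could be sketched roughly as below. This is only an illustration, not the code in this PR: it assumes a file-level comparison by SHA-256 checksum on local paths, and the names `file_digest` and `replace_if_different` are hypothetical. The actual implementation may compare at the variable level instead.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file, read in chunks to keep memory use bounded."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replace_if_different(era5_tmp: Path, era5t_raw: Path) -> bool:
    """Overwrite the stored ERA5T file with the ERA5 file when they differ.

    Hypothetical helper: `era5_tmp` is the file downloaded to the temporary
    location, `era5t_raw` is the raw file written earlier by job 1 or 2.
    Returns True when the stored file was created or updated.
    """
    if era5t_raw.exists() and file_digest(era5_tmp) == file_digest(era5t_raw):
        return False  # identical contents, nothing to update
    era5t_raw.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(era5_tmp, era5t_raw)
    return True
```

The boolean return value lets a caller decide whether downstream data derived from that file needs to be rewritten.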

Note: The data is actually available 5 days behind the current day, but we add a 1-day buffer for safety.
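The target period for each of the three jobs follows from simple date arithmetic. The sketch below assumes nothing beyond the delays stated above (6 days, 1 month, 3 months); the function names `daily_target_date` and `months_back` are hypothetical and do not come from the PR.

```python
import datetime
from typing import Tuple


def daily_target_date(today: datetime.date, delay_days: int = 6) -> datetime.date:
    """Day fetched by the daily ERA5T job: 5 days of latency plus a 1-day buffer."""
    return today - datetime.timedelta(days=delay_days)


def months_back(today: datetime.date, n: int) -> Tuple[int, int]:
    """(year, month) lying n calendar months before the month of `today`."""
    # Count months from year 0 so that year boundaries are handled uniformly.
    index = today.year * 12 + (today.month - 1) - n
    return index // 12, index % 12 + 1


# Job 1 on 2025-02-14 targets 2025-02-08.
print(daily_target_date(datetime.date(2025, 2, 14)))
# Job 2 on 2025-01-06 targets December 2024.
print(months_back(datetime.date(2025, 1, 6), 1))
# Job 3 on 2025-02-09 targets November 2024.
print(months_back(datetime.date(2025, 2, 9), 3))
```

Working with a month index rather than subtracting a fixed number of days avoids errors around months of different lengths and around the January rollover.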

Fix: #82

google-cla bot commented Feb 14, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@DarshanSP19 DarshanSP19 force-pushed the era5t-support branch 8 times, most recently from de52a0d to f200bb9 Compare February 17, 2025 07:02
@DarshanSP19 DarshanSP19 self-assigned this Feb 19, 2025
ENV PATH /opt/conda/envs/${CONDA_ENV_NAME}/bin:$PATH
RUN pip install -e .

ARG arco_era5_git_rev=era5t-support
Collaborator

Update the branch name before merging this PR.

@DarshanSP19 DarshanSP19 marked this pull request as ready for review February 24, 2025 06:28
Collaborator

@dabhicusp dabhicusp left a comment

LGTM!

@DarshanSP19 DarshanSP19 requested a review from shoyer February 24, 2025 06:34
@@ -15,8 +15,9 @@ name: era5
 channels:
 - conda-forge
 dependencies:
-- python=3.9.17
+- python=3.8.19
Collaborator

Using Python 3.8 is concerning to me. Python 3.8 is very old, already at end of life. This means it no longer receives security updates: https://devguide.python.org/versions/

Even Python 3.9 will soon be at end of life. We should strive to use newer versions of Python which are actively maintained.

Collaborator Author

Agreed, @shoyer. The plan is to remove this environment.yml and the Docker setup from the repository entirely, since most of the code already works without them on Dataflow. We will also upgrade to a newer Python version and the latest Zarr version (v3) in the future.

Development

Successfully merging this pull request may close these issues.

Switch ARCO-ERA5 to update daily with a ~5 day delay
3 participants