WIP: Add cloud testing + refactoring nominal x y replacement #35

Merged
merged 29 commits on May 4, 2020

Changes from all commits
2b9f7af
added first cloud test
jbusecke Apr 5, 2020
fb3f096
added gcsfs and zarr to ci envs
jbusecke Apr 5, 2020
7d0e710
skip cloud test for one env
jbusecke Apr 5, 2020
6e63ae7
added circleci
jbusecke Apr 5, 2020
d2c17d1
changes to the circleci config
jbusecke Apr 5, 2020
fab7fb4
revert back to ncar container build
jbusecke Apr 5, 2020
3c0c56c
added CircleCI badge
jbusecke Apr 5, 2020
eb396e7
conda init added
jbusecke Apr 5, 2020
f58d227
ignore cloud tests on travis
jbusecke Apr 5, 2020
60b4d1f
use circleci only for cloud tests
jbusecke Apr 5, 2020
1f9eab7
increase circleci time limit
jbusecke Apr 5, 2020
7e80f9d
fully install miniconda from scratch on circleci
jbusecke Apr 5, 2020
d9b5954
last try with circleci
jbusecke Apr 6, 2020
620772e
moving to other local machine for dev
jbusecke Apr 6, 2020
c2f846f
local test pass, monotonic problem fixed for MRI-HR
jbusecke Apr 6, 2020
f8260c6
local test passing for cloud and unit tests
jbusecke Apr 6, 2020
015507b
some more cloud test refinements
jbusecke Apr 6, 2020
9d56612
all cloud tests pass+added missing value removal
jbusecke Apr 6, 2020
733bad6
Merge branch 'master' into refactor_nominal_xy
Apr 6, 2020
e74deca
fixed minor bug in tests + increased time allowance for circle ci
jbusecke Apr 6, 2020
6299ead
added psutil to upstream dev env
jbusecke Apr 6, 2020
6efc4f3
added numpy to dev env
jbusecke Apr 6, 2020
03063eb
some more changes
jbusecke Apr 7, 2020
84690c3
added pytest-xdist to try and speed up tests
jbusecke Apr 28, 2020
7b8bf87
fixed dependency to pip in dev env and restricted processes to 4 for …
jbusecke Apr 28, 2020
077c1ab
Still trying to fix the upstream env and reduced the number of worker…
jbusecke Apr 28, 2020
cecf474
some more tweaks for faster tests
jbusecke Apr 28, 2020
32fd369
removed the cloud tests from the CI for now, added whats new breaking…
jbusecke Apr 29, 2020
a8e0a84
more basic tests
jbusecke May 4, 2020
87 changes: 87 additions & 0 deletions .circleci/config.yml
@@ -0,0 +1,87 @@
version: 2
# Tell CircleCI to use this workflow
workflows:
version: 2
nightly:
triggers:
- schedule:
cron: "0 0 * * *"
filters:
branches:
only:
- master
jobs:
- "upstream-dev"
- "python-3.6"
- "python-3.7"
- "python-3.8"
default:
jobs:
- "upstream-dev"
- "python-3.6"
- "python-3.7"
- "python-3.8"

shared: &shared
steps:
- checkout
# - restore_cache:
# key: deps-{{ checksum "./ci/environment.yml" }}
- run:
name: Install dependencies in the base environment
command: |
conda env update -n base -f ${ENV_FILE}
pip install -e .
- run:
name: List packages in the current environment (base)
command: |
conda list
- run:
name: Running Tests
command: |
pytest --junitxml=test-reports/junit.xml --cov=./ --verbose --ignore=cmip6_preprocessing/tests/test_preprocessing_cloud.py
no_output_timeout: 30m
- run:
name: Uploading code coverage report
command: |
codecov
- store_test_results:
path: test-reports

- store_artifacts:
path: test-reports

# - save_cache:
# key: deps-{{ checksum "./ci/environment.yml" }}
# paths:
# - "/opt/conda/pkgs"


jobs:
"upstream-dev":
<<: *shared
docker:
- image: ncarxdev/miniconda:3.7
environment:
ENV_FILE: "./ci/environment-py37-upstream-master.yml"

"python-3.6":
<<: *shared
docker:
- image: ncarxdev/miniconda:3.6
environment:
ENV_FILE: "./ci/environment-py36.yml"

"python-3.7":
<<: *shared
docker:
- image: ncarxdev/miniconda:3.7
environment:
ENV_FILE: "./ci/environment-py37.yml"

"python-3.8":
<<: *shared
docker:
- image: ncarxdev/miniconda:3.8
environment:
ENV_FILE: "./ci/environment-py38.yml"
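The config above defines the test steps once under a YAML anchor (`shared: &shared`) and pulls them into every job via the merge key (`<<: *shared`). A minimal sketch of that mechanic, assuming PyYAML is available (the keys and values below are toy stand-ins, not the real config):

```python
# Minimal sketch of YAML anchor/merge-key behavior, assuming PyYAML.
import yaml

doc = """
shared: &shared
  steps: [checkout, run-tests]

jobs:
  python-3.8:
    <<: *shared
    docker:
      - image: ncarxdev/miniconda:3.8
"""
cfg = yaml.safe_load(doc)
# `<<: *shared` copies the anchored mapping into the job, so every job
# gets the same `steps` without repeating them.
assert cfg["jobs"]["python-3.8"]["steps"] == ["checkout", "run-tests"]
```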
2 changes: 2 additions & 0 deletions .gitignore
@@ -62,3 +62,5 @@ target/
.python-version

.ipynb_checkpoints

**/dask-worker-space/
3 changes: 1 addition & 2 deletions .travis.yml
@@ -31,8 +31,7 @@ install:
- python -c "import cmip6_preprocessing; print(cmip6_preprocessing.__version__)"
- pip install -e .
script:
- py.test cmip6_preprocessing -v --cov=cmip6_preprocessing --cov-config .coveragerc --cov-report term-missing
#- pytest -v --color=yes --cov=cmip6_preprocessing tests
- py.test cmip6_preprocessing -v --ignore=cmip6_preprocessing/tests/test_preprocessing_cloud.py --cov=cmip6_preprocessing --cov-config .coveragerc --cov-report term-missing

after_success:
- conda install -c conda-forge codecov
2 changes: 1 addition & 1 deletion README.md
@@ -2,6 +2,7 @@
[![conda-forge](https://img.shields.io/conda/dn/conda-forge/cmip6_preprocessing?label=conda-forge)](https://anaconda.org/conda-forge/cmip6_preprocessing)
[![Pypi](https://img.shields.io/pypi/v/cmip6_preprocessing.svg)](https://pypi.org/project/cmip6_preprocessing)
[![Build Status](https://travis-ci.com/jbusecke/cmip6_preprocessing.svg?branch=master)](https://travis-ci.com/jbusecke/cmip6_preprocessing)
[![Build Status](https://img.shields.io/circleci/project/github/jbusecke/cmip6_preprocessing/master.svg?)](https://circleci.com/gh/jbusecke/cmip6_preprocessing/tree/master)
[![codecov](https://codecov.io/gh/jbusecke/cmip6_preprocessing/branch/master/graph/badge.svg)](https://codecov.io/gh/jbusecke/cmip6_preprocessing)
[![License:MIT](https://img.shields.io/badge/License-MIT-lightgray.svg?style=flt-square)](https://opensource.org/licenses/MIT)
[![DOI](https://zenodo.org/badge/215606850.svg)](https://zenodo.org/badge/latestdoi/215606850)
@@ -36,4 +37,3 @@ Install `cmip6_preprocessing` via pip:
or conda:

`conda install -c conda-forge cmip6_preprocessing`

4 changes: 3 additions & 1 deletion ci/environment-py36.yml
@@ -10,7 +10,9 @@ dependencies:
- xgcm
- pyproj
- matplotlib
- pip
- pip:
- codecov
- pytest-cov
- black
- pytest-xdist
- black
4 changes: 3 additions & 1 deletion ci/environment-py37-upstream-master.yml
@@ -12,10 +12,12 @@ dependencies:
- codecov
- pytest-cov
- black
- numpy
- pip
- pip:
- pytest-xdist
- git+https://github.com/mathause/regionmask.git
- git+https://github.com/pydata/xarray.git
- git+https://github.com/numpy/numpy.git
- git+https://github.com/pandas-dev/pandas.git
- git+https://github.com/NCAR/intake-esm.git
- git+https://github.com/xgcm/xgcm.git
3 changes: 3 additions & 0 deletions ci/environment-py37.yml
@@ -7,10 +7,13 @@ dependencies:
- numpy
- pandas
- intake-esm
- gcsfs
- zarr
- xgcm
- pyproj
- matplotlib
- regionmask # this will fail until the current version is released on conda
- black
- pytest-cov
- pytest-xdist
- codecov
3 changes: 3 additions & 0 deletions ci/environment-py38.yml
@@ -7,10 +7,13 @@ dependencies:
- numpy
- pandas
- intake-esm
- gcsfs
- zarr
- xgcm
- pyproj
- matplotlib
- regionmask # this will fail until the current version is released on conda
- black
- pytest-cov
- pytest-xdist
- codecov
111 changes: 81 additions & 30 deletions cmip6_preprocessing/preprocessing.py
@@ -673,6 +673,20 @@ def cmip6_renaming_dict():
"vertex": "vertices",
"time_bounds": "time_bnds",
},
"TaiESM1": { # this is a guess.
"x": ["i", "lon"],
"y": ["j", "lat"],
"lon": "longitude",
"lat": "latitude",
# "lev": "lev", # no 3d data available as of now
# "lev_bounds": "lev_bnds",
# "lon_bounds": "vertices_longitude",
# "lat_bounds": "vertices_latitude",
# "lon_bounds": "vertices_longitude",
# "lat_bounds": "vertices_latitude",
"vertex": "vertices",
"time_bounds": "time_bnds",
},
}
# cast all str into lists
for model in dim_name_dict.keys():
@@ -779,21 +793,70 @@ def promote_empty_dims(ds):
return ds


# some of the models do not have 2d lon lats, correct that.
def broadcast_lonlat(ds, verbose=True):
"""Some models (all `gr` grid_labels) have 1D lon lat arrays
This functions broadcasts those so lon/lat are always 2d arrays."""
if "lon" not in ds.variables:
ds.coords["lon"] = ds["x"]
if "lat" not in ds.variables:
ds.coords["lat"] = ds["y"]

if len(ds["lon"].dims) < 2:
ds.coords["lon"] = ds["lon"] * xr.ones_like(ds["lat"])
if len(ds["lat"].dims) < 2:
ds.coords["lat"] = xr.ones_like(ds["lon"]) * ds["lat"]
return ds
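
To illustrate what `broadcast_lonlat` does, here is a small synthetic example (the dataset and values are made up, not taken from CMIP6):

```python
# Synthetic example for broadcast_lonlat: a dataset with 1D lon/lat
# coordinates comes back with 2D lon/lat arrays.
import numpy as np
import xarray as xr
from cmip6_preprocessing.preprocessing import broadcast_lonlat

ds = xr.Dataset(
    {"tos": (("y", "x"), np.zeros((3, 4)))},
    coords={
        "lon": ("x", np.linspace(45.0, 315.0, 4)),
        "lat": ("y", np.array([-60.0, 0.0, 60.0])),
    },
)
ds = broadcast_lonlat(ds)
assert ds.lon.ndim == 2 and ds.lat.ndim == 2  # both now span (x, y)
```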


def replace_x_y_nominal_lat_lon(ds):
"""Approximate the dimensional values of x and y with mean lat and lon at the equator"""
ds = ds.copy()

def maybe_fix_non_unique(data, pad=False):
"""remove duplicate values by linear interpolation
if values are non-unique. `pad` if the last two points are the same
pad with -90 or 90. This is only applicable to lat values"""
if len(data) == len(np.unique(data)):
return data
else:
# pad each end with the other end.
if pad:
if len(np.unique(data[0:2])) < 2:
data[0] = -90
if len(np.unique(data[-2:])) < 2:
data[-1] = 90

ii_range = np.arange(len(data))
_, indices = np.unique(data, return_index=True)
double_idx = np.array([ii not in indices for ii in ii_range])
# print(f"non-unique values found at:{ii_range[double_idx]})")
data[double_idx] = np.interp(
ii_range[double_idx], ii_range[~double_idx], data[~double_idx]
)
return data

if "x" in ds.dims and "y" in ds.dims:

nominal_y = ds.lat.mean("x").data
# extract the equatorial lat and take those lon values as nominal lon
eq_ind = abs(ds.lat.mean("x")).load().argmin().data
nominal_x = ds.lon.isel(y=eq_ind).data
# pick the nominal lon/lat values from the eastern
# and southern edge, and eliminate non-unique values
# these occur e.g. in "MPI-ESM1-2-HR"
max_lat_idx = ds.lat.isel(y=-1).argmax("x").load().data
nominal_y = maybe_fix_non_unique(ds.isel(x=max_lat_idx).lat.load().data)
eq_idx = len(ds.y) // 2
nominal_x = maybe_fix_non_unique(ds.isel(y=eq_idx).lon.load().data)

ds = ds.assign_coords(x=nominal_x, y=nominal_y)

ds = ds.sortby("x")
ds = ds.sortby("y")

# do one more pass over the x and y values, in case the
# boundary values were affected
ds = ds.assign_coords(
x=maybe_fix_non_unique(ds.x.load().data),
y=maybe_fix_non_unique(ds.y.load().data, pad=True),
)

else:
warnings.warn(
"No x and y found in dimensions for source_id:%s. This likely means that you forgot to rename the dataset or this is the German unstructured model"
@@ -802,22 +865,6 @@ def replace_x_y_nominal_lat_lon(ds):
return ds
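
The effect of the `maybe_fix_non_unique` helper is easiest to see on a few synthetic values; this snippet mirrors the logic above rather than importing it:

```python
# Synthetic demo of the de-duplication idea in maybe_fix_non_unique:
# repeated coordinate values are replaced by linear interpolation over
# the integer index, leaving a strictly unique axis.
import numpy as np

data = np.array([0.0, 1.0, 1.0, 3.0, 4.0])  # second 1.0 is a duplicate
ii_range = np.arange(len(data))
_, indices = np.unique(data, return_index=True)
double_idx = np.array([ii not in indices for ii in ii_range])
data[double_idx] = np.interp(
    ii_range[double_idx], ii_range[~double_idx], data[~double_idx]
)
print(data)  # [0. 1. 2. 3. 4.] : the duplicate became 2.0
```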


# some of the models do not have 2d lon lats, correct that.
def broadcast_lonlat(ds, verbose=True):
"""Some models (all `gr` grid_labels) have 1D lon lat arrays
This functions broadcasts those so lon/lat are always 2d arrays."""
if "lon" not in ds.variables:
ds.coords["lon"] = ds["x"]
if "lat" not in ds.variables:
ds.coords["lat"] = ds["y"]

if len(ds["lon"].dims) < 2:
ds.coords["lon"] = ds["lon"] * xr.ones_like(ds["lat"])
if len(ds["lat"].dims) < 2:
ds.coords["lat"] = xr.ones_like(ds["lon"]) * ds["lat"]
return ds


def unit_conversion_dict():
"""Units conversion database"""
unit_dict = {"m": {"centimeters": 1 / 100}}
@@ -883,14 +930,17 @@ def correct_lon(ds):
longitude names expected to be corrected with `rename_cmip6`"""
ds = ds.copy()

x = ds["x"].data
x = np.where(x < 0, 360 + x, x)
# some models mark missing values with out-of-bounds
# values; mask those out
ds["lon"] = ds["lon"].where(abs(ds["lon"]) <= 1e35)
ds["lat"] = ds["lat"].where(abs(ds["lat"]) <= 1e35)

# only correct the actual longitude
lon = ds["lon"].data
lon = np.where(lon < 0, 360 + lon, lon)

ds = ds.assign_coords(x=x, lon=(ds.lon.dims, lon))
ds = ds.sortby("x")
# then adjust lon convention
lon = np.where(lon < 0, 360 + lon, lon)
ds = ds.assign_coords(lon=(ds.lon.dims, lon))
return ds
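
The longitude shift in `correct_lon` is a one-liner worth sanity-checking with synthetic values:

```python
# Synthetic check of the 0-360 longitude convention shift in correct_lon:
# negative longitudes wrap around, non-negative ones pass through.
import numpy as np

lon = np.array([-180.0, -90.0, 0.0, 90.0, 179.0])
print(np.where(lon < 0, 360 + lon, lon))  # [180. 270.   0.  90. 179.]
```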


@@ -901,14 +951,15 @@ def combined_preprocessing(ds):
ds = rename_cmip6(ds)
# promote empty dims to actual coordinates
ds = promote_empty_dims(ds)
# demote coordinates from data_variables (this is somehow reversed in intake)
ds = correct_coordinates(ds)
# broadcast lon/lat
ds = broadcast_lonlat(ds)
# replace x,y with nominal lon,lat
ds = replace_x_y_nominal_lat_lon(ds)
# shift all lons to consistent 0-360
ds = correct_lon(ds)
# demote coordinates from data_variables (this is somehow reversed in intake)
ds = correct_coordinates(ds)
# fix the units
ds = correct_units(ds)
# replace x,y with nominal lon,lat
ds = replace_x_y_nominal_lat_lon(ds)

return ds
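
For context, `combined_preprocessing` is typically handed to intake-esm as a `preprocess` callable. A hedged usage sketch, assuming intake-esm and access to the Pangeo CMIP6 cloud catalog (the catalog URL and search terms are illustrative, not part of this PR):

```python
# Usage sketch: apply combined_preprocessing while loading from the cloud.
import intake
from cmip6_preprocessing.preprocessing import combined_preprocessing

col = intake.open_esm_datastore(
    "https://storage.googleapis.com/cmip6/pangeo-cmip6.json"
)
cat = col.search(source_id="MRI-ESM2-0", variable_id="tos", grid_label="gn")
ddict = cat.to_dataset_dict(
    zarr_kwargs={"consolidated": True}, preprocess=combined_preprocessing
)
```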