Deploy GCP bakery into Columbia-owned project #19
Awesome! Glad it is going to be useful. Tentatively, give it a go now and let me know if you hit any bumps in the road. I am keen to take an iterative approach on this, if that works for you. I have run it through on my vanilla instance here using my own instructions, so it should be 98% there.
Ok, I'm going to create a new GCP project for this. Charles, stand by for details.
Charles, I created a new GCP project called
Logged in. So next step is I follow the instructions in
@cisaacstern Yarp :)
How is it going on your side @cisaacstern? Anything I can do to help?
Thanks for checking in, @tracetechnical. I've been working on some other stuff and haven't gotten a chance to start this yet. Will check back with you soon!
One comment comes to mind here. I expect that the bakery code will evolve rapidly over the next year. For development purposes, it would actually probably be good to have two bakeries:
Following the README now, and adding/checking items to/off a list here as I complete them:
- Setup bucket
- Local tooling installs
@tracetechnical, could we jump on video chat sometime today or tomorrow to review
On a related note, looks like the
Line 2 in a7ea983
... which seems ... incorrect? ADR 2 specifies that recipe contributors should provide the
Also, @sharkinsspatial, I noticed in looking into this that ADR 2 doesn't seem to mention anything about
@cisaacstern Sure! If you compile me a quick list, I may be able to explain them on here.
Questions indented below each line.

```shell
BAKERY_NAMESPACE=""
BAKERY_IMAGE="pangeo/pangeo-forge-bakery-images:pangeonotebook-2021.06.05_prefect-0.14.22_pangeoforgerecipes-0.4.0"
STORAGE_SERVICE_ACCOUNT_NAME="<ACCOUNT 1 HERE>"
CLUSTER_SERVICE_ACCOUNT_NAME="<ACCOUNT 2 HERE>"
PROJECT_NAME=""
STORAGE_NAME=""
CLUSTER_NAME=""
```
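For context on how a couple of these variables are consumed downstream, the test recipe reads them with `os.environ` and derives its storage paths from `STORAGE_NAME`. A minimal stdlib sketch (the values set here are hypothetical stand-ins, not real project settings):

```python
import os

# Simulate the bakery environment for illustration (hypothetical values).
os.environ["PROJECT_NAME"] = "my-gcp-project"
os.environ["STORAGE_NAME"] = "pfcsb-bucket"

# The test recipe derives its target and cache roots from STORAGE_NAME.
storage_name = os.environ["STORAGE_NAME"]
print(f"{storage_name}/target")  # pfcsb-bucket/target
print(f"{storage_name}/cache")   # pfcsb-bucket/cache
```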
Thanks in advance for your help, @tracetechnical.
Replies in bold below
Is it possible to use different buckets for cache vs. production data? That way permissions can be set at the bucket level, which is much simpler. The cache data should probably be private, while the production data should be public (perhaps with requester-pays).
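If the data were split across two buckets, the bucket-level policies described here could be set with `gsutil`; a rough sketch, where the bucket names and service-account email are hypothetical placeholders:

```shell
# Make the production bucket publicly readable (hypothetical bucket names).
gsutil iam ch allUsers:objectViewer gs://my-prod-bucket

# Enable requester-pays on the production bucket.
gsutil requesterpays set on gs://my-prod-bucket

# Keep the cache bucket private: grant access only to the bakery's
# service account (hypothetical email).
gsutil iam ch serviceAccount:bakery-sa@my-project.iam.gserviceaccount.com:objectAdmin gs://my-cache-bucket
```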
As far as I understand, the selection of cache and target buckets is done at the recipe level. The terraform setup of the storage is done purely for the convenience of the bakery owner. @sharkinsspatial Please correct me if this is wrong.
Ah ok so STORAGE_NAME is only for dockerized prefect flows? How then should a bakery operator be configuring their actual data storage locations? Do those have to be created outside of terraform?
StorageName is used by the terraform to set up a storage account + containers for flow and cache storage, and this is in turn used in the test recipe bundled in the repo. May be worth running through some user journeys to see if there is some more thinking/documentation needed around this.
@cisaacstern With regards to
In reference to
@sharkinsspatial I will address this as a separate PR to the Azure stuff, but yeah, I think that is wise.
Addressed in #24
@cisaacstern As another note on this ongoing deployment, we will also need to update
`print("Hello, World!")`
We were paused on this due to the lack of a Prefect account, and then (for the last few weeks) by other work. Starting back into this today! Can't wait to get it all plugged together. 🎉
@cisaacstern As a reference, it looks like we still haven't seen any movement from Prefect on their serialization memory issues PrefectHQ/prefect#5004 (comment) yet.
We are now blocked by #29. @tracetechnical @sharkinsspatial, I eagerly await any insight you may have in resolving this issue!
Status update:
☑️ I believe I now have all of the infrastructure deployed.
I am able to load the logs in Loki, but for now the Prefect Cloud interface seems like an easier place to browse them. One relevant (and recurring) traceback is as follows:
Can you turn the pangeo_forge_recipes debugging logs on? |
#37 gets

```shell
docker run -it --platform linux/amd64 \
  -v "`pwd`/kubernetes/storage_key.json":/opt/storage_key.json \
  -e GOOGLE_APPLICATION_CREDENTIALS="/opt/storage_key.json" \
  5766d92a4b8d /bin/bash
```
Within the worker image container, I then opened the
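One quick sanity check inside the container is that `GOOGLE_APPLICATION_CREDENTIALS` points at a readable service-account key. A sketch with a hypothetical helper (not part of the bakery tooling), demoed here against a stand-in key file:

```python
import json
import os
import tempfile

def credential_type(key_path):
    """Read a GCP service-account key file and return its "type" field."""
    with open(key_path) as f:
        return json.load(f).get("type")

# Inside the worker container you would check the mounted key with:
#   credential_type(os.environ["GOOGLE_APPLICATION_CREDENTIALS"])
# Here we demo against a stand-in key file:
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump({"type": "service_account", "project_id": "my-project"}, f)

print(credential_type(f.name))  # service_account
```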
```python
input_url_pattern = (
    "https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation"
    "/v2.1/access/avhrr/{yyyymm}/oisst-avhrr-v02r01.{yyyymmdd}.nc"
)
dates = pd.date_range("2019-09-01", "2021-01-05", freq="D")
input_urls = [
    input_url_pattern.format(
        yyyymm=day.strftime("%Y%m"), yyyymmdd=day.strftime("%Y%m%d")
    )
    for day in dates
]
pattern = pattern_from_file_sequence(input_urls, "time", nitems_per_file=1)
recipe = XarrayZarrRecipe(pattern, inputs_per_chunk=20)
register_recipe(recipe)
```
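For a sense of scale, the pattern above expands to one URL per day over the date range. A stdlib-only re-creation (pandas `date_range` swapped for `datetime`; the chunk count is just arithmetic from `inputs_per_chunk=20`):

```python
from datetime import date, timedelta
from math import ceil

input_url_pattern = (
    "https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation"
    "/v2.1/access/avhrr/{yyyymm}/oisst-avhrr-v02r01.{yyyymmdd}.nc"
)
start, end = date(2019, 9, 1), date(2021, 1, 5)
dates = [start + timedelta(days=n) for n in range((end - start).days + 1)]
input_urls = [
    input_url_pattern.format(yyyymm=d.strftime("%Y%m"), yyyymmdd=d.strftime("%Y%m%d"))
    for d in dates
]

print(len(input_urls))                    # 493 daily inputs
print(input_urls[0].rsplit("/", 1)[-1])   # oisst-avhrr-v02r01.20190901.nc
print(ceil(len(input_urls) / 20))         # 25 chunks at inputs_per_chunk=20
```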
pangeo-forge-gcs-bakery/test/recipes/oisst_recipe.py
Lines 34 to 50 in a79b2f7
```python
storage_name = os.environ["STORAGE_NAME"]
fs_remote = GCSFileSystem(
    project=os.environ["PROJECT_NAME"],
    bucket=os.environ["STORAGE_NAME"],
)
target = FSSpecTarget(
    fs_remote,
    root_path=f"{storage_name}/target",
)
recipe.target = target
recipe.input_cache = CacheFSSpecTarget(
    fs_remote,
    root_path=(
        f"{storage_name}/cache"
    ),
)
recipe.metadata_cache = target
```
followed by (still within the worker image container)

```python
for input_name in recipe.iter_inputs():
    recipe.cache_input(input_name)
```

which produced this
Traceback

```
INFO:pangeo_forge_recipes.recipes.xarray_zarr:Caching input '(0,)'
INFO:pangeo_forge_recipes.storage:Caching file 'https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr/201909/oisst-avhrr-v02r01.20190901.nc'
INFO:pangeo_forge_recipes.storage:Coping remote file 'https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr/201909/oisst-avhrr-v02r01.20190901.nc' to cache
DEBUG:pangeo_forge_recipes.storage:entering fs.open context manager for pfcsb-bucket/cache/f11a58c4987c8c3af6c16145253b2a51-https_www.ncei.noaa.gov_data_sea-surface-temperature-optimum-interpolation_v2.1_access_avhrr_201909_oisst-avhrr-v02r01.20190901.nc
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/pangeo_forge_recipes/recipes/xarray_zarr.py", line 124, in cache_input
    input_cache.cache_file(fname, **fsspec_open_kwargs)
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/pangeo_forge_recipes/storage.py", line 153, in cache_file
    _copy_btw_filesystems(input_opener, target_opener)
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/pangeo_forge_recipes/storage.py", line 30, in _copy_btw_filesystems
    with output_opener as target:
  File "/srv/conda/envs/notebook/lib/python3.8/contextlib.py", line 113, in __enter__
    return next(self.gen)
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/pangeo_forge_recipes/storage.py", line 111, in open
    with self.fs.open(full_path, **kwargs) as f:
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/fsspec/spec.py", line 1010, in open
    f = self._open(
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/gcsfs/core.py", line 1026, in _open
    return GCSFile(
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/gcsfs/core.py", line 1129, in __init__
    det = getattr(self, "details", {})  # only exists in read mode
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/fsspec/spec.py", line 1357, in details
    self._details = self.fs.info(self.path)
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/fsspec/asyn.py", line 91, in wrapper
    return sync(self.loop, func, *args, **kwargs)
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/fsspec/asyn.py", line 71, in sync
    raise return_result
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/fsspec/asyn.py", line 25, in _runner
    result[0] = await coro
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/gcsfs/core.py", line 610, in _info
    out = await self._ls(path, **kwargs)
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/gcsfs/core.py", line 646, in _ls
    out = await self._list_objects(path, prefix=prefix)
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/gcsfs/core.py", line 434, in _list_objects
    return [await self._get_object(path)]
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/gcsfs/core.py", line 388, in _get_object
    res = await self._call("GET", "b/{}/o/{}", bucket, key, json_out=True)
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/gcsfs/core.py", line 330, in _call
    status, headers, info, contents = await self._request(
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/decorator.py", line 221, in fun
    return await caller(func, *(extras + args), **kw)
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/gcsfs/retry.py", line 110, in retry_request
    return await func(*args, **kwargs)
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/gcsfs/core.py", line 322, in _request
    validate_response(status, contents, path)
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/gcsfs/retry.py", line 89, in validate_response
    raise FileNotFoundError
FileNotFoundError
```
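For reference, the `_copy_btw_filesystems` helper in this traceback simply streams bytes from a source opener into a target opener; the error is raised while entering the target's `open` context, before any bytes are copied. A stdlib-only sketch of that copy pattern (not the actual pangeo_forge_recipes implementation; local files stand in for the HTTP source and GCS target):

```python
import os
import shutil
import tempfile

def copy_btw_openers(input_opener, output_opener, blocksize=10_000_000):
    # Stream from source to target in blocks, as the traceback's helper does.
    with input_opener as source:
        with output_opener as target:
            shutil.copyfileobj(source, target, length=blocksize)

# Demo with local files standing in for the remote source and cache target.
tmp = tempfile.mkdtemp()
src, dst = os.path.join(tmp, "src.nc"), os.path.join(tmp, "dst.nc")
with open(src, "wb") as f:
    f.write(b"fake netcdf bytes")

copy_btw_openers(open(src, "rb"), open(dst, "wb"))
with open(dst, "rb") as f:
    print(f.read())  # b'fake netcdf bytes'
```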
that looks quite similar to what we saw on the Prefect Cloud logs copied in #19 (comment)
My current guess is that for some reason the worker container's
🤔 Ok, so this works fine. I'll try manually running the caching with
Status update:
Calling
Line 2 in 52e2297
predictably raises the same
So, rather than try to debug this long-outdated version (anything could be happening here!), I moved on to trying out what appears to be the latest (stable) bakery image release. Running this container locally, I can 🎉 get a modified version of the test recipe to cache inputs to GCS. So, next action points are:
One additional thought:
I spoke too soon. The
which builds the pruned NOAA OISST Zarr store to the bakery's GCS bucket

```python
import fsspec
import xarray as xr

m = fsspec.get_mapper("gs://pfcsb-bucket/target")  # "pfcsb" stands for "pangeo forge columbia staging bakery"
ds = xr.open_zarr(m, consolidated=True)
ds
```
Ok! I'm convinced. I ended up just using
pangeo-forge-gcs-bakery/test/recipes/oisst_recipe.py
Lines 38 to 40 in 2d164b4
Then with a call to
followed by

```python
import fsspec
import xarray as xr

m = fsspec.get_mapper(
    "s3://Pangeo/pangeo-forge/pfcsb-test/noaa-oisst-pruned/",  # the new `root_path`
    client_kwargs=dict(endpoint_url="https://ncsa.osn.xsede.org"),
    anon=True,
)
ds = xr.open_zarr(m, consolidated=True)
```
Thanks for all of the quick work on this repo!
We (myself and @cisaacstern) will need to actually deploy our own GCP bakery into our own project. This will be the main "production" bakery for Pangeo Forge when we first launch.
This bakery will have to manage TWO public buckets:
Let us know when you think this repo is in shape for us to try to deploy. We are happy to serve as guinea pigs to work out any kinks in the deployment process.
cc @sharkinsspatial @tracetechnical