
Merge ADO latest main into github #17

Merged
merged 28 commits into from
Jan 4, 2024

Conversation

jzazo
Collaborator

@jzazo jzazo commented Jan 4, 2024

Pytest and mypy passing.

elonp and others added 28 commits October 9, 2023 17:11
* `OnemlProcessorsPipelineOperationsServices.COLLECTION_TO_DICT` returns a pipeline with an in_collection and an output that exposes the collection as a dictionary.
* `OnemlProcessorsPipelineOperationsServices.DICT_TO_COLLECTION` returns a pipeline with a dictionary input and an out_collection exposing the entries of the input dictionary as collection entries.
* `OnemlProcessorsPipelineOperationsServices.DUPLICATE_PIPELINE` takes a pipeline and returns a pipeline with multiple copies of it.
* `OnemlProcessorsPipelineOperationsServices.EXPOSE_GIVEN_OUTPUTS` takes data (dict of outputs, dict of dict of out collections) and creates a pipeline that exposes that data as output.
* `OnemlProcessorsPipelineOperationsServices.EXPOSE_PIPELINE_AS_OUTPUT` takes a pipeline, returns an identical pipeline except it has an additional output exposing the given pipeline.
* `OnemlProcessorsPipelineOperationsServices.LOAD_INPUTS_SAVE_OUTPUTS` takes a pipeline with an in_collection called `inputs_to_load` and an out_collection called `outputs_to_save` and returns a pipeline that loads the requested inputs from uris, passes them to the original pipeline, then saves the requested outputs to uris.
* `OnemlHabitatsPipelineOperationsServices.PUBLISH_OUTPUTS_AS_DATASET` takes a pipeline that loads and saves to/from uris and returns a pipeline that loads from datasets and publishes a dataset.
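As a rough illustration of the shape of the `LOAD_INPUTS_SAVE_OUTPUTS` wrapping, here is a toy sketch in plain Python. This is not the actual oneml API; `read_from_uri` and `write_to_uri` are hypothetical stand-ins for the real uri readers/writers.

```python
def load_inputs_save_outputs(pipeline_fn, input_uris, output_uris,
                             read_from_uri, write_to_uri):
    """Toy sketch (not the oneml API) of wrapping a pipeline so that its
    inputs are loaded from uris and its outputs are saved to uris."""
    # Load each requested input from its uri ...
    inputs_to_load = {name: read_from_uri(uri) for name, uri in input_uris.items()}
    # ... run the wrapped pipeline on the loaded values ...
    outputs_to_save = pipeline_fn(**inputs_to_load)
    # ... and save each requested output to its uri.
    for name, value in outputs_to_save.items():
        write_to_uri(output_uris[name], value)
    return outputs_to_save
```

Here `pipeline_fn` stands in for the wrapped pipeline, and the dicts map entry names to uris, mirroring the `inputs_to_load` / `outputs_to_save` collections described above.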

allow `from oneml.habitats.immunocli import OnemlHabitatsCliDiContainer` b/c cli di containers should be exposed

test load_inputs_save_outputs when pipeline already has input_uris and output_uris

fix services in oneml.habitats.pipeline_operations._datasets

remove furl from oneml.habitats.pipeline_operations b/c it lower-cases dataset name in ampds://
* Refactors plugins into every component.
* Adds hydra registry for pipeline providers.
* Adds two_diamond tests building a pipeline in YAML using a python pipeline provider.

expose BLOB_READ_USING_LOCAL_CACHE_FACTORY and BLOB_WRITE_USING_LOCAL_CACHE_FACTORY for use by other packages
* Adds `DuplicatePipelineConf` to be able to generate duplicate pipelines in YAML.
* Adds three diamond as test case.
📝 (docs) Update documentation.

DatasetBlobStoreBaseLocationService provides the blob locations for datasets published by oneml.
It should read these from an installation level configuration, but at the moment has them hard coded.
The container name used for production and non-production datasets outside notebooks was wrong.  This PR hopefully fixes it.
fix get_relative_path when base_uri ends with /
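A hedged reconstruction of what that fix might look like. The real `get_relative_path` lives in oneml and may differ; this sketch only illustrates the trailing-slash normalization the commit message describes.

```python
def get_relative_path(base_uri: str, uri: str) -> str:
    # Normalize away a trailing "/" so that both "ampds://d" and
    # "ampds://d/" yield the same relative path (the case being fixed).
    base = base_uri.rstrip("/")
    prefix = base + "/"
    if not uri.startswith(prefix):
        raise ValueError(f"{uri!r} is not under {base_uri!r}")
    return uri[len(prefix):]
```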
add test that verifies mypy screams when a service id is associated with a service that is a super-type of the declared service type

🚸 Refactor pipeline input validation for optional inputs and add tests.
✨ Add pipeline drop_inputs and drop_outputs methods

test publish_outputs_as_dataset when there are no input_uris
Datasets published by oneml pipelines have a manifest.json at their root that maps output names to relative paths within the dataset.

Pipeline builders can read from uris using `OnemlProcessorsIoServices.READ_FROM_URI_PIPELINE_BUILDER`, which by itself uses `ReadFromUriProcessor`, that takes a uri and returns the read object.

This PR adds the following semantic to uris:
If the uri has a fragment (i.e. something that follows `#`), then (after removing the fragment) the uri is assumed to point to a JSON file.  The fragment is interpreted as a dot-separated hierarchical key into the JSON.  The value associated with the key should be either a path relative to the directory of the JSON file, or an absolute uri.

Examples:
If `file:///path1/path2/index.json` holds:
```
    links:
        rel: path3/array.npy
        abs: ampds://mydataset/
        another_abs: ampds://mydataset/manifest.json#entry_uris.container1?namespace=mynamespace
```
and `ampds://mydataset/manifest.json` holds:
```
    entry_uris:
        container1: containers/container1
```

Then:
* `file:///path1/path2/index.json#links.rel` becomes `file:///path1/path2/path3/array.npy`
* `file:///path1/path2/index.json#links.abs` becomes `ampds://mydataset/`
* `file:///path1/path2/index.json#links.another_abs` becomes `ampds://mydataset/containers/container1?namespace=mynamespace`

The semantics are implemented in `ReadFromUriProcessor` and therefore applicable only to pipelines built directly or indirectly using `OnemlProcessorsIoServices.READ_FROM_URI_PIPELINE_BUILDER`.
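The resolution rules above can be sketched roughly as follows. This is a minimal reconstruction, not the actual `ReadFromUriProcessor`; `load_json` is a hypothetical helper that returns the parsed JSON document for a uri. Note that a resolved value may itself carry a fragment (the `another_abs` example), so resolution recurses.

```python
def resolve_fragment_uri(uri: str, load_json) -> str:
    """Sketch of the '#dot.separated.key' uri semantics described above."""
    if "#" not in uri:
        return uri
    base, fragment = uri.split("#", 1)
    # In the examples, a query string may follow the fragment; peel it
    # off and re-append it to the resolved uri.
    query = ""
    if "?" in fragment:
        fragment, query = fragment.split("?", 1)
        query = "?" + query
    # Walk the dot-separated key into the JSON document at `base`.
    value = load_json(base)
    for key in fragment.split("."):
        value = value[key]
    if "://" in value:
        # Absolute uri: use as-is.
        resolved = value
    else:
        # Relative path: resolve against the JSON file's directory.
        resolved = base.rsplit("/", 1)[0] + "/" + value
    # The target may itself contain a fragment, so resolve recursively.
    return resolve_fragment_uri(resolved + query, load_json)
```

With `load_json` backed by the two example documents, this reproduces the three resolutions listed above.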

github-actions bot commented Jan 4, 2024

Test Results: oneml-pipelines

31 tests  +1   31 ✅ +1   0s ⏱️ ±0s
 1 suites ±0    0 💤 ±0 
 1 files   ±0    0 ❌ ±0 

Results for commit bb6ebae. ± Comparison against base commit a6e6817.


github-actions bot commented Jan 4, 2024

Test Results: oneml-processors

112 tests  +35   112 ✅ +35   4s ⏱️ -1s
  1 suites ± 0     0 💤 ± 0 
  1 files   ± 0     0 ❌ ± 0 

Results for commit bb6ebae. ± Comparison against base commit a6e6817.


github-actions bot commented Jan 4, 2024

Test Results: oneml-habitats

46 tests  +40   46 ✅ +40   4s ⏱️ -1s
 1 suites ± 0    0 💤 ± 0 
 1 files   ± 0    0 ❌ ± 0 

Results for commit bb6ebae. ± Comparison against base commit a6e6817.

@jzazo jzazo merged commit c4cc4dc into main Jan 4, 2024
@jzazo jzazo deleted the jzazo/ado branch January 4, 2024 18:24