Merge pull request #279 from roytman/kfpv2_minor_fixes
update docs for KFPv2
roytman authored Jun 16, 2024
2 parents 23c59f7 + 73c7ae7 commit 0e0ecd6
Showing 9 changed files with 59 additions and 40 deletions.
1 change: 1 addition & 0 deletions .make.versions
@@ -52,3 +52,4 @@ INGEST_TO_PARQUET_VERSION=0.4.0$(RELEASE_VERSION_SUFFIX)
INGEST_TO_PARQUET_RAY_VERSION=0.4.0$(RELEASE_VERSION_SUFFIX)

KFP_DOCKER_VERSION=0.2.0$(RELEASE_VERSION_SUFFIX)
KFP_DOCKER_VERSION_v2=0.2.0$(RELEASE_VERSION_SUFFIX)
44 changes: 30 additions & 14 deletions kfp/RELEASE.md
@@ -2,37 +2,53 @@

This document describes the release process for the following components:

- `fm_data_processing_kfp` Python package in `kfp_support_lib` directory.
- `kfp_support_lib` Python packages in `kfp_support_lib` directory.

- `kfp-data-processing` docker image built based on the Docker file in `kfp_ray_components` directory.

- kubeflow pipelines in `transform_workflows` directory. For example the one that is generated from `transform_workflows/universal/noop/noop_wf.py` file.

**Note:** The docker image is dependent on the library thus it is required to build a new docker image once a new library version is created.
### 1. Update `.make.versions` file

### 1. Update `requirements.env` file
The [.make.versions](../.make.versions) file specifies the target versions of the components being built, as well as the
desired versions of their dependencies.
The KFP package build uses the following variables from the file:
- RELEASE_VERSION_SUFFIX - the common suffix for all components being built
- DPK_LIB_KFP_VERSION - the version of `kfp_v1_workflow_support`
- DPK_LIB_KFP_VERSION_v2 - the version of `kfp_v2_workflow_support`
- DPK_LIB_KFP_SHARED - the version of `kfp_shared_workflow_support`
- KFP_DOCKER_VERSION - the docker image version of KFP components for KFPv1
- KFP_DOCKER_VERSION_v2 - the docker image version of KFP components for KFPv2
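
As a rough illustration of how the suffix combines with a base version, the sketch below mimics the expansion in plain shell. The `.dev0` suffix is a hypothetical value chosen for the example; the `0.2.0` base versions are the ones set in `.make.versions` in this commit.

```shell
# Hypothetical suffix, for illustration only; the real value is set in .make.versions.
RELEASE_VERSION_SUFFIX=".dev0"

# Mirrors KFP_DOCKER_VERSION=0.2.0$(RELEASE_VERSION_SUFFIX) from .make.versions.
KFP_DOCKER_VERSION="0.2.0${RELEASE_VERSION_SUFFIX}"
KFP_DOCKER_VERSION_v2="0.2.0${RELEASE_VERSION_SUFFIX}"

echo "KFPv1 image version: ${KFP_DOCKER_VERSION}"
echo "KFPv2 image version: ${KFP_DOCKER_VERSION_v2}"
```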

The [requirements.env](requirements.env) file specifies the target versions for the building components, as well as the desired versions for the dependencies.
All components names in the requirement file must be in uppercase, for example: `KFP_DOCKER_TAGNAME=0.0.8`.
**Note:** The docker images depend on the libraries but use the Python source code from the repository, so the
Python modules (libraries) do not have to be deployed in order to build the docker images.

Upon component version update, modify the [`requirements.env`](./requirements.env) file.
### 2. Choose the supported KFP version.
The docker images and some `workflow_support` libraries depend on the KFP version. To build images and libraries for
KFP v2, run the following command:

```shell
export KFPv2=1
```

### 3. (Optional) Build the library

Run the `make -C kfp_support_lib build` command to build the `fm_data_processing_kfp` library if a new version should be created
Run the `make -C shared_workflow_support build` command to build the shared library.
If you need a library for KFPv1, run `make -C kfp_v1_workflow_support build`.
For KFP v2, set the environment variable `KFPv2` (see above) and run `make -C kfp_v2_workflow_support build`.
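
The choice between the two version-specific libraries can be sketched as a small shell helper. This is only an illustration: `workflow_support_dir` is a hypothetical function, not part of the repository, and treating an unset `KFPv2` (or any value other than `1`) as KFPv1 is an assumption. The script prints the `make` commands instead of running them, so it can be tried outside the repository.

```shell
# Hypothetical helper: pick the version-specific library directory from KFPv2.
workflow_support_dir() {
    if [ "${KFPv2:-0}" = "1" ]; then
        echo "kfp_v2_workflow_support"
    else
        echo "kfp_v1_workflow_support"
    fi
}

# Dry run: print the build commands rather than executing them.
echo "make -C shared_workflow_support build"
echo "make -C $(workflow_support_dir) build"
```

Dropping the `echo` wrappers would run the builds, assuming the working directory is the one in which these relative paths resolve.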

### 4. (Optional) Publish the library

Run `make -C kfp_support_lib publish` command to push the library to the TestPyPI repository.
Run `make -C shared_workflow_support publish`, and either `make -C kfp_v1_workflow_support publish` or
`make -C kfp_v2_workflow_support publish`, to push the libraries to the TestPyPI repository.

### 5. Build the image

Run `make -C kfp_ray_components build` command to build the `kfp-data-processing` docker image.
### 5. Build the image

Run the `make -C kfp_ray_components build` command to build the `kfp-data-processing` docker image, or the
`kfp-data-processing_v2` image when `KFPv2==1`.
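
To make the naming rule concrete, here is a minimal sketch that maps the `KFPv2` switch to the image name. `kfp_image_name` is a hypothetical helper written for this example; how the Makefile actually derives the name is not shown in this document.

```shell
# Hypothetical helper: map the KFPv2 switch to the image name given above.
kfp_image_name() {
    if [ "${KFPv2:-0}" = "1" ]; then
        echo "kfp-data-processing_v2"
    else
        echo "kfp-data-processing"
    fi
}

# Dry run: show which image the build command is expected to produce.
echo "make -C kfp_ray_components build   # builds $(kfp_image_name)"
```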

### 6. Push the image

Run `make -C kfp_ray_components publish` command to push the `kfp-data-processing` docker image.


To generate new kubeflow pipeline yaml files run `make -C transform_workflows build` command.
Run `make -C kfp_ray_components publish` command to push the docker image.
10 changes: 5 additions & 5 deletions kfp/kfp_ray_components/Makefile
@@ -47,12 +47,12 @@ set-versions: reconcile-requirements
.PHONY: reconcile-requirements
reconcile-requirements:
@# Help: Update yaml files to build images tagged as version $(KFP_DOCKER_VERSION)
sed -i.back "s/kfp-data-processing.*:[0-9].*/$(DOCKER_IMAGE_NAME):${KFP_DOCKER_VERSION}/" createRayClusterComponent.yaml
sed -i.back "s/kfp-data-processing.*:[0-9].*/$(DOCKER_IMAGE_NAME):${KFP_DOCKER_VERSION}/" deleteRayClusterComponent.yaml
sed -i.back "s/kfp-data-processing.*:[0-9].*/$(DOCKER_IMAGE_NAME):${KFP_DOCKER_VERSION}/" executeRayJobComponent.yaml
sed -i.back "s/kfp-data-processing.*:[0-9].*/$(DOCKER_IMAGE_NAME):${KFP_DOCKER_VERSION}/" executeRayJobComponent_multi_s3.yaml
sed -i.back "s/kfp-data-processing.*:[0-9].*/$(DOCKER_IMAGE_NAME):${DOCKER_IMAGE_VERSION}/" createRayClusterComponent.yaml
sed -i.back "s/kfp-data-processing.*:[0-9].*/$(DOCKER_IMAGE_NAME):${DOCKER_IMAGE_VERSION}/" deleteRayClusterComponent.yaml
sed -i.back "s/kfp-data-processing.*:[0-9].*/$(DOCKER_IMAGE_NAME):${DOCKER_IMAGE_VERSION}/" executeRayJobComponent.yaml
sed -i.back "s/kfp-data-processing.*:[0-9].*/$(DOCKER_IMAGE_NAME):${DOCKER_IMAGE_VERSION}/" executeRayJobComponent_multi_s3.yaml
# TODO remove it for KFPv2
sed -i.back "s/kfp-data-processing*:[0-9].*/$(DOCKER_IMAGE_NAME):${KFP_DOCKER_VERSION}/" executeSubWorkflowComponent.yaml
sed -i.back "s/kfp-data-processing*:[0-9].*/$(DOCKER_IMAGE_NAME):${DOCKER_IMAGE_VERSION}/" executeSubWorkflowComponent.yaml

.PHONY: load-image
load-image:
1 change: 1 addition & 0 deletions kfp/kfp_ray_components/README.md
@@ -1,4 +1,5 @@
# KFP components
# KFP components

All data processing pipelines have the same `shape`. They all compute execution parameters, create Ray cluster,
execute Ray job and then delete the cluster. With the exception of computing execution parameters all of the steps,
3 changes: 3 additions & 0 deletions kfp/kfp_support_lib/README.md
@@ -9,6 +9,9 @@ It comprises 3 main modules

Depending on the KFP version in use, either `kfp_v1_workflow_support` or `kfp_v2_workflow_support` should be used.

See also how these libraries are used to implement the [kfp components](../../kfp_ray_components/README.md)
and an actual [workflow](../../doc/simple_transform_pipeline.md).

## Development

### Requirements
9 changes: 0 additions & 9 deletions kfp/kfp_support_lib/doc/kfp_support_library.md

This file was deleted.

3 changes: 3 additions & 0 deletions kfp/kfp_support_lib/kfp_v1_workflow_support/README.md
@@ -0,0 +1,3 @@
# Workflow Support Library that depends on KFPv1

This library provides support for implementing KFP pipelines that automate the execution of transforms.
3 changes: 3 additions & 0 deletions kfp/kfp_support_lib/kfp_v2_workflow_support/README.md
@@ -0,0 +1,3 @@
# Workflow Support Library that depends on KFPv2

This library provides support for implementing KFP pipelines that automate the execution of transforms.
25 changes: 13 additions & 12 deletions kfp/kfp_support_lib/shared_workflow_support/README.md
@@ -1,10 +1,15 @@
# Shared Workflow Support

This provides support for implementing KFP pipelines automating transform's execution.
It comprises 2 main modules
This library provides support for implementing KFP pipelines that automate the execution of transforms. It does not
depend on the KFP version. KFP-dependent modules are in [kfp_v1_workflow_support](../kfp_v1_workflow_support) and
[kfp_v2_workflow_support](../kfp_v2_workflow_support).

* [python apiserver client](src/python_apiserver_client/README.md)
* [workflow support](src/runtime_utils/README.md)
This library combines two inner modules:

* [python apiserver client](src/python_apiserver_client/README.md), which is a copy of the
[Kuberay API server-client python APIs](https://github.com/ray-project/kuberay/tree/master/clients/python-apiserver-client).
It was added to the project because these APIs are not published on PyPI.
* [runtime_utils](src/runtime_utils/README.md)

## Development

@@ -26,13 +31,11 @@ If you don't have pre-commit, you can install from [here](https://pre-commit.com

## Library Artifact Build and Publish

The process of creating a release for `fm_data_processing_kfp` package involves the following steps:

cd to the package directory.

update the version in [requirements.env](../../requirements.env) file.
The process of creating a release for the package involves the following steps:

run `make build` and `make publish`.
- cd to the package directory.
- update the `DPK_LIB_KFP_SHARED` version in [.make.versions](../../../.make.versions) file.
- run `make build` and `make publish`.
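
The version bump in the second step can be scripted. The sketch below works on a temporary file so it is safe to run anywhere; the version numbers `0.2.0` and `0.2.1` are hypothetical, and the `sed` expression is just one possible way to edit the file, not something the repository's Makefiles are confirmed to do.

```shell
# Work on a throwaway file that imitates one line of .make.versions.
versions_file=$(mktemp)
echo 'DPK_LIB_KFP_SHARED=0.2.0$(RELEASE_VERSION_SUFFIX)' > "$versions_file"

# Replace only the numeric part, keeping the $(RELEASE_VERSION_SUFFIX) reference.
NEW_VERSION="0.2.1"   # hypothetical target version
sed -i.back "s/^DPK_LIB_KFP_SHARED=[0-9.]*/DPK_LIB_KFP_SHARED=${NEW_VERSION}/" "$versions_file"

cat "$versions_file"
```

On the real file, the same expression would be pointed at `.make.versions`, followed by `make build` and `make publish` from the package directory.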

## Testing

@@ -64,5 +67,3 @@ previous test runs resources are removed before starting new tests.
```bash
kubectl delete workflows -n kubeflow --all

This is a copy of [Kuberay API server-client python APIs](https://github.com/ray-project/kuberay/tree/master/clients/python-apiserver-client)
Because these APIs are not exposed by any PyPi, we added them to the project
