Add Kubeflow Components Overview Diagram #3650

Merged: 11 commits, Jan 9, 2024

Member

Should we store the drawio source file as well so it's easier to modify later?

Member

@andreyvelich Also, I don't know why or how, but this image is somehow interactively linked to draw.io: https://deploy-preview-3650--competent-brattain-de2d6d.netlify.app/docs/started/architecture/

I recommend we export it as a plain SVG to ensure compatibility.

Member

@thesuperzapper, Dec 15, 2023

Also, your diagram should make it clear that any Kubernetes cluster will work, so perhaps we should replace the minikube logo with the Kubernetes one?

Or, perhaps the Kubernetes logo should replace the "SDKs", "Web UI", and "Kubectl" layer, to indicate that Kubeflow runs on Kubernetes.

Member

Also, I think the components should be ordered by popularity, or possibly the order in which you use them.

For example:

  • Top-Left: Kubeflow Pipelines
  • Top-Right: Kubeflow Notebooks
  • Mid-Left: Central Dashboard
  • Mid-Right: Katib (AutoML)
  • Bottom-Left: Training Operator
  • Bottom-Right: KServe (Serving)

(You also missed the Central Dashboard, which is a pretty important component.)

Member Author

> Should we store the drawio source file as well so it's easier to modify later?

@terrytangyuan If you upload this image to draw.io, you can modify it.

Contributor

Isn't the Web UI the Central Dashboard?

Member

@jbottum @andreyvelich perhaps we should re-word the "Web UI" to "Web Dashboard"?

Member

@andreyvelich I think we can swap the "layers" to more logically represent what runs "on top" of what.

My idea for ordering (from top to bottom):

  1. Libraries (Jupyter / TensorFlow / Torch / ...)
    • Yellow Horizontal Bar
  2. Interfaces (SDKs / Web Dashboard / Kubectl)
    • Blue Arrow
  3. Kubeflow
    • Blue Arrow
  4. Kubernetes
    • Yellow Horizontal Bar
  5. Platforms (AWS / GCP / On-Prem / ...)

Some other notes:

  • Minikube is not really a platform, so we can remove it (or possibly replace it with "Local Deployment", with a picture of a laptop).
  • We should put a dotted line around the "Platforms" group, like the "Libraries" one.
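
The ordering proposed above can be written down as a small data structure, which makes it easy to check what sits "on top" of what. This is only an illustrative sketch: the `LAYERS` list and the `render_stack` helper are hypothetical names, and the example entries are limited to those named in the list above.

```python
from typing import List, Tuple

# Layers from top to bottom, as proposed in this comment.
# Each entry is (layer name, examples named in the comment).
LAYERS: List[Tuple[str, List[str]]] = [
    ("Libraries", ["Jupyter", "TensorFlow", "Torch"]),
    ("Interfaces", ["SDKs", "Web Dashboard", "Kubectl"]),
    ("Kubeflow", []),
    ("Kubernetes", []),
    ("Platforms", ["AWS", "GCP", "On-Prem"]),
]


def render_stack(layers: List[Tuple[str, List[str]]]) -> str:
    """Render the stack as one line per layer, top layer first."""
    lines = []
    for name, examples in layers:
        suffix = " (" + " / ".join(examples) + ")" if examples else ""
        lines.append(name + suffix)
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_stack(LAYERS))
```

In this ordering Kubeflow sits directly above Kubernetes, and the interfaces sit between Kubeflow and the libraries.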

Member Author

"Web UI" is the combination of all the UIs that Kubeflow offers today (e.g. Central Dashboard, Katib, Pipelines, Volumes, Tensorboards, etc.).

Member Author

@thesuperzapper I agree that Kubernetes should go above the platforms and I can change minikube to local deployment, but I am not sure if SDK/Web UI/kubectl should be between Kubeflow and libraries.

My thinking is that the interfaces (SDK, Web UI, and kubectl) form a layer between the Kubeflow components and Kubernetes. For example, when a user calls the `create_job` API of the Training Operator SDK, it creates resources in the Kubernetes cluster.
Or, when a user clicks the "Create Experiment" button in the Katib UI, it creates custom resources in Kubernetes.
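
To make this layering argument concrete, here is a minimal Python sketch of what an interface call ultimately hands to Kubernetes: a custom resource. The helper `build_pytorchjob_manifest`, the job name, and the image name are hypothetical and are not part of the Training Operator SDK. The sketch only assumes the `kubeflow.org/v1` `PyTorchJob` resource shape and does not contact a cluster.

```python
from typing import Any, Dict


def build_pytorchjob_manifest(
    name: str, namespace: str, image: str, workers: int = 1
) -> Dict[str, Any]:
    """Build the custom resource an interface would submit to Kubernetes.

    Hypothetical helper for illustration only; it just constructs a dict.
    """

    def replica_spec(replicas: int) -> Dict[str, Any]:
        # One replica group: a pod template with a single training container.
        return {
            "replicas": replicas,
            "restartPolicy": "OnFailure",
            "template": {
                "spec": {"containers": [{"name": "pytorch", "image": image}]}
            },
        }

    return {
        "apiVersion": "kubeflow.org/v1",
        "kind": "PyTorchJob",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "pytorchReplicaSpecs": {
                "Master": replica_spec(1),
                "Worker": replica_spec(workers),
            }
        },
    }


if __name__ == "__main__":
    import json

    # The image name below is a placeholder, not a real image.
    manifest = build_pytorchjob_manifest(
        "demo-job", "kubeflow", "example.com/train:latest", workers=2
    )
    print(json.dumps(manifest, indent=2))
```

Whether the request starts from an SDK call, a button in a web UI, or `kubectl apply`, the result in this picture is the same kind of object stored in the Kubernetes API, which is the sense in which the interfaces sit between the user and Kubernetes.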

Any thoughts @thesuperzapper @jbottum @johnugeorge @vikas-saxena02 ?

11 changes: 9 additions & 2 deletions content/en/docs/started/introduction.md
@@ -10,6 +10,13 @@ recreate other services, but to provide a straightforward way to deploy
best-of-breed open-source systems for ML to diverse infrastructures. Anywhere
you are running Kubernetes, you should be able to run Kubeflow.

+The following diagram shows the main Kubeflow components to cover each step of ML lifecycle
+on top of Kubernetes.
+
+<img src="/docs/started/images/kubeflow-overview.drawio.png"
+alt="Kubeflow overview"
+class="mt-3 mb-3">
+
## Getting started with Kubeflow

Read the [architecture overview](/docs/started/architecture/) for an
@@ -35,7 +42,7 @@ To use Kubeflow, the basic workflow is:
environment.

You can adapt the configuration to choose the platforms and services that you
-want to use for each stage of the ML workflow:
+want to use for each stage of the ML workflow:

1. data preparation
2. model training,
@@ -68,7 +75,7 @@ configure based on the cluster it deploys into.

## History

-Kubeflow started as an open sourcing of the way Google ran [TensorFlow](https://www.tensorflow.org/) internally, based on a pipeline called [TensorFlow Extended](https://www.tensorflow.org/tfx/).
+Kubeflow started as an open sourcing of the way Google ran [TensorFlow](https://www.tensorflow.org/) internally, based on a pipeline called [TensorFlow Extended](https://www.tensorflow.org/tfx/).
It began as just a simpler way to run TensorFlow jobs on Kubernetes, but has since expanded to be a multi-architecture, multi-cloud framework for running end-to-end machine learning workflows.

## Roadmaps