[Core feature] Support Yaml / JSON input for pyflyte run #3371

zeryx · 2023-02-24T17:54:34Z

Motivation: Why do you think this is important?

Many ML workflows and training systems take a large number of hyperparameters when configuring training jobs, sometimes on the order of hundreds of separate variables needed to define a single training run.

Today there does not seem to be a way to provide file-based input for workflow execution. Users must either pass a URL pointing to a data file hosted somewhere, or manually break out each variable in their configuration and provide it through the CLI or the web UI. This is not realistic or tenable for modern ML training projects.

Goal: What should the final outcome look like, ideally?

With pyflyte run, you should be able to provide a structured YAML or JSON input that defines the configuration of your workflow.

Ideally, all of those configuration settings would be matched against a user-defined hyperparameter class, rather than passed piecemeal as individual parameters like arguments to a conventional Python function.

A user should also be able to register their workflow and then provide the YAML file when executing it, either via the UI or via a launch plan.
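
To make the "matched against a user-defined hyperparameter class" idea concrete, here is a rough sketch using only the Python standard library. It is not Flytekit code and does not reflect how pyflyte run works or would work; the Hyperparameters class, its fields, and the load_hyperparameters helper are all hypothetical names made up for illustration. JSON is used so the example has no third-party dependencies; a YAML file could be handled the same way with a YAML parser.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class Hyperparameters:
    # Hypothetical fields, chosen only for illustration.
    learning_rate: float
    batch_size: int
    epochs: int = 10  # default used when the file omits this key


def load_hyperparameters(text: str) -> Hyperparameters:
    """Parse a JSON document and match its keys against the dataclass fields."""
    raw = json.loads(text)
    known = {f.name for f in fields(Hyperparameters)}
    unknown = set(raw) - known
    if unknown:
        # Fail loudly on keys the class does not define.
        raise ValueError(f"unknown hyperparameters: {sorted(unknown)}")
    return Hyperparameters(**raw)


# Stand-in for the contents of a config file supplied by the user.
config_text = '{"learning_rate": 0.001, "batch_size": 32}'
params = load_hyperparameters(config_text)
print(params)
```

The point is that the whole configuration arrives as one structured object, so a workflow could take a single typed input instead of hundreds of separate parameters.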

Describe alternatives you've considered

We do support setting some parameters as fixed in launch plans. However, that takes control over which elements are fixed and which are variable out of the user's hands, and forces users to generate potentially hundreds of separate launch plans, one for each individual training event.
Even when setting up launch plans, the user must manually hardcode all of the variables that are static between executions, which defeats much of the purpose here.

Propose: Link/Inline OR Additional context

https://pypi.org/project/yamldataclassconfig/

Are you sure this issue hasn't been raised already?

Yes

Have you read the Code of Conduct?

Yes

zeryx added the enhancement (New feature or request) and untriaged (This issue has not yet been looked at by the Maintainers) labels Feb 24, 2023

pingsutw added the flytekit (FlyteKit Python related issue) label and removed the untriaged label Feb 24, 2023

eapolinario added the needs discussion label Mar 3, 2023

kumare3 mentioned this issue Apr 24, 2023

pyflyte run now supports json/yaml files flyteorg/flytekit#1606

Merged

8 tasks

kumare3 self-assigned this Apr 24, 2023

kumare3 closed this as completed May 20, 2023

runllm bot mentioned this issue May 15, 2024

[Core feature] pyflyte run should support a simple json/yaml as input for all parameters #5365

Open

2 tasks
