[Core feature] Support Yaml / JSON input for pyflyte run #3371

Closed
2 tasks done
zeryx opened this issue Feb 24, 2023 · 0 comments
Labels
enhancement New feature or request flytekit FlyteKit Python related issue needs discussion


zeryx commented Feb 24, 2023

Motivation: Why do you think this is important?

Many ML workflows and training systems take a large number of hyperparameters when configuring training jobs, sometimes on the order of hundreds of separate variables needed to define a single training run.

Today there does not seem to be a way to provide file-based input for workflow execution. Users must either pass a URL pointing to a hosted data file as input, or manually break out each variable in their configuration and supply it through the CLI or the web UI. Neither is realistic or tenable for modern ML training projects.

Goal: What should the final outcome look like, ideally?

With pyflyte run, you should be able to provide a structured YAML or JSON input that defines the configuration of your workflow.

Ideally, all of those configuration settings would be matched against a user-defined hyperparameter class, rather than passed in piecemeal as individual parameters like the arguments of a conventional Python function.
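To make the idea of matching a config file against a hyperparameter class concrete, here is a minimal sketch using only the Python standard library. It is not flytekit's API: `TrainingConfig`, its field names, and `load_config` are hypothetical names chosen for illustration, and the sketch parses JSON only.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class TrainingConfig:
    """Hypothetical hyperparameter class; the field names are illustrative only."""
    learning_rate: float
    batch_size: int
    epochs: int = 10
    optimizer: str = "adam"


def load_config(text: str) -> TrainingConfig:
    """Parse a JSON document and match its keys against TrainingConfig."""
    raw = json.loads(text)
    if not isinstance(raw, dict):
        raise ValueError("top-level config must be a mapping")
    # Reject keys that the class does not declare, so typos fail loudly.
    known = {f.name for f in fields(TrainingConfig)}
    unknown = set(raw) - known
    if unknown:
        raise ValueError(f"unknown config keys: {sorted(unknown)}")
    # A missing required field raises TypeError here.
    # Note: dataclasses do not validate value types at runtime.
    return TrainingConfig(**raw)


config = load_config('{"learning_rate": 0.001, "batch_size": 64, "epochs": 20}')
print(config)
```

A YAML file could presumably be handled the same way by swapping `json.loads` for PyYAML's `yaml.safe_load`, which is a third-party dependency. The resulting object would then be passed to the workflow as a single structured input instead of hundreds of separate arguments.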

A user should also be able to register their workflow and then provide the YAML file to execute it via the UI, and also via a launch plan.

Describe alternatives you've considered

We do support setting some parameters as fixed in launch plans. However, that takes control and agency out of the users' hands over which elements are fixed and which are variable, and it forces users to generate potentially hundreds of separate launch plans, one for each individual training event.
Even when setting up launch plans, the user must manually hardcode all of the variables that are static between executions, which defeats much of the purpose here.

Propose: Link/Inline OR Additional context

https://pypi.org/project/yamldataclassconfig/

Are you sure this issue hasn't been raised already?

  • Yes

Have you read the Code of Conduct?

  • Yes
@zeryx zeryx added enhancement New feature or request untriaged This issues has not yet been looked at by the Maintainers labels Feb 24, 2023
@pingsutw pingsutw added flytekit FlyteKit Python related issue and removed untriaged This issues has not yet been looked at by the Maintainers labels Feb 24, 2023
@kumare3 kumare3 self-assigned this Apr 24, 2023
@kumare3 kumare3 closed this as completed May 20, 2023