[FEATURE] A Universal ML command in PPL for train/predict/trainandpredict #849

Closed
jngz-es opened this issue Sep 20, 2022 · 4 comments
Labels
feature PPL Piped processing language RFC Request For Comments v2.4.0 'Issues and PRs related to version v2.4.0'

Comments

@jngz-es
Contributor

jngz-es commented Sep 20, 2022

Is your feature request related to a problem?
Currently, for each new algorithm in ml-commons, we have to add a new command to PPL, which means implementing the entire PPL command pipeline: syntax parser, logical plan, and physical plan. This is very inefficient for development.
From a user interface perspective, it is also cleaner and more reasonable to have one command for all algorithms than a separate command for each algorithm.

What solution would you like?
We want to provide a single PPL command (ml) that covers train/predict/trainandpredict for all algorithms in ml-commons. When a new algorithm launches, we would then only need to make changes in ml-commons and would no longer need to touch the PPL plugin.

What alternatives have you considered?
We considered adding three commands, one each for train, predict, and trainandpredict, matching the ml-commons APIs.

Do you have any additional context?
We plan to keep the existing algorithm-specific PPL commands for now, but want to deprecate them in the future.

Example
ml action=train algo=kmeans centroids=3 iterations=2 distance_type='cosine'
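
To illustrate why a single command avoids per-algorithm parser work, here is a minimal Python sketch of how the example above could be split into an action, an algorithm name, and an open-ended parameter map. This is not the PPL plugin's implementation (the plugin is not written in Python); the function name `parse_ml_command`, the `ACTIONS` set, and the returned dictionary layout are all hypothetical and chosen only for illustration.

```python
import shlex

# Assumption: the three actions named in the issue title.
ACTIONS = {"train", "predict", "trainandpredict"}


def parse_ml_command(command: str) -> dict:
    """Split an `ml key=value ...` string into action, algo, and parameters.

    Hypothetical helper: it only shows that algorithm-specific arguments can
    be passed through as generic key/value pairs without new grammar rules.
    """
    tokens = shlex.split(command)  # also strips quotes, e.g. 'cosine' -> cosine
    if not tokens or tokens[0] != "ml":
        raise ValueError("expected a command starting with 'ml'")

    args = {}
    for token in tokens[1:]:
        key, sep, value = token.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {token!r}")
        args[key] = value

    action = args.pop("action", None)
    if action not in ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")

    # Everything left over is forwarded untouched as algorithm parameters.
    return {"action": action, "algo": args.pop("algo", None), "params": args}


if __name__ == "__main__":
    print(parse_ml_command(
        "ml action=train algo=kmeans centroids=3 iterations=2 distance_type='cosine'"
    ))
```

In this sketch, a new algorithm with new parameters needs no parser change: its arguments simply land in `params`, and validating them would be left to the ML side.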

@jngz-es jngz-es added enhancement New feature or request untriaged labels Sep 20, 2022
@dai-chen dai-chen added PPL Piped processing language RFC Request For Comments and removed untriaged labels Sep 20, 2022
@dai-chen
Collaborator

@jngz-es I've marked this as RFC. If needed, you can put some query examples for community feedback. Thanks!

@jngz-es
Contributor Author

jngz-es commented Sep 20, 2022

> @jngz-es I've marked this as RFC. If needed, you can put some query examples for community feedback. Thanks!

Added an example query, thanks!

@ahopp

ahopp commented Oct 26, 2022

Looks like I'm very late on this, but I think it's important to provide justification for why this feature is being developed (or has been developed) in PPL without parity in SQL. I assume you all chose PPL because it was easier and/or because you needed it downstream (e.g., PPL in observability), but I think it's important to share this justification with the community given the relative adoption of SQL versus PPL. I realize PPL is used heavily in the observability plugin experience, and if that's the justification, we should be clear about it.

@dai-chen dai-chen added the v2.4.0 'Issues and PRs related to version v2.4.0' label Oct 31, 2022
@anirudha anirudha added v2.4.0 'Issues and PRs related to version v2.4.0' and removed v2.4.0 'Issues and PRs related to version v2.4.0' labels Nov 2, 2022
@dai-chen
Collaborator

dai-chen commented Nov 7, 2022

I assume we can close this. If there is anything else, please open a new issue labeled with a future release version. Thanks!

@dai-chen dai-chen closed this as completed Nov 7, 2022
@dai-chen dai-chen added feature and removed enhancement New feature or request labels Nov 7, 2022

4 participants