A Generic ML Command in PPL #971
Signed-off-by: Jing Zhang <[email protected]>
Signed-off-by: Jing Zhang <[email protected]>
Codecov Report
@@              Coverage Diff               @@
##                 2.x     #971       +/-   ##
=============================================
- Coverage    97.60%   62.76%   -34.85%
=============================================
  Files          308       10      -298
  Lines         7983      658     -7325
  Branches       520      119      -401
=============================================
- Hits          7792      413     -7379
- Misses         190      192        +2
- Partials         1       53       +52
Running git checkout sql-jdbc/docs/img/tableau_graph.PNG restores the deleted file. For reference, see #865 (comment).
@@ -411,6 +412,20 @@ public UnresolvedPlan visitAdCommand(AdCommandContext ctx) {
    return new AD(builder.build());
  }

  /**
   * Kmeans command.
Please correct this comment.
Good catch, will correct it to ml.
@@ -148,6 +148,14 @@ adParameter
    | (ANOMALY_SCORE_THRESHOLD EQUAL anomaly_score_threshold=decimalLiteral)
    ;

mlCommand
Does ml also accept no arguments, e.g. source = index | ml? The syntax actually allows this.
If no action or algorithm argument is given, it will throw an exception. I don't think we have to restrict the command content at the command parsing stage; we leave all parameter validation to ml-client.
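To illustrate the split described in this reply, here is a minimal sketch in which the parsing stage accepts any key=value pairs (including none) and a later stage rejects incomplete input. The class and method names (MlArgumentSketch, parseArguments, validate) are hypothetical, the tokenizer is a simplification rather than the PPL grammar, and treating both action and algorithm as required is an assumption; the real check lives in ml-client and may differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: permissive parsing, with validation deferred to a later stage. */
public class MlArgumentSketch {

  /** Parsing stage: collect key=value pairs without judging them; an empty tail is allowed. */
  public static Map<String, String> parseArguments(String commandTail) {
    Map<String, String> arguments = new LinkedHashMap<>();
    for (String token : commandTail.trim().split("\\s+")) {
      int eq = token.indexOf('=');
      if (eq > 0) { // skip empty tokens and tokens without a key
        arguments.put(token.substring(0, eq), token.substring(eq + 1));
      }
    }
    return arguments;
  }

  /** Later stage: stand-in for the check a client could perform (assumed rule, not the real one). */
  public static void validate(Map<String, String> arguments) {
    if (!arguments.containsKey("action") || !arguments.containsKey("algorithm")) {
      throw new IllegalArgumentException("ml command needs action and algorithm arguments");
    }
  }

  public static void main(String[] args) {
    // Parsing "ml" with no arguments succeeds; only validation rejects it.
    Map<String, String> none = parseArguments("");
    try {
      validate(none);
    } catch (IllegalArgumentException e) {
      System.out.println("rejected at validation: " + e.getMessage());
    }
    validate(parseArguments("action=train algorithm=kmeans"));
    System.out.println("accepted");
  }
}
```

The trade-off is that a malformed command is reported at execution time instead of at parse time, in exchange for keeping the grammar independent of the set of supported algorithms and parameters.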
@Getter
@ToString
@EqualsAndHashCode(callSuper = true)
public class LogicalML extends LogicalPlan {
Do you plan to deprecate Kmeans/AD in the logical plan?
Yes, we will deprecate them in a following version, such as 2.5, and then remove them in 3.0.
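One way such a staged removal could be marked in code is Java's standard @Deprecated annotation with its since and forRemoval elements (available since Java 9). This is only a sketch: LegacyKmeansCommand and GenericMlCommand are hypothetical stand-ins, not classes from this PR, and "2.5" is taken from the tentative version mentioned above.

```java
/** Hypothetical sketch of marking an older command for staged removal. */
public class DeprecationSketch {

  /** Stand-in for an algorithm-specific command that is planned to go away. */
  @Deprecated(since = "2.5", forRemoval = true)
  public static class LegacyKmeansCommand {
    public String name() {
      return "kmeans";
    }
  }

  /** Stand-in for the generic replacement command. */
  public static class GenericMlCommand {
    public String name() {
      return "ml";
    }
  }

  public static void main(String[] args) {
    // @Deprecated is retained at runtime, so tooling can inspect it reflectively.
    Deprecated marker = LegacyKmeansCommand.class.getAnnotation(Deprecated.class);
    System.out.println("since=" + marker.since() + ", forRemoval=" + marker.forRemoval());
    System.out.println("replacement=" + new GenericMlCommand().name());
  }
}
```

With forRemoval = true, the compiler emits a removal warning wherever the old class is still used, which gives callers notice before the class is deleted.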
public Map<String, ExprCoreType> getOutputSchema(TypeEnvironment env) {
  switch (getAction()) {
    case TRAIN:
      env.clearAllFields();
Why clear all fields?
The ml train command returns only the model/task ID and status, so all input fields are removed.
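The effect on the output schema can be sketched as follows. This is not the PR's getOutputSchema: the field names (model_id, task_id, status), the string type labels, and the PREDICT action are assumptions made for illustration, as is the choice to pass input fields through unchanged for actions other than TRAIN.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: a train action replaces the input schema with a few result fields. */
public class TrainSchemaSketch {

  /** TRAIN appears in the reviewed code; PREDICT is assumed here for contrast. */
  public enum Action { TRAIN, PREDICT }

  /** Returns the output schema for an action, given the input field names and types. */
  public static Map<String, String> outputSchema(Action action, Map<String, String> inputFields) {
    Map<String, String> schema = new LinkedHashMap<>(inputFields);
    switch (action) {
      case TRAIN:
        // Training yields identifiers and a status, not rows, so input fields are dropped.
        schema.clear();
        schema.put("model_id", "STRING"); // assumed field name
        schema.put("task_id", "STRING"); // assumed field name
        schema.put("status", "STRING"); // assumed field name
        break;
      default:
        // Assumption: other actions keep the input fields.
        break;
    }
    return schema;
  }

  public static void main(String[] args) {
    Map<String, String> input = new LinkedHashMap<>();
    input.put("age", "INTEGER");
    input.put("balance", "DOUBLE");
    System.out.println(outputSchema(Action.TRAIN, input).keySet());
    System.out.println(outputSchema(Action.PREDICT, input).keySet());
  }
}
```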
LogicalPlan actual = analyze(AstDSL.project(
    new ML(AstDSL.relation("schema"), argumentMap), AstDSL.allFields()));
assertTrue(((LogicalProject) actual).getProjectList().size() >= 2);
Why >= 2? Is the result non-deterministic?
It is deterministic; the project list just has at least 2 entries.
 * @param nodeClient node client
 * @return ml-commons result
 */
protected MLOutput getMLOutput(DataFrame inputDataFrame,
Could we also deprecate getMLPredictionResult if it is not required?
It is required for now; we can remove it when we remove the kmeans and ad commands.
 */
@RequiredArgsConstructor
@EqualsAndHashCode(callSuper = false)
public class MLOperator extends MLCommonsOperatorActions {
What is the difference from MLCommonsOperator? Will MLCommonsOperator be deprecated?
MLOperator is a totally new operator for all ML algorithms from ml-commons. I don't want to mix it with the old operator, as there is a big gap between them, and keeping them separate also makes it smoother to deprecate the old operator in the future.
Didn't realize it, will recover it.
Signed-off-by: Jing Zhang <[email protected]>
Thanks for the change!
Please add documentation for the ML command, and add a deprecation notice for the AD and Kmeans commands.
* Add generic ml command in ppl.
  Signed-off-by: Jing Zhang <[email protected]>
* Recover ml client dependency.
  Signed-off-by: Jing Zhang <[email protected]>
* Address the comments I.
  Signed-off-by: Jing Zhang <[email protected]>
Signed-off-by: Jing Zhang <[email protected]>
(cherry picked from commit c6b234c)
* Address the comments I.
  Signed-off-by: Jing Zhang <[email protected]>
Signed-off-by: Jing Zhang <[email protected]>
(cherry picked from commit c6b234c)
Co-authored-by: Jing Zhang <[email protected]>
Description
Adds a generic ml command to PPL for applying algorithms from ml-commons.
Issues Resolved
#849
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.