Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix string comparisons. #8253

Merged
merged 3 commits into from
Feb 28, 2022
Merged

Conversation

amrishlal
Copy link
Contributor

Description

A query such as SELECT count(*) FROM mytable WHERE trim(OriginState) = DestState currently throws exception, because internally the predicate gets rewritten to MINUS(trim(OriginState), DestState) = 0 and MINUS function fails to evaluate STRING arguments. This PR modifies StringPredicateFilterOptizer.java to properly rewrite predicate to use STRCMP instead of MINUS (STRCMP(trim(OriginState), DestState) = 0).

Note that the same issue was fixed for string columns in #7394. This PR is extending the original fix to include string functions as well.

Upgrade Notes

Does this PR prevent a zero down-time upgrade? (Assume upgrade order: Controller, Broker, Server, Minion)

  • Yes (Please label as backward-incompat, and complete the section below on Release Notes)

Does this PR fix a zero-downtime upgrade introduced earlier?

  • Yes (Please label this as backward-incompat, and complete the section below on Release Notes)

Does this PR otherwise need attention when creating release notes? Things to consider:

  • New configuration options
  • Deprecation of configurations
  • Signature changes to public methods/interfaces
  • New plugins added or old plugins removed
  • Yes (Please label this PR as release-notes and complete the section on Release Notes)

Release Notes

Documentation

@codecov-commenter
Copy link

codecov-commenter commented Feb 27, 2022

Codecov Report

Merging #8253 (16f8871) into master (0ac5f1f) will increase coverage by 0.00%.
The diff coverage is 75.00%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master    #8253   +/-   ##
=========================================
  Coverage     70.69%   70.69%           
- Complexity     4238     4242    +4     
=========================================
  Files          1629     1631    +2     
  Lines         85215    85290   +75     
  Branches      12830    12850   +20     
=========================================
+ Hits          60243    60298   +55     
- Misses        20814    20832   +18     
- Partials       4158     4160    +2     
Flag Coverage Δ
integration1 28.81% <75.00%> (+0.06%) ⬆️
integration2 27.56% <75.00%> (+0.05%) ⬆️
unittests1 66.99% <68.75%> (+0.04%) ⬆️
unittests2 14.09% <0.00%> (-0.06%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...izer/statement/StringPredicateFilterOptimizer.java 91.48% <75.00%> (-0.62%) ⬇️
...pinot/core/query/request/context/TimerContext.java 91.66% <0.00%> (-4.17%) ⬇️
...core/query/executor/ServerQueryExecutorV1Impl.java 83.02% <0.00%> (-4.13%) ⬇️
...verter/stats/MutableNoDictionaryColStatistics.java 40.90% <0.00%> (-4.10%) ⬇️
...gregation/function/StUnionAggregationFunction.java 73.52% <0.00%> (-2.95%) ⬇️
.../loader/defaultcolumn/DefaultColumnStatistics.java 73.07% <0.00%> (-2.93%) ⬇️
...ava/org/apache/pinot/core/plan/FilterPlanNode.java 87.85% <0.00%> (-1.87%) ⬇️
...controller/helix/core/minion/PinotTaskManager.java 67.51% <0.00%> (-1.83%) ⬇️
...ltime/converter/stats/MutableColumnStatistics.java 53.22% <0.00%> (-1.78%) ⬇️
...apache/pinot/controller/api/upload/ZKOperator.java 74.80% <0.00%> (-1.58%) ⬇️
... and 33 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0ac5f1f...16f8871. Read the comment docs.

Comment on lines 130 to 138
private static Set<String> getStringOutputFunctionList() {
Set<String> set = new HashSet<>();
Method[] methods = StringFunctions.class.getDeclaredMethods();
for (Method method : methods) {
if (method.getReturnType() == String.class) {
set.add(method.getName().toUpperCase());
}
}
return set;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ScalarFunction now has a names array parameter - so you need to read the names from the annotation to do this correctly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't using @ScalarFunction return names of all scalar functions and how would the output type of the function be resolved? We just want String scalar functions and that too only the once that return a string as output. All the string scalar functions are declared in StringFunctions and hence this code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, allow me to clarify:

    Set<String> set = new HashSet<>();
    Method[] methods = StringFunctions.class.getDeclaredMethods();
    for (Method method : methods) {
      if (method.getReturnType() == String.class) {
        if (method.isAnnotationPresent(ScalarFunction.class)) {
           ScalarFunction annotation = method.getAnnotation(ScalarFunction.class);
           for (String name : annotation.names()) {
             set.add(name.toUpperCase());
           }   
        }
        set.add(method.getName().toUpperCase());
      }
    }
    return set;

This way your logic works even if there is an alias in use. Does that make sense?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I see. Thanks for clarifying.

Copy link
Member

@richardstartin richardstartin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove the binary file added by accident.

@amrishlal
Copy link
Contributor Author

Please remove the binary file added by accident.

Done

@amrishlal amrishlal closed this Feb 28, 2022
@amrishlal amrishlal reopened this Feb 28, 2022
Copy link
Member

@richardstartin richardstartin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏻

@siddharthteotia siddharthteotia merged commit c77c2c3 into apache:master Feb 28, 2022
@amrishlal amrishlal deleted the fix-string-comparison branch March 31, 2022 17:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants