-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Spark-3.5: make where
sql case sensitive setting alterable in rewrite data files procedure
#11439
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ludlows can you please also add an UT for it to future proof it ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes looks reasonable to me as well, agree with @singhpk234 about UT
Hi @szehon-ho , |
I think I typed the wrong version of iceberg in the issue #11438 |
...ensions/src/test/java/org/apache/iceberg/spark/extensions/TestRewriteDataFilesProcedure.java
Outdated
Show resolved
Hide resolved
...ensions/src/test/java/org/apache/iceberg/spark/extensions/TestRewriteDataFilesProcedure.java
Show resolved
Hide resolved
@ludlows
|
hi @huaxingao , I think it is reasonable to apply the setting about the sql case sensitivity to all procedures, but we could take this pr as the starting point. |
...k/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteDataFilesSparkAction.java
Outdated
Show resolved
Hide resolved
@ludlows Thanks for the quick fix. Can we have a test that fails without the fix but passes with it? It seems that all your current tests pass even without the fix. |
@huaxingao I think the test method |
@ludlows I think you can simply reproduce the problem by something like
|
@huaxingao thanks for the comment. @TestTemplate
public void testFilterCaseSensitivityAfterChange() {
createTable();
insertData(10);
sql("set spark.sql.caseSensitive=false");
assertEquals(
"Should have done nothing but passed the schema validation, since no files are present",
ImmutableList.of(row(0, 0, 0L, 0)),
sql(
"CALL %s.system.rewrite_data_files(table=>'%s', where=>'C1 > 90000000')",
catalogName, tableIdent));
} the test case above has passed . |
@ludlows Thanks for the quick reply. I know my example will pass with the PR's fix. However, the problem will arise without the fix. We need a simple test that fails without the fix and passes with it. A straightforward test like my example should suffice, with minimal changes. My goal is to keep the test as simple as possible. |
...k/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteDataFilesSparkAction.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks @ludlows , and also @huaxingao, @anuragmantri @singhpk234 for reviews |
this pr aims to make the rewriteDataFile action is aware of the user settings about sql case sensitivity in the
where
statement.the implementation is simple.
we first obtain the case sensitive setting and save it as a variable in the constructor of rewriteDataFileAction.
then, we pass the variable to the tableScan .
related issue: #11438