Scala 2.13 Support #8592

Merged: 130 commits (Oct 27, 2023)

Commits
77e83c3
Preliminary Scala 2.13 build support, need to fix compilation errors
NVnavkumar May 19, 2023
7c9a930
Handle 2 simple updates with scalastyle checks
NVnavkumar Jun 12, 2023
5b41add
Update due to requirement of not importing scala.collection.Seq directly
NVnavkumar Jun 12, 2023
02ef278
One round of code updates for Scala 2.13
NVnavkumar Jun 21, 2023
8adb72e
Merge branch 'branch-23.10' into scala-213
NVnavkumar Sep 5, 2023
ef6a97d
Style fixes.
NVnavkumar Sep 5, 2023
a359bec
Update RAPIDS shuffle manager code to parallel use of scala.collectio…
NVnavkumar Sep 6, 2023
65e3601
Handle usage of Seq in GpuBindReferences
NVnavkumar Sep 6, 2023
bc5d337
Sub partition hash join updates. Follow the recommended scala pattern…
NVnavkumar Sep 6, 2023
ce9054e
Handle ArrayStack usage in GpuSortExec and SortUtils
NVnavkumar Sep 6, 2023
272725e
Regex parser fixes.
NVnavkumar Sep 6, 2023
cd7a132
Parquet Scan Seq usage.
NVnavkumar Sep 6, 2023
ac75ac6
Orc scan Seq usage.
NVnavkumar Sep 6, 2023
c5f7bff
Multi file reader Seq usage.
NVnavkumar Sep 6, 2023
4b69ec9
Hive CTAS command fixes
NVnavkumar Sep 6, 2023
4714ca6
miscellaneous scan usage of Seq
NVnavkumar Sep 6, 2023
85c0af9
Fix procedure syntax.
NVnavkumar Sep 6, 2023
144e442
Add procedure declaration checker since it's fully deprecated in Scal…
NVnavkumar Sep 7, 2023
a38a91a
Add scala versions here based on Spark
NVnavkumar Sep 7, 2023
a9bd68e
Merge branch 'branch-23.10' into scala-213
NVnavkumar Sep 8, 2023
8db9a16
Move these files to scala 2.12 only compilation
NVnavkumar Sep 8, 2023
69d7b44
Restore these to the original scala 2.12 compatible versions
NVnavkumar Sep 8, 2023
fec9f0c
Merge updates from upstream sort changes.
NVnavkumar Sep 8, 2023
abf994f
Forking Arm because of collection hierarchy handling
NVnavkumar Sep 11, 2023
577e5a4
Updated forked versions for Scala 2.13
NVnavkumar Sep 12, 2023
130e52c
Updated toSeq usage for Scala 2.13
NVnavkumar Sep 12, 2023
bf24baa
Handle forked scala-2.12 and scala-2.13 directories for build
NVnavkumar Sep 12, 2023
220e0d3
For Scala 2.13 need to convert Set to Seq when using .map(). This is …
NVnavkumar Sep 12, 2023
742e043
Need to fork aggregate.scala because of Product with Serialize supert…
NVnavkumar Sep 12, 2023
6d50c1e
Handle failed implicit SAM conversion for Scala 2.13
NVnavkumar Sep 19, 2023
d993ec9
rewrite to a pure lambda to make it compile in 2.13
NVnavkumar Sep 20, 2023
22e0354
Fix 3 more simple compile errors
NVnavkumar Sep 20, 2023
cfb4535
Add public override of interface method (in this case the default ove…
NVnavkumar Sep 20, 2023
7e2cde2
The enqueue method with a single varargs list was removed in Scala 2.…
NVnavkumar Sep 20, 2023
8c07562
Remove unneeded condition for Scala 2.13
NVnavkumar Sep 21, 2023
ef58914
This needs to take in scala.collection.Seq
NVnavkumar Sep 21, 2023
3d95050
Add @SuppressWarnings annotation for Scala 2.13 support
NVnavkumar Sep 21, 2023
c3467b7
Fix Scala 2.13 compiler errors in shuffle-plugin
NVnavkumar Sep 21, 2023
df0d646
Merge branch 'branch-23.10' of github.com:NVIDIA/spark-rapids into sc…
NVnavkumar Sep 21, 2023
7f2fe5d
Fix Scala 2.13 compiler issues in udf-compiler
NVnavkumar Sep 21, 2023
e4868bb
Remove duplicate sections
NVnavkumar Sep 21, 2023
b041889
Fix delta-lake and datagen compile issues.
NVnavkumar Sep 21, 2023
4aecd9a
Update scalatest files to compile with Scala 2.13
NVnavkumar Sep 21, 2023
aecfa13
Add all actual Scala 2.13 build versions.
NVnavkumar Sep 22, 2023
8b794ae
Shimming this part of RAPIDS shuffle to handle Spark 3.4+ change tha…
NVnavkumar Sep 22, 2023
66162df
Merge branch 'branch-23.10' of github.com:NVIDIA/spark-rapids into sc…
NVnavkumar Sep 22, 2023
282f8d3
Merge branch 'branch-23.10' of github.com:NVIDIA/spark-rapids into sc…
NVnavkumar Sep 25, 2023
2a6b6c6
Merge branch 'branch-23.12' of github.com:NVIDIA/spark-rapids into sc…
NVnavkumar Sep 25, 2023
ef7e49b
Merge branch 'branch-23.12' of github.com:NVIDIA/spark-rapids into sc…
NVnavkumar Sep 26, 2023
5bd1619
Fork this code to handle differences in ArrayBuffer with Scala 2.13
NVnavkumar Sep 27, 2023
33d3370
Add Comment to clarify script is copied from Apache Spark
NVnavkumar Sep 27, 2023
9363438
Change cast to Seq to using toSeq method
NVnavkumar Sep 27, 2023
7123930
Handle Map properly for Scala 2.13
NVnavkumar Sep 28, 2023
581f022
Fix NPE caused by Scala 2.13 Seq changes.
NVnavkumar Sep 28, 2023
b1ec78a
Merge branch 'branch-23.12' of github.com:NVIDIA/spark-rapids into sc…
NVnavkumar Sep 29, 2023
2c62f49
Fix some incorrect versions in pom.xml
NVnavkumar Oct 3, 2023
5b7fe19
Build script updates
NVnavkumar Oct 4, 2023
5fa0c4c
Skip decimal tests in arithmetic_ops_test that fail on Scala 2.13 CPU
NVnavkumar Oct 5, 2023
6d2021c
Merge branch 'branch-23.12' of github.com:NVIDIA/spark-rapids into sc…
NVnavkumar Oct 5, 2023
67e52e9
Include Scala213 build profiles in dist pom
NVnavkumar Oct 5, 2023
4c4cc23
Revert this change
NVnavkumar Oct 6, 2023
0973c52
Remove commented out code
NVnavkumar Oct 6, 2023
2499e45
Remove commented out code
NVnavkumar Oct 6, 2023
e2901c0
Address feedback, refactor Arm, and add reference to SPARK JIRA in in…
NVnavkumar Oct 10, 2023
8d42334
Refactor aggregate.scala for 2.12/2.13 difference
NVnavkumar Oct 10, 2023
4c0e448
Refactor GpuSortExec for 2.12/2.13
NVnavkumar Oct 11, 2023
870d19d
Revert 2.13 pom changes
NVnavkumar Oct 11, 2023
f8866f9
Refactor GpuSorter out of SortUtils.scala to handle 2.13/2.12 differe…
NVnavkumar Oct 11, 2023
85c8fa4
Merge branch 'branch-23.12' of github.com:NVIDIA/spark-rapids into sc…
NVnavkumar Oct 13, 2023
6aab915
Merge branch 'branch-23.12' of github.com:NVIDIA/spark-rapids into sc…
NVnavkumar Oct 16, 2023
06a558f
Fix scalastyle issue with importing SeqLike directly
NVnavkumar Oct 16, 2023
46c414d
Fix some style issues.
NVnavkumar Oct 17, 2023
d49619e
Scala 2.13 build using separate directory
NVnavkumar Oct 17, 2023
5c6f672
Remove the change-scala-version.sh script.
NVnavkumar Oct 17, 2023
44669c9
Support delta lake source import for Scala 2.13
NVnavkumar Oct 17, 2023
3cec238
Merge branch 'branch-23.12' of github.com:NVIDIA/spark-rapids into sc…
NVnavkumar Oct 17, 2023
9f41a3a
Cleanup Delta Lake source import for Scala 2.13
NVnavkumar Oct 17, 2023
a3ef04d
Fix some pom files for scala 2.13
NVnavkumar Oct 17, 2023
919ac68
Make build/buildall work with Scala 2.13 to make proper plugin JAR
NVnavkumar Oct 17, 2023
58a9632
Scala 2.13 documentation
NVnavkumar Oct 18, 2023
865d6b8
Add required profile for scala 2.13 versions.
NVnavkumar Oct 18, 2023
95171ec
Adding pre-merge GitHub workflow for Scala 2.13
NVnavkumar Oct 18, 2023
53cc51c
this shouldn't skip the first version
NVnavkumar Oct 18, 2023
9bcb289
Fix Scala version on artifactId on these poms
NVnavkumar Oct 18, 2023
0b93bba
Fix scala versions in github pre-merge hook
NVnavkumar Oct 18, 2023
7e34de0
Merge branch 'branch-23.12' of github.com:NVIDIA/spark-rapids into sc…
NVnavkumar Oct 18, 2023
1ae14cb
Shim update for CDH 3.3.2
NVnavkumar Oct 18, 2023
924fe93
Another CDH 3.3.2 shim fix
NVnavkumar Oct 18, 2023
4319799
Fix bug in profile name
NVnavkumar Oct 18, 2023
fd3407e
Shim update for CDH 3.3.2 in tests
NVnavkumar Oct 18, 2023
fbe7838
Merge branch 'branch-23.12' of github.com:NVIDIA/spark-rapids into sc…
NVnavkumar Oct 19, 2023
6c1c724
Fix premerge line for github scala build files check
NVnavkumar Oct 19, 2023
83680b3
Update clean task to avoid 2.12 pom cleaning out Scala 2.13 files
NVnavkumar Oct 19, 2023
56accae
Handle integration tests with resources
NVnavkumar Oct 20, 2023
f467272
Add premerge tests for Scala 2.13
NVnavkumar Oct 20, 2023
f17743a
Merge branch 'branch-23.12' of github.com:NVIDIA/spark-rapids into sc…
NVnavkumar Oct 20, 2023
9c90e17
Fix premerge script syntax error
NVnavkumar Oct 20, 2023
6d671bd
Fix premerge script syntax error 2
NVnavkumar Oct 20, 2023
1f3c5ad
Merge branch 'branch-23.12' of github.com:NVIDIA/spark-rapids into sc…
NVnavkumar Oct 20, 2023
1457ad9
Fix for Databricks refactors that were not merged in properly before
NVnavkumar Oct 20, 2023
fcde3eb
Merge branch 'scala-213' of github.com:NVnavkumar/spark-rapids into s…
NVnavkumar Oct 20, 2023
046e9b5
Add log message for Scala version
NVnavkumar Oct 20, 2023
7011a5a
Fix premerge profile for Scala 2.13
NVnavkumar Oct 23, 2023
6279a06
Fix Jenkins MVN_URM_MIRROR
NVnavkumar Oct 23, 2023
0892562
Need these to match up properly
NVnavkumar Oct 23, 2023
82456ae
Revert this change
NVnavkumar Oct 23, 2023
096d775
Create symlink when handling relative jenkins path
NVnavkumar Oct 23, 2023
ab25ace
Fix symlink command
NVnavkumar Oct 23, 2023
835e29b
Merge branch 'branch-23.12' of github.com:NVIDIA/spark-rapids into sc…
NVnavkumar Oct 23, 2023
be09252
Merge branch 'branch-23.12' of github.com:NVIDIA/spark-rapids into sc…
NVnavkumar Oct 23, 2023
b8b6463
Fix toSeq issue in GpuSemaphore
NVnavkumar Oct 23, 2023
6277529
Merge branch 'branch-23.12' of github.com:NVIDIA/spark-rapids into sc…
NVnavkumar Oct 24, 2023
e7301f6
Update noSnapshots profile in Scala2.13 for CI purposes.
NVnavkumar Oct 24, 2023
4c359e7
Update premerge to run Integration tests for Scala 2.13
NVnavkumar Oct 24, 2023
c138485
Merge branch 'branch-23.12' of github.com:NVIDIA/spark-rapids into sc…
NVnavkumar Oct 24, 2023
92dbb00
More pom file updates from upstream
NVnavkumar Oct 24, 2023
f52907c
Update premerge build script to separate out 2.13 tasks.
NVnavkumar Oct 25, 2023
9fe0e49
Fixes for integration tests based on feedback
NVnavkumar Oct 25, 2023
db33a80
Merge branch 'branch-23.12' of github.com:NVIDIA/spark-rapids into sc…
NVnavkumar Oct 25, 2023
ffb45ba
Updates for pre-merge CI from Tim
NVnavkumar Oct 25, 2023
4ed4815
signing off
NVnavkumar Oct 25, 2023
c88320b
Remove checking scala2.13 folder
NvTimLiu Oct 26, 2023
5db5520
fix typo
NvTimLiu Oct 26, 2023
eb2dbd3
Refactor GpuSorter further to keep it in SortUtils and just shim out …
NVnavkumar Oct 26, 2023
b375ba3
remove commented out code and add comment for clarification
NVnavkumar Oct 26, 2023
e6409c8
Fix issue with copyrights
NVnavkumar Oct 26, 2023
b1e774c
use asInstanceOf[...] to resolve this Scala 2.13/2.12 discrepancy
NVnavkumar Oct 26, 2023
bbb6750
Fix newline end of file
NVnavkumar Oct 26, 2023
4a41342
Update documentation
NVnavkumar Oct 26, 2023
e0ff07c
Update documentation 2
NVnavkumar Oct 26, 2023
Files changed
65 changes: 65 additions & 0 deletions .github/workflows/mvn-verify-check.yml
@@ -36,6 +36,7 @@ jobs:
defaultSparkVersion: ${{ steps.allShimVersionsStep.outputs.defaultSparkVersion }}
sparkTailVersions: ${{ steps.allShimVersionsStep.outputs.tailVersions }}
sparkJDKVersions: ${{ steps.allShimVersionsStep.outputs.jdkVersions }}
scala213Versions: ${{ steps.allShimVersionsStep.outputs.scala213Versions }}
steps:
- uses: actions/checkout@v3 # refs/pull/:prNumber/merge

@@ -78,6 +79,23 @@
jdkVersionJsonStr=$(printf {\"include\":[%s]} $jdkVersionArrBody)
echo "jdkVersions=$jdkVersionJsonStr" >> $GITHUB_OUTPUT

SCALA_BINARY_VER=2.13
. jenkins/version-def.sh
svArrBodyNoSnapshot=$(printf ",{\"spark-version\":\"%s\",\"isSnapshot\":false}" "${SPARK_SHIM_VERSIONS_NOSNAPSHOTS[@]}")
svArrBodyNoSnapshot=${svArrBodyNoSnapshot:1}
# get private artifact version
privateVer=$(mvn help:evaluate -q -pl dist -Dexpression=spark-rapids-private.version -DforceStdout)
Review comment from NvTimLiu (Collaborator), Oct 23, 2023:
Should we get the private artifact version under the scala2.13 dir, though the version is the same?

# only add snapshot versions when some exist and the private version is a SNAPSHOT (a released private artifact does not include snapshot shims)
if [[ ${#SPARK_SHIM_VERSIONS_SNAPSHOTS_ONLY[@]} -gt 0 && $privateVer == *"-SNAPSHOT" ]]; then
svArrBodySnapshot=$(printf ",{\"spark-version\":\"%s\",\"isSnapshot\":true}" "${SPARK_SHIM_VERSIONS_SNAPSHOTS_ONLY[@]}")
svArrBodySnapshot=${svArrBodySnapshot:1}
svJsonStr=$(printf {\"include\":[%s]} $svArrBodyNoSnapshot,$svArrBodySnapshot)
else
svJsonStr=$(printf {\"include\":[%s]} $svArrBodyNoSnapshot)
fi

echo "scala213Versions=$svJsonStr" >> $GITHUB_OUTPUT

package-tests:
needs: get-shim-versions-from-dist
continue-on-error: ${{ matrix.isSnapshot }}
@@ -115,6 +133,53 @@
-Drat.skip=true \
${{ env.COMMON_MVN_FLAGS }}

package-tests-scala213:
needs: get-shim-versions-from-dist
continue-on-error: ${{ matrix.isSnapshot }}
strategy:
matrix: ${{ fromJSON(needs.get-shim-versions-from-dist.outputs.scala213Versions) }}
fail-fast: false
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3 # refs/pull/:prNumber/merge

- name: Setup Java and Maven Env
uses: actions/setup-java@v3
with:
distribution: adopt
java-version: 8

- name: check runtime before tests
run: |
env | grep JAVA
java -version && mvn --version && echo "ENV JAVA_HOME: $JAVA_HOME, PATH: $PATH"

- name: package tests check
run: |
# https://github.com/NVIDIA/spark-rapids/issues/8847
# specify expected versions
export JAVA_HOME=${JAVA_HOME_8_X64}
export PATH=${JAVA_HOME}/bin:${PATH}
java -version && mvn --version && echo "ENV JAVA_HOME: $JAVA_HOME, PATH: $PATH"
# verify Scala 2.13 build files
./build/make-scala-version-build-files.sh 2.13
# verify git status
if ! git diff --exit-code 'scala2.13/*'; then
echo "Generated Scala 2.13 build files don't match what's in repository"
exit 1
fi
# change to Scala 2.13 Directory
cd scala2.13
# test command
mvn -Dmaven.wagon.http.retryHandler.count=3 -B package \
-pl integration_tests,tests -am \
-P 'individual,pre-merge' \
-Dbuildver=${{ matrix.spark-version }} \
-Dmaven.scalastyle.skip=true \
-Drat.skip=true \
${{ env.COMMON_MVN_FLAGS }}


verify-all-modules:
needs: get-shim-versions-from-dist
runs-on: ubuntu-latest
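For clarity, here is a minimal sketch of what the printf-based matrix construction above emits. The shim versions below are made up for illustration; the real list is populated by `jenkins/version-def.sh`:

```shell
# Hypothetical shim versions, for illustration only
SPARK_SHIM_VERSIONS_NOSNAPSHOTS=(330 331 332)

# printf reuses its format string once per array element
svArrBody=$(printf ",{\"spark-version\":\"%s\",\"isSnapshot\":false}" "${SPARK_SHIM_VERSIONS_NOSNAPSHOTS[@]}")
svArrBody=${svArrBody:1}   # drop the leading comma
svJsonStr=$(printf {\"include\":[%s]} $svArrBody)
echo "$svJsonStr"
# {"include":[{"spark-version":"330","isSnapshot":false},{"spark-version":"331","isSnapshot":false},{"spark-version":"332","isSnapshot":false}]}
```

This JSON is what `fromJSON(...)` consumes in the `package-tests-scala213` job's `strategy.matrix`.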
17 changes: 17 additions & 0 deletions CONTRIBUTING.md
@@ -60,6 +60,23 @@ You can find all available build versions in the top level pom.xml file. If you
for Databricks then you should use the `jenkins/databricks/build.sh` script and modify it for
the version you want.

Note that we build against both Scala 2.12 and 2.13. Any contribution you make to the
codebase should compile with both Scala 2.12 and 2.13 for Apache Spark versions 3.3.0 and
higher.

Also, if you make changes to the parent `pom.xml` or to any of the module `pom.xml`
files, you must run the following command to sync the changes between the Scala 2.12 and
2.13 pom files:

```shell script
./build/make-scala-version-build-files.sh 2.13
```

That way any new dependencies or other changes will also be picked up in the Scala 2.13 build.

See the [scala2.13](scala2.13) directory for more information on how to build against
Scala 2.13.
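
For reference, a minimal end-to-end sketch of the Scala 2.13 workflow described above; the `-Dbuildver` value is only an example, and CI passes additional flags:

```shell
# 1. After editing any pom.xml, regenerate the Scala 2.13 build files
./build/make-scala-version-build-files.sh 2.13

# 2. Build from the generated Scala 2.13 tree against one shim version
cd scala2.13
mvn package -Dbuildver=330
```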

To get an uber jar with more than one version, you have to `mvn package` each version
and then use one of the defined profiles in the dist module, or a comma-separated list of
build versions. See the next section for more details.
9 changes: 8 additions & 1 deletion aggregator/pom.xml
@@ -21,15 +21,17 @@

<parent>
<groupId>com.nvidia</groupId>
<artifactId>rapids-4-spark-parent</artifactId>
<artifactId>rapids-4-spark-parent_2.12</artifactId>
<version>23.12.0-SNAPSHOT</version>
<relativePath>../pom.xml</relativePath>
Review comment (Collaborator):
nit/question: is relativePath necessary when we already differentiate parent artifactId _2.12 and _2.13?
</parent>
<artifactId>rapids-4-spark-aggregator_2.12</artifactId>
<name>RAPIDS Accelerator for Apache Spark Aggregator</name>
<description>Creates an aggregated shaded package of the RAPIDS plugin for Apache Spark</description>
<version>23.12.0-SNAPSHOT</version>

<properties>
<rapids.module>aggregator</rapids.module>
<!--
we store ASM-relocated packages in /spark3xx parallel worlds in dist
and they are auto-deduped using binary diff
@@ -172,7 +174,9 @@
<profile>
<id>release311</id>
<activation>
<!-- #if scala-2.12 -->
<activeByDefault>true</activeByDefault>
<!-- #endif scala-2.12 -->
<property>
<name>buildver</name>
<value>311</value>
@@ -360,6 +364,9 @@
<profile>
<id>release330</id>
<activation>
<!-- #if scala-2.13 --><!--
<activeByDefault>true</activeByDefault>
--><!-- #endif scala-2.13 -->
<property>
<name>buildver</name>
<value>330</value>
9 changes: 7 additions & 2 deletions api_validation/pom.xml
@@ -21,12 +21,17 @@

<parent>
<groupId>com.nvidia</groupId>
<artifactId>rapids-4-spark-parent</artifactId>
<artifactId>rapids-4-spark-parent_2.12</artifactId>
<version>23.12.0-SNAPSHOT</version>
<relativePath>../pom.xml</relativePath>
</parent>
<artifactId>rapids-4-spark-api-validation</artifactId>
<artifactId>rapids-4-spark-api-validation_2.12</artifactId>
<version>23.12.0-SNAPSHOT</version>

<properties>
<rapids.module>api_validation</rapids.module>
</properties>

<profiles>
<profile>
<id>default</id>
38 changes: 33 additions & 5 deletions build/buildall
@@ -21,6 +21,7 @@ shopt -s extglob

SKIP_CLEAN=1
BUILD_ALL_DEBUG=0
SCALA213=0

function print_usage() {
echo "Usage: buildall [OPTION]"
@@ -35,10 +36,10 @@ function print_usage() {
echo " generate projects for Bloop clients: IDE (Scala Metals, IntelliJ) or Bloop CLI"
echo " -p=DIST_PROFILE, --profile=DIST_PROFILE"
echo " use this profile for the dist module, default: noSnapshots, also supported: snapshots, minimumFeatureVersionMix,"
echo " snapshotsWithDatabricks, and noSnapshotsWithDatabricks. NOTE: the Databricks-related spark3XYdb shims"
echo " are not built locally, the jars are fetched prebuilt from a remote Maven repo."
echo " You can also supply a comma-separated list of build versions. E.g., --profile=320,330 will build only"
echo " the distribution jar only for 3.2.0 and 3.3.0"
echo " snapshotsWithDatabricks, noSnapshotsWithDatabricks, noSnapshotsScala213, snapshotsScala213."
echo " NOTE: the Databricks-related spark3XYdb shims are not built locally, the jars are fetched prebuilt from a"
echo " . remote Maven repo. You can also supply a comma-separated list of build versions. E.g., --profile=320,330 will"
echo " build only the distribution jar only for 3.2.0 and 3.3.0"
echo " -m=MODULE, --module=MODULE"
echo " after finishing parallel builds, resume from dist and build up to and including module MODULE."
echo " E.g., --module=integration_tests"
@@ -135,6 +136,10 @@ case "$1" in
FINAL_OP="install"
;;

--scala213)
SCALA213=1
;;

--rebuild-dist-only)
SKIP_DIST_DEPS="1"
MODULE="dist"
@@ -157,14 +162,33 @@

done

if [[ "$DIST_PROFILE" == *Scala213 ]]; then
SCALA213=1
fi


# include options to mvn command
export MVN="mvn -Dmaven.wagon.http.retryHandler.count=3 ${MVN_OPT}"

DIST_PROFILE=${DIST_PROFILE:-"noSnapshots"}
if [[ "$SCALA213" == "1" ]]; then
DIST_PROFILE=${DIST_PROFILE:-"noSnapshotsScala213"}
$(dirname $0)/make-scala-version-build-files.sh 2.13
else
DIST_PROFILE=${DIST_PROFILE:-"noSnapshots"}
fi

[[ "$MODULE" != "" ]] && MODULE_OPT="--projects $MODULE --also-make" || MODULE_OPT=""

case $DIST_PROFILE in

snapshotsScala213)
SPARK_SHIM_VERSIONS=($(versionsFromDistProfile "snapshotsScala213"))
;;

noSnapshotsScala213)
SPARK_SHIM_VERSIONS=($(versionsFromDistProfile "noSnapshotsScala213"))
;;

snapshots?(WithDatabricks))
SPARK_SHIM_VERSIONS=($(versionsFromDistProfile "snapshots"))
;;
@@ -209,6 +233,10 @@ if [[ "$SKIP_CLEAN" != "1" ]]; then
$MVN -q clean
fi

if [[ "$SCALA213" == "1" ]]; then
cd scala2.13
fi

echo "Building a combined dist jar with Shims for ${SPARK_SHIM_VERSIONS[@]} ..."

function build_single_shim() {
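Putting the new options together, a usage sketch: either invocation below should take the Scala 2.13 path (`SCALA213=1`), regenerate the 2.13 build files, and build the combined dist jar from the `scala2.13` directory:

```shell
# Explicit Scala 2.13 dist profile
./build/buildall --profile=noSnapshotsScala213

# Flag form; DIST_PROFILE then defaults to noSnapshotsScala213
./build/buildall --scala213
```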
97 changes: 97 additions & 0 deletions build/make-scala-version-build-files.sh
@@ -0,0 +1,97 @@
#!/usr/bin/env bash
#
# Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#


set -e

VALID_VERSIONS=( 2.13 )
declare -A DEFAULT_SPARK
DEFAULT_SPARK[2.12]="spark311"
DEFAULT_SPARK[2.13]="spark330"

usage() {
echo "Usage: $(basename $0) [-h|--help] <version>
where :
-h| --help Display this help text
valid version values : ${VALID_VERSIONS[*]}
" 1>&2
exit 1
}

if [[ ($# -ne 1) || ( $1 == "--help") || $1 == "-h" ]]; then
usage
fi

TO_VERSION=$1

check_scala_version() {
for i in ${VALID_VERSIONS[*]}; do [ $i = "$1" ] && return 0; done
echo "Invalid Scala version: $1. Valid versions: ${VALID_VERSIONS[*]}" 1>&2
exit 1
}

check_scala_version "$TO_VERSION"

sed_i() {
sed -e "$1" "$2" > "$2.tmp" && mv "$2.tmp" "$2"
}

BASEDIR=$(dirname $0)/..
if [ $TO_VERSION = "2.13" ]; then
FROM_VERSION="2.12"
SUBDIR="scala2.13"
fi

TO_DIR="$BASEDIR/$SUBDIR"
mkdir -p $TO_DIR

for f in $(git ls-files '**pom.xml'); do
if [[ $f == $SUBDIR* ]]; then
echo "Skipping $f"
continue
fi
echo $f
tof="$TO_DIR/$f"
mkdir -p $(dirname $tof)
cp -f $f $tof
echo $tof
sed_i 's/\(artifactId.*\)_'$FROM_VERSION'/\1_'$TO_VERSION'/g' $tof
sed_i 's/^\([[:space:]]*<!-- #if scala-'$TO_VERSION' -->\)<!--/\1/' $tof
sed_i 's/^\([[:space:]]*\)-->\(<!-- #endif scala-'$TO_VERSION' -->\)/\1\2/' $tof
sed_i 's/^\([[:space:]]*<!-- #if scala-'$FROM_VERSION' -->\)$/\1<!--/' $tof
sed_i 's/^\([[:space:]]*\)\(<!-- #endif scala-'$FROM_VERSION' -->\)/\1-->\2/' $tof
done

# Update spark.version to spark330.version for Scala 2.13
SPARK_VERSION=${DEFAULT_SPARK[$TO_VERSION]}
sed_i '/<java\.major\.version>/,/<spark\.version>\${spark[0-9]\+\.version}</s/<spark\.version>\${spark[0-9]\+\.version}</<spark.version>\${'$SPARK_VERSION'.version}</' \
"$TO_DIR/pom.xml"

# Update <scala.binary.version> in parent POM
# Match any scala binary version to ensure idempotency
sed_i '/<spark\-rapids\-jni\.version>/,/<scala\.binary\.version>[0-9]*\.[0-9]*</s/<scala\.binary\.version>[0-9]*\.[0-9]*</<scala.binary.version>'$TO_VERSION'</' \
"$TO_DIR/pom.xml"


# Update <scala.version> in parent POM
# Match any scala version to ensure idempotency
SCALA_VERSION=$(mvn help:evaluate -Pscala-${TO_VERSION} -Dexpression=scala.version -q -DforceStdout)
sed_i '/<spark\-rapids\-jni\.version>/,/<scala.version>[0-9]*\.[0-9]*\.[0-9]*</s/<scala\.version>[0-9]*\.[0-9]*\.[0-9]*</<scala.version>'$SCALA_VERSION'</' \
"$TO_DIR/pom.xml"
10 changes: 7 additions & 3 deletions build/shimplify.py
Original file line number Diff line number Diff line change
Expand Up @@ -160,6 +160,7 @@ def __csv_as_arr(str_val):
else:
return str_val.translate(None, ' ' + os.linesep).split(',')

__src_basedir = __ant_proj_prop('shimplify.src.basedir') or str(__project().getBaseDir())

__should_add_comment = __is_enabled_attr('if')

@@ -301,6 +302,7 @@ def task_impl():
"""Ant task entry point """
__log.info('# Starting Jython Task Shimplify #')
config_format = """# config:
# shimplify.src.basedir=%s
# shimplify (if)=%s
# shimplify.add.base=%s
# shimplify.add.shim=%s
@@ -310,6 +312,7 @@ def task_impl():
# shimplify.shims=%s
# shimplify.trace=%s"""
__log.info(config_format,
__src_basedir,
__should_add_comment,
__add_shim_base,
__add_shim_buildver,
@@ -370,7 +373,7 @@ def __generate_symlinks():

def __traverse_source_tree_of_all_shims(src_type, func):
"""Walks src/<src_type>/sparkXYZ"""
base_dir = str(__project().getBaseDir())
base_dir = __src_basedir
src_root = os.path.join(base_dir, 'src', src_type)
for dir, subdirs, files in os.walk(src_root, topdown=True):
if dir == src_root:
@@ -396,9 +399,10 @@ def __traverse_source_tree_of_all_shims(src_type, func):

def __generate_symlink_to_file(buildver, src_type, shim_file_path, build_ver_arr):
if buildver in build_ver_arr:
base_dir = str(__project().getBaseDir())
project_base_dir = str(__project().getBaseDir())
base_dir = __src_basedir
src_root = os.path.join(base_dir, 'src', src_type)
target_root = os.path.join(base_dir, 'target', "spark%s" % buildver, 'generated', 'src',
target_root = os.path.join(project_base_dir, 'target', "spark%s" % buildver, 'generated', 'src',
src_type)
first_build_ver = build_ver_arr[0]
__log.debug("top shim comment %s", first_build_ver)
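
A note on intent, with a hypothetical sketch: the new `shimplify.src.basedir` property appears to let the generated `scala2.13` copy of a module discover shim sources in the original tree while still writing generated symlinks under its own `target` directory. The paths below are invented for illustration:

```shell
# Hypothetical module and shim; names are examples only.
# Shim sources are discovered under the overridden src basedir...
ls sql-plugin/src/main/spark330/

# ...while generated symlinks land under the building project's target dir:
ls scala2.13/sql-plugin/target/spark330/generated/src/main/
```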