Describe the bug
Downloaded the latest spark-3.5.0-bin-hadoop3-scala2.13.tgz and built rapids-4-spark for Scala 2.13 according to https://github.com/NVIDIA/spark-rapids/tree/branch-23.12/scala2.13:

./build/buildall --profile=350 --scala213

Deployed rapids-4-spark_2.13-23.12.0-SNAPSHOT-cuda11.jar.
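As a quick sanity check (my addition, not from the original report): both the Spark distribution and the plugin jar encode their Scala binary version in their file names, so a mismatch between the two is easy to spot before launching anything.

# Heuristic check that the Spark distribution and the RAPIDS plugin jar
# agree on the Scala binary version; both should say 2.13 here.
ls "$SPARK_HOME"/jars/scala-library-*.jar   # expect scala-library-2.13.x.jar
basename "$SPARK_RAPIDS_PLUGIN_JAR"         # expect rapids-4-spark_2.13-...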
Steps/Code to reproduce bug
Running one of the benchmarks from https://github.com/NVIDIA/spark-rapids-benchmarks/tree/dev/nds:

./spark-submit-template power_run_gpu.template nds_power.py parquet_sf3k ./query_streams/query_0.sql time_gpu.csv

This fails with a java.lang.NoSuchMethodError:
====== Run query96 ======
Traceback (most recent call last):
File "/home/volzok/Src/spark-rapids-benchmarks/nds/nds_power.py", line 376, in
run_query_stream(args.input_prefix,
File "/home/volzok/Src/spark-rapids-benchmarks/nds/nds_power.py", line 257, in run_query_stream
summary = q_report.report_on(run_one_query,spark_session,
File "/home/volzok/Src/spark-rapids-benchmarks/nds/PysparkBenchReport.py", line 79, in report_on
listener.register()
File "/home/volzok/Src/spark-rapids-benchmarks/nds/python_listener/PythonListener.py", line 27, in register
self.uuid = manager.register(self)
File "/home/volzok/spark-3.5.0-bin-hadoop3-scala2.13/python/lib/py4j-0.10.9.7-src.zip/py4j/java_gateway.py", line 1322, in call
File "/home/volzok/spark-3.5.0-bin-hadoop3-scala2.13/python/lib/pyspark.zip/pyspark/errors/exceptions/captured.py", line 179, in deco
File "/home/volzok/spark-3.5.0-bin-hadoop3-scala2.13/python/lib/py4j-0.10.9.7-src.zip/py4j/protocol.py", line 326, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling z:com.nvidia.spark.rapids.listener.Manager.register.
: java.lang.NoSuchMethodError: scala.collection.immutable.Map$.apply(Lscala/collection/Seq;)Lscala/collection/GenMap;
at com.nvidia.spark.rapids.listener.Manager$.<init>(Manager.scala:8)
at com.nvidia.spark.rapids.listener.Manager$.<clinit>(Manager.scala)
at com.nvidia.spark.rapids.listener.Manager.register(Manager.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:374)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
at java.lang.Thread.run(Thread.java:750)
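For context (my analysis, not part of the stack trace): a NoSuchMethodError on scala.collection.immutable.Map$.apply is the classic symptom of mixed Scala binary versions. Code compiled against Scala 2.12 links Map(...) literals to an erased signature returning scala.collection.GenMap, but GenMap was removed from the standard library in Scala 2.13, so a 2.13 runtime cannot resolve the 2.12 call site, and the failure surfaces the moment Manager's static initializer first runs. One heuristic way to check which Scala version each jar on --jars targets, assuming the jars were built with Maven and embed their pom:

# Maven-built jars usually carry the pom they were built from; grep it for
# the scala-library dependency to see which binary version the jar targets.
for jar in "$SPARK_RAPIDS_PLUGIN_JAR" "$NDS_LISTENER_JAR"; do
  echo "== $jar"
  unzip -p "$jar" 'META-INF/maven/*/*/pom.xml' | grep -i -m 5 scala
done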
Expected behavior
The Manager class is found, the run terminates normally, and the results match those of a CPU run (which completes successfully).
Environment details
Spark local execution, here is power_run_gpu.template:
source base.template
export CONCURRENT_GPU_TASKS=${CONCURRENT_GPU_TASKS:-2}
export SHUFFLE_PARTITIONS=${SHUFFLE_PARTITIONS:-200}
export SCALA_HOME=/home/volzok/.sdkman/candidates/scala/current
export SCALA_LIBRARY_PATH=$SCALA_HOME/lib
CLASSPATH+=":$SCALA_LIBRARY_PATH/scala-library.jar"
CLASSPATH+=":$SCALA_LIBRARY_PATH/scala-compiler.jar"
CLASSPATH+=":$SCALA_LIBRARY_PATH/jline.jar"
export CLASSPATH
export SPARK_CONF=("--master" "${SPARK_MASTER}"
"--deploy-mode" "client"
"--conf" "spark.driver.maxResultSize=2GB"
"--conf" "spark.driver.memory=${DRIVER_MEMORY}"
"--conf" "spark.executor.cores=${EXECUTOR_CORES}"
"--conf" "spark.executor.instances=${NUM_EXECUTORS}"
"--conf" "spark.executor.memory=${EXECUTOR_MEMORY}"
"--conf" "spark.sql.shuffle.partitions=${SHUFFLE_PARTITIONS}"
"--conf" "spark.sql.files.maxPartitionBytes=2gb"
"--conf" "spark.sql.adaptive.enabled=true"
"--conf" "spark.executor.resource.gpu.amount=1"
"--conf" "spark.executor.resource.gpu.discoveryScript=./getGpusResources.sh"
"--conf" "spark.plugins=com.nvidia.spark.SQLPlugin"
"--conf" "spark.rapids.memory.host.spillStorageSize=32G"
"--conf" "spark.rapids.memory.pinnedPool.size=8g"
"--conf" "spark.rapids.sql.concurrentGpuTasks=${CONCURRENT_GPU_TASKS}"
"--conf" "spark.sql.legacy.charVarcharAsString=true"
"--files" "$SPARK_HOME/examples/src/main/scripts/getGpusResources.sh"
"--jars" "$SPARK_RAPIDS_PLUGIN_JAR,$NDS_LISTENER_JAR,$SCALA_LIBRARY_PATH/jline-3.21.0.jar,$SCALA_LIBRARY_PATH/scala-compiler.jar,$SCALA_LIBRARY_PATH/scala-library.jar”)
$ env|fgrep SPARK
SPARK_RAPIDS_PLUGIN_JAR=/home/volzok/bin/rapids-4-spark_2.13-23.12.0-SNAPSHOT-cuda11.jar
SPARK_HOME=/home/volzok/spark-3.5.0-bin-hadoop3-scala2.13
SPARK_MASTER=local
$ env|fgrep JAVA
JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
Additional context
Running on an A100 DGX Station with CUDA 12.0
Also, please remove all other jars (NDS_LISTENER_JAR and anything other than SPARK_RAPIDS_PLUGIN_JAR) from --jars, just to see if that gets rid of the error. Try starting spark-shell and running something basic. Is your NDS listener jar built for Scala 2.13?

Yes, the problem was the nds-benchmark-listener-1.0-SNAPSHOT.jar. It has an explicit Scala 2.12 dependency. Removing it from the classpath fixes the problem.
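With the 2.12-built listener jar removed from --jars, the original repro command can be re-run to verify the fix:

# Re-run the power run from the report; with the stale listener jar gone it
# should proceed past query96 instead of dying in Manager's static initializer.
./spark-submit-template power_run_gpu.template nds_power.py parquet_sf3k \
  ./query_streams/query_0.sql time_gpu.csv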