Spark 3.4 + Arrow Datafusion Shuffle Manager Fails due to class loader isolation #221
Comments
I suspect that the correct fix is a documentation note in the README (maybe plus a try/catch in the code that prints a reference to the README), since changing the Spark class loader is not easy (I also tried with the user-classpath-first class loader). If folks agree, I'm happy to make a PR. We could also (maybe?) get at Spark's internal class loader and use it explicitly, but that also seems very hacky.
Emmm, it could be a potential solution, but it seems a bit inconvenient. Per my understanding, it usually requires extra effort to change Spark's jar directory/archive in a production environment.
So this issue occurred regardless of the
True, especially for users of a vendor solution, although for my deployments this isn't a big deal (we package our own Spark version anyway). Let me take another look next week and see if there is a way to get loaded with Spark's default class loader.
So far I have only tried vanilla 3.4.
Thanks for working on this.
Following on, I tried adding
Ah, I remembered this option. I think it would be great to update the docs to include this option. One thing more: I think you also need to mention
Describe the bug
When trying to run with org.apache.spark.sql.comet.execution.shuffle.CometShuffleManager, the job fails due to class loader isolation.
Steps to reproduce
/home/holden/repos/high-performance-spark-examples/spark-3.4.2-bin-hadoop3/bin/spark-sql --master 'local[5]' --conf spark.eventLog.enabled=true --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions,org.apache.comet.CometSparkSessionExtensions --conf spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkSessionCatalog --conf spark.sql.catalog.spark_catalog.type=hive --conf spark.sql.catalog.local=org.apache.iceberg.spark.SparkCatalog --conf spark.sql.catalog.local.type=hadoop --conf spark.sql.catalog.local.warehouse=/home/holden/repos/high-performance-spark-examples/warehouse --jars /home/holden/repos/high-performance-spark-examples/accelerators/arrow-datafusion-comet/spark/target/comet-spark-spark3.4_2.12-0.1.0-SNAPSHOT.jar --conf spark.comet.enabled=true --conf spark.comet.exec.enabled=true --conf spark.comet.exec.all.enabled=true --conf spark.shuffle.manager=org.apache.spark.sql.comet.execution.shuffle.CometShuffleManager --conf spark.comet.exec.shuffle.enabled=true --conf spark.comet.columnar.shuffle.enabled=true --conf spark.driver.userClassPathFirst=true --name sql/wap.sql -f sql/wap.sql
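Most of the flags above configure Iceberg and are not needed to hit the bug; a stripped-down invocation along these lines (paths elided, adjust to your environment) should reproduce the same classloader error:

```
.../bin/spark-sql --master 'local[5]' \
  --jars .../comet-spark-spark3.4_2.12-0.1.0-SNAPSHOT.jar \
  --conf spark.comet.enabled=true \
  --conf spark.comet.exec.enabled=true \
  --conf spark.shuffle.manager=org.apache.spark.sql.comet.execution.shuffle.CometShuffleManager \
  --conf spark.comet.exec.shuffle.enabled=true \
  -f sql/wap.sql
```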
I think anything that triggers a sort would suffice as a repro, but just in case, my wap.sql is:
This results in:
Expected behavior
I expect the query to run.
The expected output is:
Additional context
You can work around this error by copying the Arrow DataFusion Comet jar into Spark's jars directory instead of adding it with --jars, so it is loaded by the same classloader as Spark itself.
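A sketch of that workaround (the SPARK_HOME and jar paths below are placeholders for illustration, not the exact paths from my setup):

```shell
# Workaround sketch: place the Comet jar in Spark's jars/ directory so it is
# loaded by Spark's own classloader rather than the child loader for --jars.
# Both paths are placeholders.
SPARK_HOME=/tmp/spark-home-demo
COMET_JAR=/tmp/comet-spark-spark3.4_2.12-0.1.0-SNAPSHOT.jar

mkdir -p "$SPARK_HOME/jars"
touch "$COMET_JAR"                    # stand-in for the real built jar
cp "$COMET_JAR" "$SPARK_HOME/jars/"   # then drop --jars and userClassPathFirst
ls "$SPARK_HOME/jars"
```

After this, the spark-sql command can be run without the --jars and spark.driver.userClassPathFirst flags.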