Add sample command for running example jobs in cluster mode
cbalci committed Mar 14, 2023
1 parent ca48434 commit a02e4a0
Showing 2 changed files with 34 additions and 2 deletions.
18 changes: 17 additions & 1 deletion pinot-connectors/pinot-spark-2-connector/README.md
@@ -57,7 +57,23 @@ val data = spark.read
data.show(100)
```

For more examples, see `src/test/scala/example/ExampleSparkPinotConnectorTest.scala`
## Examples

There are more examples included in `src/test/scala/.../ExampleSparkPinotConnectorTest.scala`.
You can run the examples locally (e.g. using your IDE) in standalone mode by starting a local Pinot cluster. See: https://docs.pinot.apache.org/basics/getting-started/running-pinot-locally

You can also run the tests in _cluster mode_ using the following command:
```shell
export SPARK_CLUSTER=<YOUR_YARN_OR_SPARK_CLUSTER>

# Before running this command, remove `.master("local")` from ExampleSparkPinotConnectorTest and rebuild the jar
spark-submit \
--class org.apache.pinot.connector.spark.datasource.ExampleSparkPinotConnectorTest \
--jars ./target/pinot-spark-2-connector-0.13.0-SNAPSHOT-shaded.jar \
--master $SPARK_CLUSTER \
--deploy-mode cluster \
./target/pinot-spark-2-connector-0.13.0-SNAPSHOT-tests.jar
```
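
The command above assumes that `SPARK_CLUSTER` is set and that both jars have already been built under `./target`. A minimal pre-flight sketch that checks these assumptions before submitting could look like the following. The helper name `preflight` is hypothetical and is not part of Pinot or Spark.

```shell
#!/usr/bin/env bash
# Pre-flight check for the cluster-mode command above.
# NOTE: `preflight` is a hypothetical helper for illustration only.

# Usage: preflight <master-url> <shaded-jar> <tests-jar>
# Prints one line per problem to stderr and returns the number of problems found
# (0 means the inputs look usable).
preflight() {
  local master="$1"
  local shaded_jar="$2"
  local tests_jar="$3"
  local problems=0
  local jar

  # The master URL comes from the SPARK_CLUSTER variable in the README command.
  if [ -z "$master" ]; then
    echo "SPARK_CLUSTER is empty; export it before submitting" >&2
    problems=$((problems + 1))
  fi

  # Both jars are produced by building the connector module.
  for jar in "$shaded_jar" "$tests_jar"; do
    if [ ! -f "$jar" ]; then
      echo "missing jar: $jar (rebuild the module first)" >&2
      problems=$((problems + 1))
    fi
  done

  return "$problems"
}
```

One way to use it is to call `preflight "$SPARK_CLUSTER" <shaded-jar> <tests-jar>` with the two jar paths from the command above and run `spark-submit` only if it succeeds.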

Spark-Pinot connector uses Spark `DatasourceV2 API`. Please check the Databricks presentation for DatasourceV2 API;

18 changes: 17 additions & 1 deletion pinot-connectors/pinot-spark-3-connector/README.md
@@ -58,7 +58,23 @@ val data = spark.read
data.show(100)
```

For more examples, see `src/test/scala/.../ExampleSparkPinotConnectorTest.scala`
## Examples

There are more examples included in `src/test/scala/.../ExampleSparkPinotConnectorTest.scala`.
You can run the examples locally (e.g. using your IDE) in standalone mode by starting a local Pinot cluster. See: https://docs.pinot.apache.org/basics/getting-started/running-pinot-locally

You can also run the tests in _cluster mode_ using the following command:
```shell
export SPARK_CLUSTER=<YOUR_YARN_OR_SPARK_CLUSTER>

# Before running this command, remove `.master("local")` from ExampleSparkPinotConnectorTest and rebuild the jar
spark-submit \
--class org.apache.pinot.connector.spark.v3.datasource.ExampleSparkPinotConnectorTest \
--jars ./target/pinot-spark-3-connector-0.13.0-SNAPSHOT-shaded.jar \
--master $SPARK_CLUSTER \
--deploy-mode cluster \
./target/pinot-spark-3-connector-0.13.0-SNAPSHOT-tests.jar
```
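
The comment in the command above asks you to remove `.master("local")` from the example before rebuilding. A small sketch of how that step could be verified is shown below. The helper name `check_master_removed` is hypothetical, and the path to the Scala source is passed in as an argument because the README abbreviates it.

```shell
#!/usr/bin/env bash
# Checks that a Scala source file no longer hard-codes a local Spark master.
# NOTE: `check_master_removed` is a hypothetical helper for illustration only.

# Usage: check_master_removed <path-to-scala-source>
# Returns 0 if no literal .master("local") remains, 1 if one is found,
# and 2 if the file does not exist.
check_master_removed() {
  local src="$1"

  if [ ! -f "$src" ]; then
    echo "no such file: $src" >&2
    return 2
  fi

  # -F treats the pattern as a fixed string, so the dot and parentheses are literal.
  # This matches only the exact text .master("local"); variants such as
  # .master("local[*]") are not detected.
  if grep -nF '.master("local")' "$src" >&2; then
    echo "remove the hard-coded master from $src and rebuild the jar" >&2
    return 1
  fi

  return 0
}
```

Running this against the example source before rebuilding gives an early failure instead of a job that ignores `--master`.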

Spark-Pinot connector uses Spark `DatasourceV2 API`. Please check the Databricks presentation for DatasourceV2 API;
