[DOC] Misleading and unclear documentation for the Spark Connector in the SQL/PPL docs #8212
Open
1 of 4 tasks
Labels
1 - Backlog - DEV
Developer assigned to issue is responsible for creating PR.
What do you want to do?
Tell us about your request.
Regarding: https://opensearch.org/docs/latest/search-plugins/sql/settings/#spark-connector-settings
- The Spark connector, according to this comment, only supports AWS EMR Serverless Spark (which means I need to have AWS credentials). This should be made clear in the docs.
- The docs lack examples of how to set up EMR Serverless Spark and OpenSearch, and of where to provide the configuration (like `spark.uri`). For a user it's unclear how to set up a basic working example.
- Some of the config properties lack examples and information about which values are valid:
  - `spark.uri`: "The identifier for your Spark data source." is misleading. It lacks an example and does not say what the default is or whether the property is mandatory.
  - `spark.auth.type`: It's unclear which values are valid, what the default is, and whether the property is mandatory.
- The Spark connector docs lack a reference to https://opensearch.org/docs/latest/dashboards/management/data-sources/ (and potentially https://opensearch.org/docs/latest/dashboards/management/accelerate-external-data/), as well as an explanation and examples of how to add Spark as a data source.
- The docs are not consistent with https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/admin/connectors/spark_connector.rst. For example, `emr.cluster` is missing.
- The PPL example is unclear. What is `my_spark` referring to?

Version: all versions since the Spark connector has been supported
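To illustrate the kind of example I would expect the Spark connector docs to include, here is a minimal sketch of a data source definition built in Python. Only the property names `spark.uri`, `spark.auth.type`, and `emr.cluster` and the name `my_spark` come from the docs discussed above. The overall payload shape, the `"connector": "spark"` field, and every value are assumptions or placeholders on my part, which is exactly what the docs should confirm or correct.

```python
import json


def build_spark_datasource(name, spark_uri, auth_type, emr_cluster):
    """Build a hypothetical Spark data source definition.

    The payload layout is an assumption, not confirmed by the docs.
    """
    return {
        "name": name,
        "connector": "spark",  # assumed connector identifier
        "properties": {
            "spark.uri": spark_uri,        # meaning and format unclear in the docs
            "spark.auth.type": auth_type,  # valid values unclear in the docs
            "emr.cluster": emr_cluster,    # only listed in spark_connector.rst
        },
    }


# All values below are placeholders, not known-valid settings.
payload = build_spark_datasource(
    name="my_spark",
    spark_uri="<spark-uri-placeholder>",
    auth_type="<auth-type-placeholder>",
    emr_cluster="<emr-cluster-id-placeholder>",
)

print(json.dumps(payload, indent=2))
```

The docs should state where such a definition has to be provided, and that the data source name (here `my_spark`) is what the PPL example refers to, if that is indeed the case.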
What other resources are available?