diff --git a/README.md b/README.md
index 8d49decf..e01e4bc4 100644
--- a/README.md
+++ b/README.md
@@ -1,13 +1,11 @@
 # Lighter
 
-Lighter is an opensource application for interacting with [Apache Spark](https://spark.apache.org/) on [Kubernetes](https://kubernetes.io/) or [Apache Hadoop YARN](https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html). It is hevily inspired by [Apache Livy](https://livy.incubator.apache.org/) and has some overlaping features.
+Lighter is an opensource application for interacting with [Apache Spark](https://spark.apache.org/) on [Kubernetes](https://kubernetes.io/) or [Apache Hadoop YARN](https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html). It is heavily inspired by [Apache Livy](https://livy.incubator.apache.org/) and has some overlapping features.
 
 Lighter supports:
 - Interactive Python Sessions through [Sparkmagic](https://github.com/jupyter-incubator/sparkmagic) kernel
 - Batch application submissions through the REST API
 
-> :warning: **If you are using interactive sessions**: While we have tested batch applications quite extensively, there might be some problems with interactive sessions, consider current release of Lighter as alpha.
-
 You can read a breaf description on how Lighter works [here](./docs/architecture.md).
 
 ## Using Lighter
diff --git a/docs/configuration.md b/docs/configuration.md
index c2ef2edf..16a6556d 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -22,19 +22,20 @@ Lighter can be configured by using environment variables.
Currently, Lighter sup
 | LIGHTER_STORAGE_JDBC_DRIVER_CLASS_NAME | JDBC driver class name | org.h2.Driver |
 | LIGHTER_BATCH_DEFAULT_CONF | Default `conf` props for batch applications (JSON)* | |
 | LIGHTER_SESSION_DEFAULT_CONF | Default `conf` props for session applications (JSON) | |
+| LIGHTER_SESSION_PERMANENT_SESSIONS | List of configurations for [permanent sessions](./permanent_sessions.md) | "[]" |
 
-* default confs will be merged with confs provided in submit request, if property is defined in submit request, default will be ignored.
+* default configs will be merged with configs provided in the submit request; if a property is defined in the submit request, the default will be ignored.
 
 Example of `LIGHTER_BATCH_DEFAULT_CONF`: `{"spark.kubernetes.driverEnv.TEST1":"test1"}`.
 
 ## Kubernetes configuration
 
-| Property | Description | Default |
-| ---------------------------------- | ---------------------------------------------------- |------------------------------------------------|
-| LIGHTER_KUBERNETES_ENABLED | Kubernetes enabled | false |
-| LIGHTER_KUBERNETES_MASTER | Kubernetes master URL | k8s://kubernetes.default.svc.cluster.local:443 |
-| LIGHTER_KUBERNETES_NAMESPACE | Kubernetes namespace | spark |
-| LIGHTER_KUBERNETES_MAX_LOG_SIZE | Max lines of log to store on DB | 500 |
-| LIGHTER_KUBERNETES_SERVICE_ACCOUNT | Kubernetes service account | spark |
+| Property | Description | Default |
+|------------------------------------|---------------------------------|------------------------------------------------|
+| LIGHTER_KUBERNETES_ENABLED | Kubernetes enabled | false |
+| LIGHTER_KUBERNETES_MASTER | Kubernetes master URL | k8s://kubernetes.default.svc.cluster.local:443 |
+| LIGHTER_KUBERNETES_NAMESPACE | Kubernetes namespace | spark |
+| LIGHTER_KUBERNETES_MAX_LOG_SIZE | Max lines of log to store on DB | 500 |
+| LIGHTER_KUBERNETES_SERVICE_ACCOUNT | Kubernetes service account | spark |
 
 ## YARN configuration
diff --git a/docs/permanent_sessions.md
b/docs/permanent_sessions.md
new file mode 100644
index 00000000..154aaae0
--- /dev/null
+++ b/docs/permanent_sessions.md
@@ -0,0 +1,35 @@
+# Permanent sessions
+
+Lighter has a feature for permanent interactive sessions. These sessions are useful for cases where there is
+a need for a session that runs indefinitely and where occasional session statements need to be submitted directly
+through the REST API.
+
+Lighter takes care of maintaining the continuity of these sessions, ensuring they remain active and restarting
+them in case of failures without altering the session identifier.
+
+It's important to note that in some cases, when a session is restarted due to a failure, it may not restore the previous
+state. Therefore, it's advisable to make your statements independent of the previous session state.
+
+## Configuration
+
+Permanent sessions can be configured by setting the `LIGHTER_SESSION_PERMANENT_SESSIONS` environment variable.
+Example value:
+```json
+[
+  {
+    "id": "permanent-id-used-on-api-calls",
+    "submit-params": {
+      "name": "Session Name",
+      "numExecutors": 4,
+      "executorCores": 2,
+      "executorMemory": "2G",
+      "driverCores": 2,
+      "driverMemory": "1G",
+      "conf": {
+        "spark.eventLog.enabled": true,
+        "spark.eventLog.dir": "s3a://your_bucket/spark-hs/"
+      }
+    }
+  }
+]
+```
diff --git a/docs/rest.md b/docs/rest.md
index 94512dae..f0b3d060 100644
--- a/docs/rest.md
+++ b/docs/rest.md
@@ -23,7 +23,7 @@ Request Exapmple:
   "numExecutors": 4,
   "executorCores": 2,
   "executorMemory": "2G",
-  "dirverCores": 2,
+  "driverCores": 2,
   "driverMemory": "1G",
   "args": ["arg1", "arg2"],
   "pyFiles": ["https://something/python_package.zip"],
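
Review note: two behaviours documented in this patch can be sketched in Python. The first is the merge rule stated in docs/configuration.md for the default `conf` variables (a property set in the submit request overrides the default). The second is reading the JSON list held in `LIGHTER_SESSION_PERMANENT_SESSIONS`. This is only an illustration under stated assumptions, not Lighter's implementation: the function names `merge_conf` and `permanent_session_ids` are hypothetical, and treating `id` as a required key is an assumption drawn from the example value in docs/permanent_sessions.md.

```python
import json


def merge_conf(default_conf_json: str, request_conf: dict) -> dict:
    """Merge default conf (a JSON string) with the conf from a submit request.

    Keys present in the request win over the defaults, as described in
    docs/configuration.md. Hypothetical helper, not part of Lighter.
    """
    defaults = json.loads(default_conf_json) if default_conf_json else {}
    # Later mappings take precedence, so request values override defaults.
    return {**defaults, **request_conf}


def permanent_session_ids(env_value: str) -> list:
    """Return the ids found in a LIGHTER_SESSION_PERMANENT_SESSIONS value.

    Assumption: every entry carries an "id" key, as in the documented example.
    """
    sessions = json.loads(env_value or "[]")
    ids = []
    for entry in sessions:
        if "id" not in entry:
            raise ValueError("permanent session entry is missing 'id'")
        ids.append(entry["id"])
    return ids


if __name__ == "__main__":
    defaults = '{"spark.kubernetes.driverEnv.TEST1": "test1"}'
    request = {"spark.kubernetes.driverEnv.TEST1": "override"}
    print(merge_conf(defaults, request))

    env_value = '[{"id": "permanent-id-used-on-api-calls", "submit-params": {"name": "Session Name"}}]'
    print(permanent_session_ids(env_value))
```

An empty or unset value is treated as `"[]"` here, matching the default listed for `LIGHTER_SESSION_PERMANENT_SESSIONS` in the configuration table.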