Reorganise scripting docs (#18132)

* Reorganize scripting documentation

* Further changes to tidy up scripting docs

Closes #18116

* Add note about .lat/lon potentially returning null

* Added .value to expressions example

* Fixed two bad ASCIIDOC links

elastic · May 4, 2016 · 34d90b0
1 parent 5a0cfdd
commit 34d90b0
Showing 11 changed files with 1,108 additions and 777 deletions.
diff --git a/docs/reference/modules.asciidoc b/docs/reference/modules.asciidoc
@@ -94,8 +94,6 @@ include::modules/network.asciidoc[]
 
 include::modules/node.asciidoc[]
 
-include::modules/painless.asciidoc[]
-
 include::modules/plugins.asciidoc[]
 
 include::modules/scripting.asciidoc[]

diff --git a/docs/reference/modules/scripting.asciidoc b/docs/reference/modules/scripting.asciidoc
@@ -1,5 +1,104 @@
-include::scripting/scripting.asciidoc[]
+[[modules-scripting]]
+== Scripting
 
-include::scripting/advanced-scripting.asciidoc[]
+The scripting module enables you to use scripts to evaluate custom
+expressions. For example, you could use a script to return "script fields"
+as part of a search request or evaluate a custom score for a query.
+
+TIP: Elasticsearch now has a built-in scripting language called _Painless_
+that provides a more secure alternative for implementing
+scripts for Elasticsearch. We encourage you to try it out --
+for more information, see <<modules-scripting-painless, Painless Scripting Language>>.
+
+The default scripting language is http://groovy-lang.org/[groovy].
+Additional `lang` plugins enable you to run scripts written in other languages.
+Everywhere a script can be used, you can include a `lang` parameter
+to specify the language of the script.
+
+[float]
+=== General-purpose languages:
+
+These languages can be used for any purpose in the scripting APIs,
+and give the most flexibility.
+
+[cols="<,<,<",options="header",]
+|=======================================================================
+|Language
+    |Sandboxed
+    |Required plugin
+
+|<<modules-scripting-painless, `painless`>>
+    |yes
+    |built-in
+
+|<<modules-scripting-groovy, `groovy`>>
+    |<<modules-scripting-security, no>>
+    |built-in
+
+|{plugins}/lang-javascript.html[`javascript`]
+    |<<modules-scripting-security, no>>
+    |{plugins}/lang-javascript.html[`lang-javascript`]
+
+|{plugins}/lang-python.html[`python`]
+    |<<modules-scripting-security, no>>
+    |{plugins}/lang-python.html[`lang-python`]
+
+|=======================================================================
+
+[float]
+=== Special-purpose languages:
+
+These languages are less flexible, but typically have higher performance for
+certain tasks.
+
+[cols="<,<,<,<",options="header",]
+|=======================================================================
+|Language
+    |Sandboxed
+    |Required plugin
+    |Purpose
+
+|<<modules-scripting-expression, `expression`>>
+    |yes
+    |built-in
+    |fast custom ranking and sorting
+
+|<<search-template, `mustache`>>
+    |yes
+    |built-in
+    |templates
+
+|<<modules-scripting-native, `java`>>
+    |n/a
+    |you write it!
+    |expert API
+
+|=======================================================================
+
+[WARNING]
+.Scripts and security
+=================================================
+
+Languages that are sandboxed are designed with security in mind. However,
+non-sandboxed languages can be a security issue. Please read
+<<modules-scripting-security, Scripting and security>> for more details.
+
+=================================================
+
+
+include::scripting/using.asciidoc[]
+
+include::scripting/fields.asciidoc[]
 
 include::scripting/security.asciidoc[]
+
+include::scripting/groovy.asciidoc[]
+
+include::scripting/painless.asciidoc[]
+
+include::scripting/expression.asciidoc[]
+
+include::scripting/native.asciidoc[]
+
+include::scripting/advanced-scripting.asciidoc[]
+
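A minimal sketch of the `lang` parameter described in the new overview, written in Python purely for illustration. The helper name `build_script_field_request`, the `price` field, and the `discount` parameter are hypothetical, and the `inline` and `params` keys are assumed from the script object syntax of this release line; check `scripting/using.asciidoc` for the authoritative form.

```python
import json


def build_script_field_request(field_name, source, lang, params):
    """Return a search request body containing a single script field.

    The "lang" key selects the scripting language for this one script,
    so different scripts in the same request could use different languages.
    """
    return {
        "query": {"match_all": {}},
        "script_fields": {
            field_name: {
                "script": {
                    "lang": lang,        # e.g. "painless", "groovy", "expression"
                    "inline": source,    # assumed key name for inline script source
                    "params": params,    # values referenced by name in the script
                }
            }
        },
    }


# Hypothetical usage: a sandboxed expression script that scales a numeric field.
body = build_script_field_request(
    "discounted_price",
    "doc['price'].value * discount",
    "expression",
    {"discount": 0.8},
)
print(json.dumps(body, indent=2))
```

The printed JSON could be sent as the body of a search request. Switching languages would only require changing the `lang` value and rewriting the script source accordingly.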
diff --git a/docs/reference/modules/scripting/advanced-scripting.asciidoc b/docs/reference/modules/scripting/advanced-scripting.asciidoc
@@ -1,13 +1,17 @@
 [[modules-advanced-scripting]]
-=== Text scoring in scripts
+=== Advanced text scoring in scripts
 
+experimental[The functionality described on this page is considered experimental and may be changed or removed in a future release]
 
-Text features, such as term or document frequency for a specific term can be accessed in scripts (see <<modules-scripting, scripting documentation>> ) with the `_index` variable. This can be useful if, for example, you want to implement your own scoring model using for example a script inside a <<query-dsl-function-score-query,function score query>>.
+Text features, such as term or document frequency for a specific term, can be
+accessed in scripts with the `_index` variable. This can be useful if, for
+example, you want to implement your own scoring model using a script inside a
+<<query-dsl-function-score-query,function score query>>.
 Statistics over the document collection are computed *per shard*, not per
 index.
 
 [float]
-==== Nomenclature:
+=== Nomenclature:
 
 
 [horizontal]
@@ -33,7 +37,7 @@ depending on the shard the current document resides in.
 
 
 [float]
-==== Shard statistics:
+=== Shard statistics:
 
 `_index.numDocs()`::
 
@@ -49,7 +53,7 @@ depending on the shard the current document resides in.
 
 
 [float]
-==== Field statistics:
+=== Field statistics:
 
 Field statistics can be accessed with a subscript operator like this:
 `_index['FIELD']`.
@@ -74,7 +78,7 @@ depending on the shard the current document resides in.
 The number of terms in a field cannot be accessed using the `_index` variable. See <<token-count>> for how to do that.
 
 [float]
-==== Term statistics:
+=== Term statistics:
 
 Term statistics for a field can be accessed with a subscript operator like
 this: `_index['FIELD']['TERM']`. This will never return null, even if term or field does not exist.
@@ -101,7 +105,7 @@ affect is your set the <<index-options,`index_options`>> to `docs`.
 
 
 [float]
-==== Term positions, offsets and payloads:
+=== Term positions, offsets and payloads:
 
 If you need information on the positions of terms in a field, call
 `_index['FIELD'].get('TERM', flag)` where flag can be
@@ -174,7 +178,7 @@ return score;
 
 
 [float]
-==== Term vectors:
+=== Term vectors:
 
 The `_index` variable can only be used to gather statistics for single terms. If you want to use information on all terms in a field, you must store the term vectors (see <<term-vector>>). To access them, call
 `_index.termVectors()` to get a

diff --git a/docs/reference/modules/scripting/expression.asciidoc b/docs/reference/modules/scripting/expression.asciidoc
@@ -0,0 +1,120 @@
+[[modules-scripting-expression]]
+=== Lucene Expressions Language
+
+Lucene's expressions compile a `javascript` expression to bytecode. They are
+designed for high-performance custom ranking and sorting functions and are
+enabled for `inline` and `stored` scripting by default.
+
+[float]
+=== Performance
+
+Expressions were designed to have competitive performance with custom Lucene code.
+This performance comes from low per-document overhead compared to other
+scripting engines: expressions do more work "up-front".
+
+This allows for very fast execution, even faster than if you had written a `native` script.
+
+[float]
+=== Syntax
+
+Expressions support a subset of javascript syntax: a single expression.
+
+See the link:http://lucene.apache.org/core/6_0_0/expressions/index.html?org/apache/lucene/expressions/js/package-summary.html[expressions module documentation]
+for details on what operators and functions are available.
+
+`expression` scripts can access the following variables:
+
+* Document fields, e.g. `doc['myfield'].value`
+* Variables and methods that the field supports, e.g. `doc['myfield'].empty`
+* Parameters passed into the script, e.g. `mymodifier`
+* The current document's score, `_score` (only available when used in a `script_score`)
+
+You can use `expression` scripts for `script_score`, `script_fields`, sort scripts, and numeric aggregation
+scripts. Simply set the `lang` parameter to `expression`.
+
+[float]
+=== Numeric field API
+[cols="<,<",options="header",]
+|=======================================================================
+|Expression |Description
+|`doc['field_name'].value` |The value of the field, as a `double`
+
+|`doc['field_name'].empty` |A boolean indicating if the field has no
+values within the doc.
+
+|`doc['field_name'].min()` |The minimum value of the field in this document.
+
+|`doc['field_name'].max()` |The maximum value of the field in this document.
+
+|`doc['field_name'].median()` |The median value of the field in this document.
+
+|`doc['field_name'].avg()` |The average of the values in this document.
+
+|`doc['field_name'].sum()` |The sum of the values in this document.
+
+|`doc['field_name'].count()` |The number of values in this document.
+|=======================================================================
+
+When a document is missing the field completely, by default the value will be treated as `0`.
+You can treat it as another value instead, e.g. `doc['myfield'].empty ? 100 : doc['myfield'].value`
+
+When a document has multiple values for the field, by default the minimum value is returned.
+You can choose a different value instead, e.g. `doc['myfield'].sum()`.
+
+When a document is missing the field completely, by default the value will be treated as `0`.
+
+Boolean fields are exposed as numerics, with `true` mapped to `1` and `false` mapped to `0`.
+For example: `doc['on_sale'].value ? doc['price'].value * 0.5 : doc['price'].value`
+
+[float]
+=== Date field API
+Date fields are treated as the number of milliseconds since January 1, 1970 and
+support the numeric field API above, with these additional methods:
+
+[cols="<,<",options="header",]
+|=======================================================================
+|Expression |Description
+|`doc['field_name'].getYear()` |Year component, e.g. `1970`.
+
+|`doc['field_name'].getMonth()` |Month component (0-11), e.g. `0` for January.
+
+|`doc['field_name'].getDayOfMonth()` |Day component, e.g. `1` for the first of the month.
+
+|`doc['field_name'].getHourOfDay()` |Hour component (0-23)
+
+|`doc['field_name'].getMinutes()` |Minutes component (0-59)
+
+|`doc['field_name'].getSeconds()` |Seconds component (0-59)
+|=======================================================================
+
+The following example shows the difference in years between the `date` fields `date0` and `date1`:
+
+`doc['date1'].getYear() - doc['date0'].getYear()`
+
+[float]
+=== `geo_point` field API
+[cols="<,<",options="header",]
+|=======================================================================
+|Expression |Description
+|`doc['field_name'].empty` |A boolean indicating if the field has no
+values within the doc.
+
+|`doc['field_name'].lat` |The latitude of the geo point, or `null`.
+
+|`doc['field_name'].lon` |The longitude of the geo point, or `null`.
+|=======================================================================
+
+The following example computes distance in kilometers from Washington, DC:
+
+`haversin(38.9072, -77.0369, doc['field_name'].lat, doc['field_name'].lon)`
+
+In this example the coordinates could have been passed as parameters to the script,
+e.g. based on geolocation of the user.
+
+[float]
+=== Limitations
+
+There are a few limitations relative to other script languages:
+
+* Only numeric, boolean, date, and `geo_point` fields may be accessed
+* Stored fields are not available
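
To make the `geo_point` distance example in the new `expression.asciidoc` more concrete, here is a rough Python sketch of the standard haversine great-circle formula, which is the kind of calculation a `haversin(lat1, lon1, lat2, lon2)` call stands for. The function name `haversine_km` and the Earth radius constant are assumptions made for this illustration; Lucene's own implementation may use a different radius and so return slightly different values. Longitudes west of Greenwich are written as negative numbers.

```python
import math

# Assumed mean Earth radius in kilometers; not taken from the Lucene source.
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)

    # Haversine of the central angle between the two points.
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)

    # Convert the haversine back to an angle, then to a distance.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical usage: Washington, DC to New York City.
washington_dc = (38.9072, -77.0369)
new_york = (40.7128, -74.0060)
distance = haversine_km(*washington_dc, *new_york)
print(f"{distance:.1f} km")
```

In an actual script the second pair of coordinates would come from `doc['field_name'].lat` and `doc['field_name'].lon`, and the first pair could be supplied as script parameters, for instance from the user's geolocation.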