Adding a simulate ingest api #101409
Merged
Changes from 34 commits
42 commits
be5f759 Adding a simulate ingest API (masseyke)
61aeb42 unit testing (masseyke)
f1fb903 cleaning up (masseyke)
c363a19 more testing (masseyke)
cb92a65 Changing response format (masseyke)
3aa9b66 Update docs/changelog/101409.yaml (masseyke)
be086d2 Merge branch 'main' into adding-simulate-ingest-api (masseyke)
32d77ba Merge branch 'adding-simulate-ingest-api' of github.com:masseyke/elas… (masseyke)
7e33d21 Updating docs (masseyke)
ce0bbd6 Merge branch 'main' into adding-simulate-ingest-api (masseyke)
4ad2eb5 updating docs (masseyke)
8fc1e85 fixing merge error (masseyke)
896c3c3 fixing permissions for docs test (masseyke)
c8f5dd6 fixing docs (masseyke)
2312f12 adding yaml rest tests (masseyke)
91f6fa2 fixing action name (masseyke)
4195e17 spotlessApply (masseyke)
8556fc1 improving rest tests (masseyke)
33e094f spotlessApply (masseyke)
2926257 fixing yaml test (masseyke)
fa9dd35 Merge branch 'main' into adding-simulate-ingest-api (masseyke)
39eefd3 adding comments, fixing transport handling (masseyke)
c960c3c cleanup (masseyke)
3e3a426 fixing action name (masseyke)
0a92123 fixing security test (masseyke)
b9f402b fixing response serialization (masseyke)
25083fc Merge branch 'main' into adding-simulate-ingest-api (masseyke)
0fa2c56 Merge branch 'main' into adding-simulate-ingest-api (masseyke)
6be766e code review changes (masseyke)
fd12236 code review feedback (masseyke)
f7d356f attempting to remove compiler warning (masseyke)
de01166 adding a unit test for SimulateIngestRestToXContentListener (masseyke)
255a08b making index a path parameter (masseyke)
c6e2216 renaming a method (masseyke)
83f5efe updating rest api spec and adding more tests (masseyke)
bf082e4 code review feedback on docs (masseyke)
5cdc909 code review feedback on docs (masseyke)
e35c556 fixing docs (masseyke)
822b6c6 Merge branch 'main' into adding-simulate-ingest-api (masseyke)
0099cfa code review feedback (masseyke)
3364ce8 Merge branch 'main' into adding-simulate-ingest-api (masseyke)
ac6ccab comment cleanup (masseyke)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
pr: 101409 | ||
summary: Adding a simulate ingest api | ||
area: Ingest Node | ||
type: feature | ||
issues: [] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,358 @@ | ||
|
||
[[simulate-ingest-api]] | ||
=== Simulate ingest API | ||
++++ | ||
<titleabbrev>Simulate ingest</titleabbrev> | ||
++++ | ||
|
||
Executes ingest pipelines against a set of provided documents, optionally | ||
with substitute pipeline definitions. | ||
|
||
//// | ||
[source,console] | ||
---- | ||
PUT /_ingest/pipeline/my-pipeline | ||
{ | ||
"description" : "example pipeline to simulate", | ||
"processors": [ | ||
{ | ||
"set" : { | ||
"field" : "field1", | ||
"value" : "value1" | ||
} | ||
} | ||
] | ||
} | ||
|
||
PUT /_ingest/pipeline/my-final-pipeline | ||
{ | ||
"description" : "example final pipeline to simulate", | ||
"processors": [ | ||
{ | ||
"set" : { | ||
"field" : "field2", | ||
"value" : "value2" | ||
} | ||
} | ||
] | ||
} | ||
|
||
PUT /index | ||
{ | ||
"settings": { | ||
"index": { | ||
"default_pipeline": "my-pipeline", | ||
"final_pipeline": "my-final-pipeline" | ||
} | ||
} | ||
} | ||
---- | ||
// TESTSETUP | ||
//// | ||
|
||
[source,console] | ||
---- | ||
POST /_ingest/_simulate | ||
{ | ||
"docs": [ | ||
{ | ||
"_index": "index", | ||
"_id": "id", | ||
"_source": { | ||
"foo": "bar" | ||
} | ||
}, | ||
{ | ||
"_index": "index", | ||
"_id": "id", | ||
"_source": { | ||
"foo": "rab" | ||
} | ||
} | ||
], | ||
"pipeline_substitutions": { <1> | ||
"my-pipeline": { | ||
"processors": [ | ||
{ | ||
"set": { | ||
"field": "field3", | ||
"value": "value3" | ||
} | ||
} | ||
] | ||
} | ||
} | ||
} | ||
---- | ||
|
||
<1> This replaces the existing `my-pipeline` pipeline with the contents given here for the duration of this request. | ||
|
||
[[simulate-ingest-api-request]] | ||
==== {api-request-title} | ||
|
||
`POST /_ingest/_simulate` | ||
|
||
`GET /_ingest/_simulate` | ||
|
||
`POST /_ingest/<target>/_simulate` | ||
|
||
`GET /_ingest/<target>/_simulate` | ||
|
||
[[simulate-ingest-api-prereqs]] | ||
==== {api-prereq-title} | ||
|
||
* If the {es} {security-features} are enabled, you must have the | ||
`index` or `create` <<privileges-list-indices,index privileges>> | ||
to use this API. | ||
|
||
[[simulate-ingest-api-desc]] | ||
==== {api-description-title} | ||
|
||
The simulate ingest API simulates ingesting data into an index. It | ||
executes the default and final pipeline for that index against a set | ||
of documents provided in the body of the request. If a pipeline | ||
contains a reroute processor, it follows that reroute processor to the | ||
new index, executing that index's pipelines as well, the same way that | |
a non-simulated ingest would. No data is indexed into {es}. Instead, | |
the transformed document is returned, along with the list of pipelines | ||
that have been executed and the name of the index where the document | ||
would have been indexed if this were not a simulation. This differs from | ||
the <<simulate-pipeline-api,simulate pipeline API>> in that you specify | |
a single pipeline for that API, and it only runs that one pipeline. The | ||
simulate pipeline API is more useful for developing a single pipeline, | ||
while the simulate ingest API is more useful for troubleshooting the | ||
interaction of the various pipelines that get applied when ingesting | |
into an index. | ||
|
||
|
||
By default, the pipeline definitions that are currently in the system | ||
are used. But you can supply substitute pipeline definitions in the | ||
body of the request. These will be used in place of the pipeline | ||
definitions that are already in the system. This can be used to replace | ||
existing pipeline definitions or to create new ones. The pipeline | ||
substitutions are only used within this request. | ||
|
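The behaviour described above can be pictured with a small Python toy model. This is only a sketch and not how {es} implements the API: `simulate`, `apply_processor`, `SYSTEM_PIPELINES`, and `INDEX_SETTINGS` are hypothetical names, only the `set` and `uppercase` processors that appear on this page are modeled, and reroute processors are ignored.

```python
from copy import deepcopy

# Hypothetical stand-ins for cluster state; the values mirror the setup used on this page.
SYSTEM_PIPELINES = {
    "my-pipeline": {"processors": [{"set": {"field": "field1", "value": "value1"}}]},
    "my-final-pipeline": {"processors": [{"set": {"field": "field2", "value": "value2"}}]},
}
INDEX_SETTINGS = {
    "index": {"default_pipeline": "my-pipeline", "final_pipeline": "my-final-pipeline"},
}


def apply_processor(processor, source):
    """Apply one processor in place. Only `set` and `uppercase` are modeled."""
    kind, config = next(iter(processor.items()))
    if kind == "set":
        source[config["field"]] = config["value"]
    elif kind == "uppercase":
        source[config["field"]] = source[config["field"]].upper()
    else:
        raise ValueError(f"processor not modeled in this sketch: {kind}")


def simulate(docs, substitutions=None):
    """Run each document through its index's default and final pipelines.

    Nothing is stored: transformed copies are returned to the caller.
    """
    # Substitute definitions shadow the stored ones for this call only.
    pipelines = {**SYSTEM_PIPELINES, **(substitutions or {})}
    results = []
    for doc in docs:
        source = deepcopy(doc["_source"])
        settings = INDEX_SETTINGS.get(doc["_index"], {})
        executed = []
        for setting in ("default_pipeline", "final_pipeline"):
            name = settings.get(setting)
            if name is None:
                continue
            for processor in pipelines[name]["processors"]:
                apply_processor(processor, source)
            executed.append(name)
        results.append({"doc": {"_id": doc.get("_id"), "_index": doc["_index"],
                                "_source": source, "executed_pipelines": executed}})
    return {"docs": results}


substitute = {"my-pipeline": {"processors": [{"uppercase": {"field": "foo"}}]}}
docs = [{"_index": "index", "_id": "123", "_source": {"foo": "bar"}}]
print(simulate(docs))              # stored definitions
print(simulate(docs, substitute))  # `my-pipeline` replaced for this call only
```

Because the substitutions are merged into a per-call dictionary, the stored definitions in the sketch are untouched after the call, which matches the statement that substitutions are only used within the request.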
||
[[simulate-ingest-api-path-params]] | ||
==== {api-path-parms-title} | ||
|
||
`<target>`:: | ||
(Optional, string) | ||
The index to simulate ingesting into. This can be overridden by specifying an index | ||
on each document. If you provide a `<target>` in the request path, it is used for any | |
documents that don’t explicitly specify an index argument. | ||
|
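The precedence between the path `<target>` and a document's own `_index` can be sketched as follows. `resolve_index` and the index name `other-index` are hypothetical and exist only for this illustration; what the API does when neither value is present is not covered by the text above, so the sketch just returns `None` in that case.

```python
from typing import Optional


def resolve_index(doc: dict, path_target: Optional[str] = None) -> Optional[str]:
    """Return the index a document would be simulated against.

    A document-level `_index` wins; otherwise the `<target>` from the
    request path is used. Returns None when neither is given (assumption:
    the documentation above does not say what the API does then).
    """
    return doc.get("_index", path_target)


docs = [
    {"_id": "1", "_source": {"foo": "bar"}},                            # falls back to the path target
    {"_id": "2", "_index": "other-index", "_source": {"foo": "rab"}},   # overrides the path target
]
for doc in docs:
    print(doc["_id"], resolve_index(doc, path_target="index"))
```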
||
[[simulate-ingest-api-query-params]] | ||
==== {api-query-parms-title} | ||
|
||
`pipeline`:: | ||
(Optional, string) | ||
Pipeline to use as the default pipeline. This can be used to override the default pipeline | ||
of the index being ingested into. | ||
|
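As a minimal sketch of how the request forms listed earlier combine with the `pipeline` query parameter, the following Python helper builds the request path. `simulate_path` is a hypothetical helper name, not part of any client library.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def simulate_path(target: Optional[str] = None, pipeline: Optional[str] = None) -> str:
    """Build the simulate ingest request path.

    `target` fills the optional `<target>` path parameter and `pipeline`
    becomes the optional `pipeline` query parameter.
    """
    if target is None:
        path = "/_ingest/_simulate"
    else:
        path = f"/_ingest/{quote(target, safe='')}/_simulate"
    if pipeline is not None:
        path += "?" + urlencode({"pipeline": pipeline})
    return path


print(simulate_path())
print(simulate_path(target="index"))
print(simulate_path(target="index", pipeline="my-pipeline"))
```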
||
|
||
[role="child_attributes"] | ||
[[simulate-ingest-api-request-body]] | ||
==== {api-request-body-title} | ||
|
||
`docs`:: | ||
(Required, array of objects) | ||
Sample documents to test in the pipeline. | ||
+ | ||
.Properties of `docs` objects | ||
[%collapsible%open] | ||
==== | ||
`_id`:: | ||
(Optional, string) | ||
Unique identifier for the document. | ||
|
||
`_index`:: | ||
(Optional, string) | ||
Name of the index that the document will be ingested into. | ||
|
||
`_source`:: | ||
(Required, object) | ||
JSON body for the document. | ||
==== | ||
|
||
`pipeline_substitutions`:: | ||
(Optional, map of strings to objects) | ||
Map of pipeline IDs to substitute pipeline definition objects. | ||
+ | ||
.Properties of pipeline definition objects | ||
[%collapsible%open] | ||
==== | ||
include::put-pipeline.asciidoc[tag=pipeline-object] | ||
==== | ||
|
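The request body fields above can be assembled programmatically. The following Python sketch uses a hypothetical helper, `build_simulate_body`, that checks only what the field list states: `docs` is an array of objects and each one carries the required `_source` object. Any further validation the server performs is not modeled.

```python
import json


def build_simulate_body(docs, pipeline_substitutions=None):
    """Assemble a simulate ingest request body as a plain dictionary."""
    if not isinstance(docs, list):
        raise TypeError("`docs` must be a list of document objects")
    for position, doc in enumerate(docs):
        # `_source` is the only required property of a `docs` object.
        if not isinstance(doc.get("_source"), dict):
            raise ValueError(f"docs[{position}] is missing the required `_source` object")
    body = {"docs": docs}
    if pipeline_substitutions:
        # Optional map of pipeline IDs to substitute pipeline definitions.
        body["pipeline_substitutions"] = pipeline_substitutions
    return body


body = build_simulate_body(
    docs=[{"_index": "index", "_id": "123", "_source": {"foo": "bar"}}],
    pipeline_substitutions={
        "my-pipeline": {"processors": [{"uppercase": {"field": "foo"}}]}
    },
)
print(json.dumps(body, indent=2))
```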
||
[[simulate-ingest-api-example]] | ||
==== {api-examples-title} | ||
|
||
|
||
[[simulate-ingest-api-pre-existing-pipelines-ex]] | ||
===== Use pre-existing pipeline definitions | ||
In this example the index `index` has a default pipeline called `my-pipeline` and a final | ||
pipeline called `my-final-pipeline`. Since both documents are being ingested into `index`, | ||
both pipelines are executed using the pipeline definitions that are already in the system. | ||
|
||
[source,console] | ||
---- | ||
POST /_ingest/_simulate | ||
{ | ||
"docs": [ | ||
{ | ||
"_index": "index", | ||
"_id": "123", | ||
"_source": { | ||
"foo": "bar" | ||
} | ||
}, | ||
{ | ||
"_index": "index", | ||
"_id": "456", | ||
"_source": { | ||
"foo": "rab" | ||
} | ||
} | ||
] | ||
} | ||
---- | ||
|
||
The API returns the following response: | ||
|
||
[source,console-result] | ||
---- | ||
{ | ||
"docs": [ | ||
{ | ||
"doc": { | ||
"_id": "123", | ||
"_index": "index", | ||
"_version": -3, | ||
"_source": { | ||
"field1": "value1", | ||
"field2": "value2", | ||
"foo": "bar" | ||
}, | ||
"executed_pipelines": [ | ||
"my-pipeline", | ||
"my-final-pipeline" | ||
] | ||
} | ||
}, | ||
{ | ||
"doc": { | ||
"_id": "456", | ||
"_index": "index", | ||
"_version": -3, | ||
"_source": { | ||
"field1": "value1", | ||
"field2": "value2", | ||
"foo": "rab" | ||
}, | ||
"executed_pipelines": [ | ||
"my-pipeline", | ||
"my-final-pipeline" | ||
] | ||
} | ||
} | ||
] | ||
} | ||
---- | ||
|
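When troubleshooting, it can help to reduce a response like the one above to the target index and executed pipelines per document. The following Python sketch does that for a copy of the example response; `summarize` is a hypothetical helper name.

```python
# Copy of the example response above.
response = {
    "docs": [
        {"doc": {"_id": "123", "_index": "index", "_version": -3,
                 "_source": {"field1": "value1", "field2": "value2", "foo": "bar"},
                 "executed_pipelines": ["my-pipeline", "my-final-pipeline"]}},
        {"doc": {"_id": "456", "_index": "index", "_version": -3,
                 "_source": {"field1": "value1", "field2": "value2", "foo": "rab"},
                 "executed_pipelines": ["my-pipeline", "my-final-pipeline"]}},
    ]
}


def summarize(response):
    """Return one (id, target index, executed pipelines) tuple per document."""
    return [
        (entry["doc"]["_id"], entry["doc"]["_index"], entry["doc"]["executed_pipelines"])
        for entry in response["docs"]
    ]


for doc_id, index, pipelines in summarize(response):
    print(f"{doc_id}: would be indexed into {index!r} after {' -> '.join(pipelines)}")
```

A list is returned rather than a dictionary keyed by `_id` because nothing in the request format prevents two documents from sharing an ID.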
||
[[simulate-ingest-api-request-body-ex]] | ||
===== Specify a pipeline substitution in the request body | ||
In this example the index `index` has a default pipeline called `my-pipeline` and a final | ||
pipeline called `my-final-pipeline`. But a substitute definition of `my-pipeline` is | ||
provided in `pipeline_substitutions`. The substitute `my-pipeline` will be used in place of | ||
the `my-pipeline` that is in the system, and then the `my-final-pipeline` that is already | ||
defined in the system will be executed. | ||
|
||
[source,console] | ||
---- | ||
POST /_ingest/_simulate | ||
{ | ||
"docs": [ | ||
{ | ||
"_index": "index", | ||
"_id": "123", | ||
"_source": { | ||
"foo": "bar" | ||
} | ||
}, | ||
{ | ||
"_index": "index", | ||
"_id": "456", | ||
"_source": { | ||
"foo": "rab" | ||
} | ||
} | ||
], | ||
"pipeline_substitutions": { | ||
"my-pipeline": { | ||
"processors": [ | ||
{ | ||
"uppercase": { | ||
"field": "foo" | ||
} | ||
} | ||
] | ||
} | ||
} | ||
} | ||
---- | ||
|
||
The API returns the following response: | ||
|
||
[source,console-result] | ||
---- | ||
{ | ||
"docs": [ | ||
{ | ||
"doc": { | ||
"_id": "123", | ||
"_index": "index", | ||
"_version": -3, | ||
"_source": { | ||
"field2": "value2", | ||
"foo": "BAR" | ||
}, | ||
"executed_pipelines": [ | ||
"my-pipeline", | ||
"my-final-pipeline" | ||
] | ||
} | ||
}, | ||
{ | ||
"doc": { | ||
"_id": "456", | ||
"_index": "index", | ||
"_version": -3, | ||
"_source": { | ||
"field2": "value2", | ||
"foo": "RAB" | ||
}, | ||
"executed_pipelines": [ | ||
"my-pipeline", | ||
"my-final-pipeline" | ||
] | ||
} | ||
} | ||
] | ||
} | ||
---- | ||
|
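One way to see what a substitution changed is to compare the `_source` returned with the stored definitions against the `_source` returned with the substitute. The Python sketch below does this for document `123` from the two example responses on this page; `diff_sources` is a hypothetical helper name.

```python
def diff_sources(before, after):
    """Compare two `_source` objects field by field (top-level fields only)."""
    added = {key: after[key] for key in after.keys() - before.keys()}
    removed = {key: before[key] for key in before.keys() - after.keys()}
    changed = {
        key: (before[key], after[key])
        for key in before.keys() & after.keys()
        if before[key] != after[key]
    }
    return {"added": added, "removed": removed, "changed": changed}


# `_source` of document 123 with the stored pipeline definitions ...
stored = {"field1": "value1", "field2": "value2", "foo": "bar"}
# ... and with the substitute `my-pipeline` that uppercases `foo`.
substituted = {"field2": "value2", "foo": "BAR"}

print(diff_sources(stored, substituted))
```

Here `field1` is reported as removed because the substitute `my-pipeline` no longer sets it, and `foo` is reported as changed by the `uppercase` processor.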
||
//// | ||
[source,console] | ||
---- | ||
DELETE /index | ||
|
||
DELETE /_ingest/pipeline/* | ||
---- | ||
|
||
[source,console-result] | ||
---- | ||
{ | ||
"acknowledged": true | ||
} | ||
---- | ||
//// |
In this and in the doc for the existing simulate, I'm surprised there is no mention of the simulation/testing aspect. In isolation, this reads as if it actually executes the pipeline (and thus indexes some data).
There's a line below that reads "No data is indexed into $Elasticsearch". I'll add a line about the intended use of this API at the top though.