Cat endpoint for ingest pipelines #31954

Open
markwalkom opened this issue Jul 11, 2018 · 8 comments
Labels
:Data Management/CAT APIs Text APIs behind /_cat :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP Team:Data Management Meta label for data/management team

Comments

@markwalkom
Contributor

markwalkom commented Jul 11, 2018

Describe the feature:
It'd be great if there were a _cat endpoint that showed the defined pipelines.

It might mean we need to save a bit more info into cluster state with each pipeline, perhaps things like created and updated times. I don't know if there's more metric-based info planned for these, but things like processed/failed event counts (for example) might also be useful.

@markwalkom markwalkom added :Data Management/CAT APIs Text APIs behind /_cat :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP labels Jul 11, 2018
@elasticmachine
Collaborator

Pinging @elastic/es-core-infra

@rjernst
Member

rjernst commented Jul 19, 2018

We are discussing adding stats, such as the number of processed/failed ingestions, to the existing stats APIs. Given that the configured names are not generally useful from an operator point of view, I'm going to close this in favor of that future work.

@rjernst rjernst closed this as completed Jul 19, 2018
@inqueue
Member

inqueue commented Mar 21, 2019

As the use of ingest pipelines picks up with Beats modules, so does the need to eventually purge older, unused, and unwanted pipelines. Having a _cat endpoint for pipelines could help with their discovery, like _cat/templates or _cat/aliases. GET /_ingest/pipeline from Kibana Dev Tools is OK, but it returns only the full JSON output, when sometimes just a list of pipeline names is needed.

Consider an older cluster that was created with 5.6 that is now running 7.0. It would have these pipelines if it accepts data from the Filebeat Nginx module:

curl $CURL/_ingest/pipeline | jq '.|keys'
[
  "filebeat-5.6.15-nginx-access-default",
  "filebeat-5.6.15-nginx-error-pipeline",
  "filebeat-6.6.1-nginx-error-pipeline",
  "filebeat-6.6.2-nginx-access-default",
  "filebeat-6.6.2-nginx-error-pipeline",
  "filebeat-6.7.0-nginx-access-default",
  "filebeat-6.7.0-nginx-error-pipeline",
  "filebeat-7.0.0-rc1-nginx-access-default",
  "filebeat-7.0.0-rc1-nginx-error-pipeline",
  "xpack_monitoring_2",
  "xpack_monitoring_6",
  "xpack_monitoring_7"
]

That is 662 lines of JSON in the Dev Console. Several pipelines here need to be removed, and most, if not all, could be unknown to the admin anyway. And this is just one Filebeat module. Other modules ship with their own pipelines, so eventually some management is needed here (a separate issue is probably needed for Beats).
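To illustrate the cleanup problem, here is a minimal Python sketch that groups the pipeline names by Filebeat version so that old ones stand out. The naming scheme in the regex is an assumption inferred from the listing above, `group_by_version` is a hypothetical helper, and `sample` is a trimmed stand-in for the GET /_ingest/pipeline response rather than real cluster output.

```python
import re
from collections import defaultdict

# Assumed naming scheme, inferred from the listing above:
#   filebeat-<version>[-rcN|-alphaN|-betaN]-<module>-<rest>
FILEBEAT_NAME = re.compile(
    r"^filebeat-(\d+\.\d+\.\d+(?:-(?:rc|alpha|beta)\d+)?)-(.+)$"
)


def group_by_version(pipelines):
    """Map each Filebeat version to the sorted pipeline names that carry it.

    `pipelines` is the parsed response body: a dict keyed by pipeline name.
    Names that do not follow the assumed scheme are skipped.
    """
    groups = defaultdict(list)
    for name in sorted(pipelines):
        match = FILEBEAT_NAME.match(name)
        if match:
            groups[match.group(1)].append(name)
    return dict(groups)


# Trimmed stand-in for the response body (pipeline definitions left empty).
sample = {
    "filebeat-5.6.15-nginx-access-default": {},
    "filebeat-6.6.2-nginx-access-default": {},
    "filebeat-6.6.2-nginx-error-pipeline": {},
    "filebeat-7.0.0-rc1-nginx-access-default": {},
    "xpack_monitoring_7": {},
}

for version, names in group_by_version(sample).items():
    print(version, len(names))
```

Deciding which versions are safe to delete is still up to the admin; the sketch only makes the candidates visible.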

Could we give this feature request another consideration?

cc @ruflin

@inqueue inqueue reopened this Mar 21, 2019
@martijnvg martijnvg removed the :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP label Oct 11, 2019
@m9aertner
Contributor

A /_cat/pipelines endpoint would also be nice for consistency with /_cat/aliases and others.

When no pipeline is defined, /_ingest/pipeline as shown above returns a 404 error, whereas 200 OK is returned when there are no aliases. I prefer the latter, as it's more convenient for fail-fast scripting.

Note that /_ingest/pipeline/* and /_ingest/pipeline/my-* also show 404.
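Until that changes, a script can normalize the 404 itself. A minimal Python sketch, assuming the caller already has the HTTP status and response body from whatever client it uses; `pipeline_names` is a hypothetical helper, not part of any Elasticsearch client:

```python
import json


def pipeline_names(status, body):
    """Return the sorted pipeline names from a GET /_ingest/pipeline response.

    A 404 is treated as "no pipelines defined" and yields an empty list.
    Any other non-200 status raises, so the script still fails fast.
    """
    if status == 404:
        return []
    if status != 200:
        raise RuntimeError(f"unexpected HTTP status {status}")
    # The response body is a JSON object keyed by pipeline name.
    return sorted(json.loads(body))


print(pipeline_names(404, "{}"))
print(pipeline_names(200, '{"my-b": {}, "my-a": {}}'))
```

One caveat: a 404 caused by a mistyped path would be swallowed as well, so this is only safe when the URL itself is known to be correct.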

@rjernst rjernst added the Team:Data Management Meta label for data/management team label May 4, 2020
@jakelandis jakelandis added the :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP label Nov 12, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-core-features (:Core/Features/Ingest)

@danhermann
Contributor

danhermann commented Mar 8, 2021

Note that a "summary" option was added to the existing GET _ingest/pipeline API in #69756. It addresses the need for a more concise listing of ingest pipelines, although not with a new CAT endpoint. Please comment here if that does not address your use case, as this issue is likely to be deprioritized now that there is a way to concisely list ingest pipelines.
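For scripting against that option, a rough Python sketch follows. It assumes the option is passed as the query parameter `summary=true` and that the response is still a JSON object keyed by pipeline name; both are assumptions to verify against the API documentation for your version. `summary_url` and `as_cat_lines` are hypothetical helpers, and the host is a placeholder.

```python
from urllib.parse import urlencode, urlunsplit


def summary_url(host, pattern=""):
    """Build the request URL for a summarized pipeline listing.

    Assumption: the option is exposed as the query parameter summary=true.
    """
    path = "/_ingest/pipeline" + (f"/{pattern}" if pattern else "")
    query = urlencode({"summary": "true"})
    return urlunsplit(("http", host, path, query, ""))


def as_cat_lines(response):
    """Render the parsed response as one pipeline name per line.

    Assumption: the response is a dict keyed by pipeline name.
    """
    return "\n".join(sorted(response))


print(summary_url("localhost:9200", "filebeat-*"))
print(as_cat_lines({"xpack_monitoring_7": {}, "filebeat-7.0.0-rc1-nginx-access-default": {}}))
```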

@m9aertner
Contributor

That sounds great!

For the record, I ended up using the _nodes/stats API to check if a specific pipeline exists or not. That has worked nicely for some time now. It is used as part of a CI/deployment pipeline to keep instances updated. Mostly still on ES 6.8, but works with 7.x, too. When absent, the pipeline will be created in a subsequent step.

$ http :9200/_nodes/stats groups==ingest | jq -r '.nodes | select(.!=null) | to_entries[].value.ingest.pipelines | has("my-pipeline-name")'
true
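The same check can be sketched in Python for a deployment script. This assumes the _nodes/stats response has already been fetched and parsed into a dict; `pipeline_exists` is a hypothetical helper and `sample` is made-up data shaped like the fields the jq filter above reads. Unlike the jq one-liner, which prints one boolean per node, it collapses the result into a single answer.

```python
def pipeline_exists(nodes_stats, name):
    """Return True if any node reports ingest stats for the named pipeline.

    `nodes_stats` is the parsed _nodes/stats response. A missing or null
    "nodes" entry is treated as "no nodes", mirroring select(. != null).
    """
    nodes = nodes_stats.get("nodes") or {}
    return any(
        name in node.get("ingest", {}).get("pipelines", {})
        for node in nodes.values()
    )


# Made-up sample containing only the fields the check looks at.
sample = {
    "nodes": {
        "node-1": {"ingest": {"pipelines": {"my-pipeline-name": {}}}},
        "node-2": {"ingest": {"pipelines": {}}},
    }
}

print(pipeline_exists(sample, "my-pipeline-name"))
print(pipeline_exists(sample, "missing-pipeline"))
```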

Note that I only commented above; I am not the original poster.

@dmase004

This would be a great feature for us in our environment. The ability to keep track of certain pipelines would help us expand as we try to make new pipelines that reference old pipelines.

9 participants