-
Notifications
You must be signed in to change notification settings - Fork 75
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add sample "input" packages #325
Conversation
categories: | ||
- custom | ||
policy_templates: | ||
- name: first_policy_template |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't put the elasticsearch
block here. Should I add few/all options from here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I wouldn't see a reason not to support the same options. Unless there is something we want to deprecate from the data stream manifests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, we have:
elasticsearch.index_template.settings
elasticsearch.index_template.mappings
elasticsearch.index_template.ingest_pipeline.name
elasticsearch.privileges.indices
As we don't know in advance what would be the target index/ingest pipeline, we can't set these properties:
elasticsearch.index_template.ingest_pipeline.name
elasticsearch.privileges.indices
Am I right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we don't know in advance what would be the target index/ingest pipeline, we can't set these properties:
elasticsearch.index_template.ingest_pipeline.name
elasticsearch.privileges.indices
Umm, I don't see these are used in current integrations repo. What are use cases for these options in integration packages? Could they apply in input packages?
In any case, maybe you are right with not adding the elasticsearch
block by now, so we start small and we see later possible use cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Umm, I don't see these are used in current integrations repo. What are use cases for these options in integration packages? Could they apply in input packages?
If you can find any package here, then most likely it's APM or Endpoint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
++ on having 2-3 sample packages to derive the spec.
multi: true | ||
required: true | ||
show_user: true | ||
default: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this is a sample package but I wonder if an input package should even have defaults here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that "defaults" can limit the amount of work on the initial configuration. There might be cases where you don't know what to do and would like to kick off the ingesting pipeline with default settings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would keep support for default
, but I wouldn't put any default paths
in the package for custom logs. I think that we can expect that a user knows at least what logs they want to collect :)
An example of a default
in a log input package could be to have a variable to support ignore_older
, that would default to 0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not real Custom logs
package. I rather wanted to present all options there, which seem to be valid in terms of spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, you can have paths
as example without defaults, and ignore_older
as example with defaults 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed!
license: basic | ||
categories: | ||
- custom | ||
policy_templates: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do policy templates make sense for input packages? It is really convenient that we can reuse the same logic as we already have so I'm not necessarily asking to change anything here but have a discussion.
- Can an input package have more then one policy_templates?
- Does an input package need all features of policy_templates?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does an input package need all features of policy_templates?
It looks like we can improve on this. These are the properties available in integration's policy_template
:
- name
- title
- categories
- description
- data_streams
- inputs
- multiple (?)
- icons
- screenshot
Clearly, we don't need all of them, so I assumed that this policy_template
(a different one in terms of spec) will contain a subset (or a different set) of properties. If you consider a different name for policy_template
, do you have any suggestions? maybe an input_template
?
Can an input package have more then one policy_templates?
Good question, but I believe that the answer might be the same as for policy_templates
. In most cases, we will need one. What should we do in the future if we scope this for one only :)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the spec allow to use the same name but for example if it is defined as input package, to not allow certain params?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There will be totally new spec files here, so yes. We may consider extracting some common parts or referring to the integration
type temporarily (for example, to fields
properties).
Once we agree on the look, I will write down the spec files. I don't want to iterate on both at the same time as it usually means more work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice.
categories: | ||
- custom | ||
policy_templates: | ||
- name: first_policy_template |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I wouldn't see a reason not to support the same options. Unless there is something we want to deprecate from the data stream manifests.
multi: true | ||
required: true | ||
show_user: true | ||
default: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would keep support for default
, but I wouldn't put any default paths
in the package for custom logs. I think that we can expect that a user knows at least what logs they want to collect :)
An example of a default
in a log input package could be to have a variable to support ignore_older
, that would default to 0.
multi: false | ||
required: true | ||
show_user: false | ||
default: "variables" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For such cases, it would be great to have a pattern or allowed values. cc @joshdover
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I played with patterns (as regexp) here #245. I would prefer a list of allowed values :)
Do you think that we need more sample packages or is it enough to start drafting spec files? |
{{#each tags}} | ||
- {{this}} | ||
{{/each}} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is a few minimum additions to this, that needs to be in every input package, and many of them are in other integrations as well:
Pipelines:
{{#if pipeline}}
pipeline: {{pipeline}}
{{/if}}
Tags are slightly wrong, they should be like this:
{{#if tags}}
tags:
{{#each tags as |tag i|}}
- {{tag}}
{{/each}}
{{/if}}
Since an input can ingest data both from a local host and external host, we need to add support for removing the host fields:
{{#contains "forwarded" tags}}
publisher_pipeline.disable_host: true
{{/contains}}
And custom processors:
{{#if processors}}
processors:
{{processors}}
{{/if}}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Custom processing has been intentionally left out of the MVP, it would also require custom mappings.
Also, depending on how this is implemented, this wouldn't require anything on the spec, the pipelines and the mappings could be configured in Fleet and installed by it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a few of your ideas to the sample package.
description: Collect your custom log files. | ||
input: logfile | ||
template_path: input.yml.hbs | ||
vars: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we have all the settings here, does that mean we won't have any datastream at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any data stream directories, but with Kibana UI you will be able to select/create the target data stream.
Co-authored-by: Marius Iversen <[email protected]>
Should we add one input package that has an open endpoint like http as an additional example? https://github.com/elastic/package-storage/tree/production/packages/http_endpoint/1.0.1 @andrewkroh Would be good to get your eyes also on this. |
I think the more useful would be |
httpjson is useful too. The reason I mentioned the http_endpoint is that it is different from the others as it is a push instead of pull endpoint. |
Ok, I think that 3 packages are sufficient to show our intention with input packages. From a developer perspective, I would say that the biggest difference is in immersing the data stream manifest in the package manifest. Apart from that, this is a relatively similar type to the I will keep the PR hanging until next week to collect more comments, then will focus on spec files. |
{{/if}} | ||
|
||
{{#if pipeline}} | ||
pipeline: {{pipeline}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would remove this from here. Ok that this is only an example in the config, but it may look like if we support custom processing using pipelines, but is something that we are not doing in the MVP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But we already support both processors and ingest pipelines in the current input packages, are we planning on removing them @jsoriano ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have settings in input-like packages that happen to support custom processing, but this is not completely integrated with the solution, and is not enforced to be coherent with other inputs, or to require field mappings for possible new fields that are generated by this processing.
If we follow this way, every input package needs to implement its own way to support custom processing and custom fields. Now that we start with a clean state for this new package type, we would like to provide a more integrated solutions for this.
Custom pipelines support is still an open discussion for packages in general.
We consider that inputs without any custom processing already provide value in use cases of centralized log collection, and we are currently planning to go in this direction for the initial MVP. Next steps after that would be towards supporting custom processing.
It is though a good question how we are going to migrate users from the current input-like integration packages to the new input packages. Depending on the answer we may need to maintain these settings for backwards compatibility, maybe deprecating them, and eventually removing them when we have a complete solution for this.
multi: false | ||
required: true | ||
show_user: false | ||
default: "variables" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I played with patterns (as regexp) here #245. I would prefer a list of allowed values :)
This PR is a follow-up on the design document.
I will start with the sample package and once we agree on properties, I will write down the spec.