Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[receiver/syslog] expose the syslog parser config from the syslog receiver #36906

Conversation

namco1992
Copy link
Contributor

Description

The syslog receiver relies on the stanza syslog input. Among other inputs, the syslog input is a bit special because it's a wrapper of a TCP/UDP input with a syslog parser. The user can't configure the syslog parser because it's not exposed from the syslog input config, which means that some useful features like the on_error mode or parse_from/to is unavailable if you are using syslog receiver.

This PR adds the parser config to the syslog input config under the parser field. I chose not to flatten it because the existing InputConfig.WriterConfig will conflict with the ParserConfig.TransformerConfig.WriterConfig. We can override it later in the Build func, but I guess it's a bit implicit and error-prone, so I choose to add a new parser field. I'm open to changing it if the owners think otherwise.

This change should be backward-compatible.

Testing

Tested locally with newly added config.

Documentation

Added the parser config documentation to the syslog receiver README.

…eiver

The syslog receiver relies on the stanza syslog input. Among other
inputs, the syslog input is a bit special because it's a wrapper of a
TCP/UDP input with a syslog parser. The user can't configure the syslog
parser because it's not exposed from the syslog input config, which
means that some useful features like the `on_error` mode or
`parse_from/to` is unavailable if you are using syslog receiver.

This PR adds the parser config to the syslog input config, under
`parser` field. I choose not to flatten it because the existing
`InputConfig.WriterConfig` will be conflicted with the
`ParserConfig.TransformerConfig.WriterConfig`. We can overrides it later
in the `Build` func, but I guess it's a bit implicit and error-prone,
therefore I choose to add a new `parser` field. I'm open to change it if
the owners think otherwise.

This change should be backward-compatible.
@namco1992
Copy link
Contributor Author

https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/12425577954/job/34692561655?pr=36906
It seems the TestComponentConfigStruct is unhappy with the ParserConfig. I indeed didn't find another example that exposes the ParserConfig on the receiver level, is it a controversial thing to do? Would appreciate any guidance on this.

Copy link
Member

@djaglowski djaglowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't agree that there's any reason to expose most of these settings. The only one that makes any sense to me is on_error because it can provide useful behavior options.

All of the other fields suggested here are for things that instruct the parser where to find things, which is already defined due to the behavior of the tcp and udp inputs.

If you want to customize behavior further, why not use tcp or udp receivers and configure the syslog parser manually?

@namco1992
Copy link
Contributor Author

Hi @djaglowski, thanks for the reply.

The only one that makes any sense to me is on_error because it can provide useful behavior options.

If this makes some sense, then is it possible to at least expose this config?

All of the other fields suggested here are for things that instruct the parser where to find things, which is already defined due to the behavior of the tcp and udp inputs.

Can you elaborate more on this point? The TCP/UDP inputs didn't really "define" any parser config other than init a default config. I'm also not sure why it would be a bad thing to grant users the power to config the parser.

If you want to customize behavior further, why not use tcp or udp receivers and configure the syslog parser manually?

Yes, I guess we would have to go down that route if this can't happen upstream. However, there is some config that we can't touch from the outside, like setting the SplitFuncBuilder to OctetSplitFuncBuilder:

if syslogParserCfg.EnableOctetCounting {
tcpInputCfg.SplitFuncBuilder = OctetSplitFuncBuilder
}

CMIIW, but it seems we can't really have a 1-1 replication of the syslog receiver by combining TCP/UDP receiver with syslog parser.

Lastly, the syslog_input documentation claims that it supports https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/f21f7183cafdb2ec61fbb23692e9ee37b4f871c2/pkg/stanza/docs/operators/syslog_parser.md#configuration-fields, but that's not true since the parser config is not configurable.

| `syslog` | required | A [syslog parser config](./syslog_parser.md#configuration-fields) to defined syslog_parser operator. |

I understand that it might be a niche request, but I believe it's a reasonable one(or probably because I needed it 🙊 ). I'm ok with exposing the ones that you think make sense.

Copy link
Contributor

github-actions bot commented Jan 5, 2025

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Jan 5, 2025
@namco1992
Copy link
Contributor Author

Hi @djaglowski, is there anything I can do to reopen this discussion? Or should I consider it as a "no no"? Thanks!

@github-actions github-actions bot removed the Stale label Jan 10, 2025
@djaglowski
Copy link
Member

The only one that makes any sense to me is on_error because it can provide useful behavior options.

If this makes some sense, then is it possible to at least expose this config?

Yes, I am in support of exposing this.

All of the other fields suggested here are for things that instruct the parser where to find things, which is already defined due to the behavior of the tcp and udp inputs.

Can you elaborate more on this point? The TCP/UDP inputs didn't really "define" any parser config other than init a default config.

The syslog input embeds the tcp & udp inputs, which means there is no ambiguity as to where the syslog parser should look when trying to find the syslog. (e.g. syslog comes into tcp input -> tcp input places it in the body -> syslog parser look for it in the body.)

I'm also not sure why it would be a bad thing to grant users the power to config the parser.

The settings are key to the seamless integration between the inputs and the parser. What good is exposing parse_from if the value is 100% known? It only complicates the configuration and introduces a way for things to not work correctly.

However, there is some config that we can't touch from the outside, like setting the SplitFuncBuilder to OctetSplitFuncBuilder:

if syslogParserCfg.EnableOctetCounting {
tcpInputCfg.SplitFuncBuilder = OctetSplitFuncBuilder
}

You can configure this by setting enable_octet_counting: true. It's right there in the code you linked: if syslogParserCfg.EnableOctetCounting

Lastly, the syslog_input documentation claims that it supports https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/f21f7183cafdb2ec61fbb23692e9ee37b4f871c2/pkg/stanza/docs/operators/syslog_parser.md#configuration-fields, but that's not true since the parser config is not configurable.

It supports all the fields in the syslog parser's "BaseConfig" which as you've noted excludes the fields that are defined in a generic parser.

@namco1992
Copy link
Contributor Author

namco1992 commented Jan 16, 2025

You can configure this by setting enable_octet_counting: true. It's right there in the code you linked: if syslogParserCfg.EnableOctetCounting

Let's rewind a bit so I can clarify my understanding. IIUC, the alternative you suggested is TCP/UDP log receiver + syslog parser, and the config could look like this:

receivers:
  tcplog:
    listen_address: 0.0.0.0:601
    operators:
      - type: syslog_parser
        protocol: rfc5424
        enable_octet_counting: true
        parse_to: body

During initialization, the tcp_input and syslog_parser are initialized separately. The SplitFuncBuilder is not exposed as a config, it will be nil and set to a default one.

if c.SplitFuncBuilder == nil {
c.SplitFuncBuilder = c.defaultSplitFuncBuilder
}

The syslog_parser is then initialized as usual, with enable_octet_counting: true, but that won't affect how tcp_input builds the SplitFunc.

What I linked in the previous comment is from syslog_input, which is what we want to mimic with tcp_input and syslog_parser. However, the config in syslog_parser won't affect how tcp_input is initialized. We can expose the SplitFuncBuilder config from tcp_input, but as of now I think the statement below still stands:

it seems we can't really have a 1-1 replication of the syslog receiver by combining TCP/UDP receiver with syslog parser.

For the rest of your comment, I don't really have a strong opinion, I will update my PR to only expose the on_error so we can get the ball rolling first.

@djaglowski
Copy link
Member

What I linked in the previous comment is from syslog_input, which is what we want to mimic with tcp_input and syslog_parser.

I still cannot understand why you would need an alternative. on_error is the only use case I've comprehended here. If you can articulate why another field that is not currently exposed would ever need to be customized, then we can talk about exposing it. Otherwise, you should just be able to use syslog_input.

@namco1992
Copy link
Contributor Author

Yup sure, I guess since you mentioned that possibility then I got too obsessed with it 😂 Will PR add the on_error field only.

Copy link
Contributor

github-actions bot commented Feb 7, 2025

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Feb 7, 2025
djaglowski pushed a commit that referenced this pull request Feb 13, 2025
…37847)

<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description
As per the discussion in
#36906,
this PR supports setting the `on_error` mode in syslog receiver that
controls the behaviour of the underlying `syslog_parser` when it
encounters an error.

<!-- Issue number (e.g. #1234) or full URL to issue, if applicable. -->
#### Link to tracking issue
Closes
#36906.

<!--Describe what testing was performed and which tests were added.-->
#### Testing
Tested setting `on_error` in the syslog receiver config.

<!--Describe the documentation added.-->
#### Documentation
Documented the `on_error` entry in both the syslog input and receiver
config.
<!--Please delete paragraphs that you did not use before submitting.-->

Signed-off-by: Mengnan Gong <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants