Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[receiver/azureeventhub] Use attributes/fields to represent Azure logs #16283

Closed
loomis-relativity opened this issue Nov 14, 2022 · 12 comments
Closed
Labels

Comments

@loomis-relativity
Copy link
Contributor

Component(s)

receiver/azureeventhub

Is your feature request related to a problem? Please describe.

The current implementation dumps the entire Azure log into the body of the OpenTelemetry log record as a byte array. This causes the log to be complete opaque and unusable in our storage backends (one of which happens to be New Relic).

Describe the solution you'd like

Given that the Azure log records are already structured and follow a standard format, we prefer that the Azure log was parsed and then injected into the OpenTelemetry log record as attributes.

Describe alternatives you've considered

None.

Additional context

If this is acceptable, we'd be willing to contribute to the implementation.

@loomis-relativity loomis-relativity added enhancement New feature or request needs triage New item requiring triage labels Nov 14, 2022
@github-actions
Copy link
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@djaglowski
Copy link
Member

I think it makes sense that the receiver would be able to parse the logs into the OTel data model.

@loomis-relativity, any thoughts on how the configuration would need to change in order to support either mode?

@djaglowski djaglowski added priority:p2 Medium and removed needs triage New item requiring triage labels Nov 14, 2022
@djaglowski djaglowski changed the title use attributes/fields to represent Azure logs [receiver/azureeventhub] Use attributes/fields to represent Azure logs Nov 14, 2022
@atoulme
Copy link
Contributor

atoulme commented Nov 14, 2022

Great feedback, thanks! The kafka receiver uses the notion of codecs to allow setting up different behaviors, look at the encoding config key: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/kafkareceiver/README.md
I guess we can go with the same approach here and allow different behaviors - the current behavior would be called raw and this new one named azureresourcelog. Does that sound good?

@djaglowski
Copy link
Member

This makes sense to me. Which behavior should be the default? (I tend to think it should be whichever requires less user configuration in order to be useful.)

@atoulme
Copy link
Contributor

atoulme commented Nov 14, 2022

I think once the azureresourcelog has landed and is validated, it should be default, I'd make it a separate PR.

@loomis-relativity
Copy link
Contributor Author

@djaglowski @atoulme I think having a simple switch for the behavior is fine. Once the log has been translated to OpenTelemetry format, users can use processors to do further transformations if they need it; no need to add switches for that here.
I'm fine with either one being the default.

@atoulme
Copy link
Contributor

atoulme commented Nov 14, 2022

@loomis-relativity so to be super clear: your contribution is most welcome. That's a great development :D

@loomis-relativity
Copy link
Contributor Author

@atoulme We'll have to get it pulled into a sprint, but there's no problem in principle for us to do the work. Legal has already approved contributions to OpenTelemetry, so there shouldn't be further delays once we've got something. Don't wait on us though, if you want to tackle it.

@atoulme
Copy link
Contributor

atoulme commented Nov 14, 2022

There's several things to do, so let's just make a list here and whoever can take this can take some of it rather than the brunt of the work:

  • Add a config option allowing to change encoding, with one option raw that is default
  • Add a translation logic that does azure resource log to native otel log (this is where you can help the most @loomis-relativity)
  • Add a new codec azureresourcelog that uses this logic
  • Make azureresourcelog default config option

@loomis-relativity
Copy link
Contributor Author

loomis-relativity commented Nov 14, 2022

@atoulme @djaglowski Below is a proposed mapping between the Azure resource logs schema and OpenTelemetry. Comments?

Azure OpenTelemetry OpenTelemetry Location comments
time (required) time_unix_nano field  
resourceId (required) resourceId resource attribute map to service.name? see service semantic conventions
tenantId (required, tenant logs) tenantId resource attribute  
operationName (required) operationName attribute  
operationVersion (optional) operationVersion attribute  
category (optional) category attribute  
resultType (optional) resultType attribute  
resultSignature (optional) resultSignature attribute  
resultDescription (optional) resultDescription attribute  
durationMs (optional) duration attribute used as just "duration" in various places in semantic conventions with ms for units
callerIpAddress (optional) net.sock.peer.addr attribute see network semantic conventions; not clear if this is appropriate
correlationId (optional) trace_id field  
identity (optional) identity attribute (nested) add option for flattening? map to enduser.* attributes from semantic conventions?
Level (optional) severity_number, severity_text field For severity_number: "Informational" → SEVERITY_NUMBER_INFO; "Warning" → SEVERITY_NUMBER_WARN; "Error" → SEVERITY_NUMBER_ERROR; "Critical" → SEVERITY_NUMBER_FATAL; severity_text copied verbatim from Azure value
location (optional) cloud.region resource attribute see cloud semantic conventions
cloud.provider resource attribute see cloud semantic conventions
properties (optional) properties attribute (nested) add option for flattening?

@atoulme
Copy link
Contributor

atoulme commented Nov 14, 2022

At a glance, it makes sense to me. But I'm not the right person to review this. I recommend that you look at the existing mappings here: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/logs/data-model.md#splunk-hec and open a PR under the specification repository with this table for review.

@loomis-relativity
Copy link
Contributor Author

@atoulme @djaglowski See the PR (#16357) for an initial implementation. I'll pull this into our custom collector to test it. Unfortunately, that will happen after the Thanksgiving break. In the meantime, any feedback on the PR is welcome. You should be able to modify the PR if necessary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants