Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(api-server): fix trace/span ID processing in logs #10100

Conversation

bartsmykla
Copy link
Contributor

Problem:

Currently, processing trace/span IDs before logging them (e.g., converting them to Datadog format) requires calling NewSpanLogValuesProcessorContext with the runtime.Extensions() context. However, this context isn't always available, particularly when handling errors in REST endpoint handlers through AddFieldsFromCtx. This inconsistency leads to the processing function being ignored.

Proposed Solutions:

  • Pass runtime.Extensions() Context to HandleError: While effective, this approach adds significant boilerplate code and requires disabling the contextcheck linter in multiple locations (see PR chore(api-server): pass extensions context to rest endpoints #10094).

  • Leverage slog's Contextual Logging (ref: rework AddFieldsFromCtx to be less combersome #8216): This approach offers cleaner code but requires significant changes to our logging infrastructure (ones which I immediately faced when tried to do a quick POC):

    • slog lacks the concept of logger names, necessitating a custom slog.Handler for maintaining desired log structure.
    • We'd need to abandon using globally configured/named loggers or potentially introduce deferred logger initialization logic for slog.

Current Solution (For This PR):

To address the issue without a major logging overhaul, this PR proposes a simpler solution:

  • Pass the logger instance through the context when constructing ApiServer.
  • Restore the logger from the context in HandleError function.

This approach, while not ideal due to the passing logger via context, effectively resolves the problem without introducing complex logging changes.

Additional changes:

I adjusted the signature of NewApiServer function to accept the whole runtime.Runtime, which reduced significantly the amount of necessary arguments, which were attained from the runtime.Runtime. It makes contextcheck linter happy as we don't have to pass runtime.Extensions() context as an argument, which linter would flag as passed and not used, without realising it's not the regular context.

Checklist prior to review

  • Link to relevant issue as well as docs and UI issues
    • No relevant issues
  • This will not break child repos: it doesn't hardcode values (.e.g "kumahq" as a image registry) and it will work on Windows, system specific functions like syscall.Mkfifo have equivalent implementation on the other OS
    • It won't
  • Tests (Unit test, E2E tests, manual test on universal and k8s)
    • Manually tested by running local OTEL collector and then manually triggering handlers to return error with ant without span logging processors
    • Don't forget ci/ labels to run additional/fewer tests
  • Do you need to update UPGRADE.md?
    • There is no need
  • Does it need to be backported according to the backporting policy? (this GH action will add "backport" label based on these file globs, if you want to prevent it from adding the "backport" label use no-backport-autolabel label)
    • There is no need I think 🤔

Problem:
Currently, processing trace/span IDs before logging them (e.g.,
converting them to Datadog format) requires calling
`NewSpanLogValuesProcessorContext` with the `runtime.Extensions()`
context. However, this context isn't always available, particularly when
 handling errors in REST endpoint handlers through `AddFieldsFromCtx`.
 This inconsistency leads to the processing function being ignored.

Proposed Solutions:
- Pass `runtime.Extensions()` Context to `HandleError`: While effective,
 this approach adds significant boilerplate code and requires disabling
 the `contextcheck` linter in multiple locations (see PR 10094).

- Leverage slog's Contextual Logging (ref issue 8216): This approach
offers cleaner code but requires significant changes to our logging
infrastructure (ones which I immediately faced when tried to do a quick
POC):
  - slog lacks the concept of logger names, necessitating a custom
    `slog.Handler` for maintaining desired log structure.
  - We'd need to abandon using globally configured/named loggers or
  potentially introduce deferred logger initialization logic for slog.

Current Solution (For This PR):
To address the issue without a major logging overhaul, this PR proposes
a simpler solution:

- Pass the logger instance through the context when constructing
  ApiServer.
- Restore the logger from the context in `HandleError` function.

This approach, while not ideal due to the passing logger via context,
effectively resolves the problem without introducing complex logging
changes.

Additional changes:
I adjusted the signature of `NewApiServer` function to accept the whole
`runtime.Runtime`, which reduced significantly the amount of necessary
arguments, which were attained from the `runtime.Runtime`. It makes
`contextcheck` linter happy as we don't have to pass
`runtime.Extensions()` context as an argument, which linter would flag
as passed and not used, without realising it's not the regular
context.

Signed-off-by: Bart Smykla <[email protected]>
@bartsmykla bartsmykla requested a review from a team as a code owner April 27, 2024 08:56
@bartsmykla bartsmykla requested review from michaelbeaumont and removed request for a team April 27, 2024 08:56
Copy link
Contributor

@slonka slonka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm approving this but I might be biased, I'll let someone else also review this

@slonka slonka merged commit 075c95e into kumahq:master May 6, 2024
15 checks passed
@bartsmykla bartsmykla deleted the fix/trace-span-ids-processing-in-endpoint-handlers-logs branch May 6, 2024 06:40
lobkovilya pushed a commit that referenced this pull request May 15, 2024
* fix(api-server): fix trace/span ID processing in logs

Problem:
Currently, processing trace/span IDs before logging them (e.g.,
converting them to Datadog format) requires calling
`NewSpanLogValuesProcessorContext` with the `runtime.Extensions()`
context. However, this context isn't always available, particularly when
 handling errors in REST endpoint handlers through `AddFieldsFromCtx`.
 This inconsistency leads to the processing function being ignored.

Proposed Solutions:
- Pass `runtime.Extensions()` Context to `HandleError`: While effective,
 this approach adds significant boilerplate code and requires disabling
 the `contextcheck` linter in multiple locations (see PR 10094).

- Leverage slog's Contextual Logging (ref issue 8216): This approach
offers cleaner code but requires significant changes to our logging
infrastructure (ones which I immediately faced when tried to do a quick
POC):
  - slog lacks the concept of logger names, necessitating a custom
    `slog.Handler` for maintaining desired log structure.
  - We'd need to abandon using globally configured/named loggers or
  potentially introduce deferred logger initialization logic for slog.

Current Solution (For This PR):
To address the issue without a major logging overhaul, this PR proposes
a simpler solution:

- Pass the logger instance through the context when constructing
  ApiServer.
- Restore the logger from the context in `HandleError` function.

This approach, while not ideal due to the passing logger via context,
effectively resolves the problem without introducing complex logging
changes.

Additional changes:
I adjusted the signature of `NewApiServer` function to accept the whole
`runtime.Runtime`, which reduced significantly the amount of necessary
arguments, which were attained from the `runtime.Runtime`. It makes
`contextcheck` linter happy as we don't have to pass
`runtime.Extensions()` context as an argument, which linter would flag
as passed and not used, without realising it's not the regular
context.

Signed-off-by: Bart Smykla <[email protected]>

* chore: add missing test.Runtime.Extensions()

Signed-off-by: Bart Smykla <[email protected]>

---------

Signed-off-by: Bart Smykla <[email protected]>
Signed-off-by: Ilya Lobkov <[email protected]>
michaelbeaumont added a commit that referenced this pull request Jun 4, 2024
We can't rely on the filter added in #10100 because we don't have the tenant at that point.

Signed-off-by: Mike Beaumont <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants