[NEXT-911] Issue tracking OpenTelemetry support #47660

Open
jankaifer opened this issue Mar 29, 2023 · 17 comments
Labels
bug Issue was opened via the bug report template. Performance Anything with regards to Next.js performance.

Comments

@jankaifer
Contributor

jankaifer commented Mar 29, 2023

A place to track current OTel support for easier cooperation with other projects.

From SyncLinear.com | NEXT-911

@jankaifer jankaifer added type: needs investigation bug Issue was opened via the bug report template. labels Mar 29, 2023
@jankaifer jankaifer changed the title Issue tracing OpenTelemetry support [NEXT-911] Issue tracing OpenTelemetry support Mar 29, 2023
@jankaifer
Contributor Author

jankaifer commented Mar 29, 2023

PR where we added docs and simple examples: #47194 (comment)

EDIT: It's merged now and available in our docs.

@jankaifer
Contributor Author

jankaifer commented Mar 29, 2023

Node SDK for OTel has somewhat stabilized.
We can move to it later, but there are a few issues with SDK and our Collector setup.
#47194 (comment)

@jankaifer jankaifer changed the title [NEXT-911] Issue tracing OpenTelemetry support [NEXT-911] Issue tracking OpenTelemetry support Mar 30, 2023
@jankaifer
Contributor Author

Our instrumentation doesn't work well with BasicTracerProvider; you need to use NodeTracerProvider.

With BasicTracerProvider, it won't be able to track the active context or connect the current span to its parent span.

@dyladan

dyladan commented Apr 6, 2023

Our instrumentation doesn't work well with BasicTracerProvider; you need to use NodeTracerProvider.

With BasicTracerProvider, it won't be able to track the active context or connect the current span to its parent span.

Is this saying the basic tracer provider doesn't track context but the node tracer provider does? The context manager is separate from the tracer provider, and both should be usable if you manually specify the appropriate context manager.

@jankaifer
Contributor Author

I switched to NodeSDK in the docs. It didn't work because it used BatchSpanProcessor by default, and we were not flushing spans at the end of the request.
After explicitly using SimpleSpanProcessor, it works 👍
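
The difference between the two processors can be illustrated with a dependency-free toy model. The classes below are hypothetical stand-ins, not the OpenTelemetry SDK's `BatchSpanProcessor` and `SimpleSpanProcessor`, and they leave out the real batch processor's timer and size limits. They only mimic the buffering behaviour described above: a batching processor holds ended spans until something flushes it, so the spans never reach the exporter if the request ends first.

```typescript
// Toy model only: these are NOT the OpenTelemetry SDK classes.
type ToySpan = { name: string };

class ToyExporter {
  readonly exported: ToySpan[] = [];
  export(spans: ToySpan[]): void {
    this.exported.push(...spans);
  }
}

// Exports every span as soon as it ends.
class ToySimpleProcessor {
  private readonly exporter: ToyExporter;
  constructor(exporter: ToyExporter) {
    this.exporter = exporter;
  }
  onEnd(span: ToySpan): void {
    this.exporter.export([span]);
  }
}

// Buffers ended spans and exports them only when flushed.
class ToyBatchProcessor {
  private readonly exporter: ToyExporter;
  private queue: ToySpan[] = [];
  constructor(exporter: ToyExporter) {
    this.exporter = exporter;
  }
  onEnd(span: ToySpan): void {
    this.queue.push(span);
  }
  forceFlush(): void {
    this.exporter.export(this.queue);
    this.queue = [];
  }
}

const simpleExporter = new ToyExporter();
const batchExporter = new ToyExporter();
const simple = new ToySimpleProcessor(simpleExporter);
const batch = new ToyBatchProcessor(batchExporter);

simple.onEnd({ name: 'GET /' });
batch.onEnd({ name: 'GET /' });

// The request ends here. Without a flush, the batched span is still queued.
const exportedBeforeFlush = batchExporter.exported.length;
batch.forceFlush();
```

In this model the simple processor has exported its span immediately, while the batching one has exported nothing until `forceFlush()` is called.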

@jankaifer jankaifer self-assigned this Apr 12, 2023
@jamesloosli

jamesloosli commented Apr 13, 2023

Would this be the right place to track issues relating to propagation? I used the docs to write a custom registerOTel() function like so:

import {
  AlwaysOffSampler,
  CompositePropagator,
  ParentBasedSampler,
  TraceIdRatioBasedSampler,
} from '@opentelemetry/core';
import { B3InjectEncoding, B3Propagator } from '@opentelemetry/propagator-b3';
import {
  BatchSpanProcessor,
  NodeTracerProvider,
} from '@opentelemetry/sdk-trace-node';
import { DiagConsoleLogger, DiagLogLevel, diag } from '@opentelemetry/api';

import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';
import { Resource } from '@opentelemetry/resources';
import { SemanticResourceAttributes } from '@opentelemetry/semantic-conventions';

export const registerOTel = (serviceName: string) => {
  // For troubleshooting, set the log level to DiagLogLevel.DEBUG
  diag.setLogger(new DiagConsoleLogger(), DiagLogLevel.DEBUG);
  const provider = new NodeTracerProvider({
    sampler: new ParentBasedSampler({
      root: new TraceIdRatioBasedSampler(0.1),
    }),
    resource: new Resource({
      [SemanticResourceAttributes.SERVICE_NAME]: serviceName,
    }),
  });

  const exporter = new OTLPTraceExporter({
    url: 'http://opentelemetry-collector.monitoring:4318/v1/traces',
  });

  // XXX: This is the pain point so far.
  provider.register({
    propagator: new CompositePropagator({
      propagators: [
        new B3Propagator(),
        new B3Propagator({ injectEncoding: B3InjectEncoding.MULTI_HEADER }),
      ],
    }),
  });

  provider.addSpanProcessor(
    new BatchSpanProcessor(exporter, {
      // The maximum queue size. After the size is reached spans are dropped.
      maxQueueSize: 100,
      // The maximum batch size of every export. It must be smaller or equal to maxQueueSize.
      maxExportBatchSize: 10,
      // The interval between two consecutive exports
      scheduledDelayMillis: 500,
      // How long the export can run before it is cancelled
      exportTimeoutMillis: 30000,
    })
  );

  return provider;
};

So far traces are getting exported properly to opentelemetry-collector, but they aren't getting the parentTraceId that's being set by the load balancer.

@jankaifer
Contributor Author

jankaifer commented Apr 14, 2023

This issue was meant for chatting with OpenTelemetry people. For bug reports, please make a new issue.

I've made one for you: #48384
cc: @jamesloosli

@jankaifer jankaifer removed their assignment May 6, 2023
@devmanbr

How stable is OpenTelemetry support?

Please provide a place in the documentation where we can follow its status. Telemetry is essential for commercial applications.

@Multiply

Multiply commented Aug 9, 2023

Next.js doesn't seem to extract the incoming traceparent and tracestate headers.

@Ankcorn

Ankcorn commented Oct 18, 2023

Hey, how do I pass the traceparent header to the patched fetch within Next.js?

@Multiply

This might be fixed with #57084 but I haven't tested it yet.

@tachang

tachang commented Nov 3, 2023

I am having some issues getting the spans to be associated with the parent trace (#57911). Is this a known issue already?

@gilmarsquinelato

Hello everyone, I would like to help people who, like @Ankcorn, come to this issue searching for how to propagate the trace in the patched fetch.

After a while, I found a workaround until we have a definitive solution:

fetch('<URL>', {
  headers: {
    get traceparent() {
      const traceId = '';
      const spanId = '';
      return `00-${traceId}-${spanId}-01`;
    }
  },
})

Notice that traceparent is a getter, so it is read when the Next.js patched fetch clones the headers.
Also, it's important not to retrieve the traceId and spanId outside the getter, since Next.js creates a separate span for the fetch internally.
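
Why the getter matters can be sketched without Next.js. In the snippet below, `activeSpanId` is a hypothetical stand-in for "the span that is active right now", and object spread stands in for whatever cloning the patched fetch performs (the real cloning code may differ). The eagerly built header keeps the span ID that was active when the object was created, while the getter reports the span ID that is active at the moment the headers are copied.

```typescript
// Hypothetical stand-in for the ID of the currently active span.
let activeSpanId = 'aaaaaaaaaaaaaaaa';
const traceId = '00000000000000000000000000000001';

// Value computed once, when the object literal is evaluated.
const eagerHeaders = {
  traceparent: `00-${traceId}-${activeSpanId}-01`,
};

// Value computed each time the property is read.
const lazyHeaders = {
  get traceparent(): string {
    return `00-${traceId}-${activeSpanId}-01`;
  },
};

// Simulates an inner span being started before the headers are copied.
activeSpanId = 'bbbbbbbbbbbbbbbb';

// Copying own enumerable properties reads the getter at this point.
const clonedEager: Record<string, string> = { ...eagerHeaders };
const clonedLazy: Record<string, string> = { ...lazyHeaders };
```

Here `clonedEager.traceparent` still carries the outer span ID, whereas `clonedLazy.traceparent` carries the inner one.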

I created these functions to retrieve the traceId and spanId:

import type { SpanContext } from '@opentelemetry/api';
import { isSpanContextValid, trace, context } from '@opentelemetry/api';

const getSpanContext = (): SpanContext | undefined => {
  const span = trace.getSpan(context.active());
  if (!span) {
    return undefined;
  }
  const spanContext = span.spanContext();
  if (!isSpanContextValid(spanContext)) {
    return undefined;
  }
  return spanContext;
};

export const getTraceId = (): string | undefined => {
  const spanContext = getSpanContext();
  return spanContext?.traceId;
};

export const getSpanId = (): string | undefined => {
  const spanContext = getSpanContext();
  return spanContext?.spanId;
};

export const getTraceparentHeader = () => {
  const spanContext = getSpanContext();
  if (!spanContext) return '';

  return `00-${spanContext.traceId}-${spanContext.spanId}-01`;
};
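
The helper above always emits the `01` (sampled) flag. As a possible refinement, here is a minimal, dependency-free sketch that derives the flags byte from the span context and validates the result against the W3C Trace Context `traceparent` layout (version `00`, 32 hex characters of trace ID, 16 of span ID, 2 of flags). `SpanContextLike`, `formatTraceparent` and `parseTraceparent` are hypothetical names introduced for illustration; only the three fields used mirror OpenTelemetry's `SpanContext`.

```typescript
// Minimal shape for illustration; mirrors the traceId, spanId and traceFlags
// fields of an OpenTelemetry SpanContext.
type SpanContextLike = { traceId: string; spanId: string; traceFlags: number };

const TRACEPARENT_RE = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;
const ALL_ZERO_RE = /^0+$/;

// Parses a version-00 traceparent header; returns undefined if it is malformed.
function parseTraceparent(header: string): SpanContextLike | undefined {
  const match = TRACEPARENT_RE.exec(header.trim());
  if (!match) return undefined;
  const traceId = match[1];
  const spanId = match[2];
  // All-zero IDs are invalid according to the W3C Trace Context spec.
  if (ALL_ZERO_RE.test(traceId) || ALL_ZERO_RE.test(spanId)) return undefined;
  return { traceId, spanId, traceFlags: parseInt(match[3], 16) };
}

// Builds a traceparent header that preserves the sampled flag instead of
// hard-coding it; returns undefined if the IDs are not valid.
function formatTraceparent(ctx: SpanContextLike): string | undefined {
  const flags = ('0' + (ctx.traceFlags & 0xff).toString(16)).slice(-2);
  const header = `00-${ctx.traceId}-${ctx.spanId}-${flags}`;
  return parseTraceparent(header) ? header : undefined;
}

const sampledHeader = formatTraceparent({
  traceId: '0af7651916cd43dd8448eb211c80319c',
  spanId: 'b7ad6b7169203331',
  traceFlags: 1,
});
const unsampledHeader = formatTraceparent({
  traceId: '0af7651916cd43dd8448eb211c80319c',
  spanId: 'b7ad6b7169203331',
  traceFlags: 0,
});
```

Returning `undefined` rather than an empty string lets the caller skip the header entirely when there is no valid span context.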

I hope this helps you somehow 😄

@gilmarsquinelato

gilmarsquinelato commented Nov 7, 2023

I don't know if it's my setup or if others are seeing the same thing, but in my development environment it logs empty lines instead of the page compilation messages.

  ▲ Next.js 14.0.1
  - Local:        http://localhost:3001
  - Environments: .env
  - Experiments (use at your own risk):
    · instrumentationHook

○ Compiling /instrumentation ...
⚠ 


✓ Ready in 5.1s

Update:
After digging a bit, I found the file that's causing this: simply importing @opentelemetry/instrumentation is enough to trigger the issue.
I then tried to track down exactly which file it is, and it's @opentelemetry/instrumentation/build/src/platform/node/instrumentation. Importing it alone triggers the bug; you don't need to do anything else.

Since the file does a bunch of imports, I tried importing them manually one by one, and they aren't causing the issue.
But two functions inside the InstrumentationBase class are: _warnOnPreloadedModules and _extractPackageVersion.

Another thing I realized: it looks like a compilation issue. I made the require of this module conditional, and even with a falsy condition the module was still included in the transpiled files. So even though the code was never executed, only required inside an if, the logs were still broken.

So to summarize:
instrumentation.ts

export const register = () => {
  const isProduction = process.env.NODE_ENV === 'production'; // intentionally set as a variable instead of an inline check in the if

  if (process.env.NEXT_RUNTIME === 'nodejs' && isProduction) {
    require('@opentelemetry/instrumentation/build/src/platform/node/instrumentation');
  }
};

@balazsorban44 balazsorban44 added Performance Anything with regards to Next.js performance. and removed area: Tracing labels Apr 17, 2024
huozhi added a commit that referenced this issue May 15, 2024
…ta to the client (#64256)

### What?

This PR adds an experimental option `clientTraceMetadata` that will use
the existing OpenTelemetry functionality to propagate conventional
OpenTelemetry trace information to the client.

The propagation metadata is propagated to the client via meta tags,
having a `name` and a `content` attribute containing the value of the
tracing value:

```html
<html>
    <head>
        <meta name="baggage" content="key1=val1,key2=val2">
        <meta name="traceparent" content="00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01">
        <meta name="custom" content="foobar">
    </head>
</html>
```

The implementation adheres to OpenTelemetry as much as possible,
treating the meta tags as if they were tracing headers on outgoing
requests. The `clientTraceMetadata` option contains the keys of the
metadata that will be injected for tracing purposes.
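
The key-based filtering described above can be sketched as a pure function. This is not the Next.js implementation: `PropagationEntry`, `renderTraceMetaTags` and `escapeAttr` are hypothetical names, and the entries are assumed to be the key/value pairs a propagator would otherwise write into outgoing request headers.

```typescript
// Hypothetical shape of one piece of propagation data (header name and value).
type PropagationEntry = { key: string; value: string };

// Escapes characters that are unsafe inside a double-quoted HTML attribute.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Renders one <meta> tag per entry whose key is in the allow-list.
function renderTraceMetaTags(
  entries: PropagationEntry[],
  allowedKeys: string[],
): string {
  return entries
    .filter((entry) => allowedKeys.indexOf(entry.key) !== -1)
    .map(
      (entry) =>
        `<meta name="${escapeAttr(entry.key)}" content="${escapeAttr(entry.value)}">`,
    )
    .join('\n');
}

const metaTags = renderTraceMetaTags(
  [
    { key: 'baggage', value: 'key1=val1,key2=val2' },
    {
      key: 'traceparent',
      value: '00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01',
    },
    { key: 'custom', value: 'foobar' },
  ],
  ['baggage', 'traceparent'],
);
```

With the allow-list `['baggage', 'traceparent']`, the `custom` entry is dropped and only two tags are rendered.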

### Why?

Telemetry providers usually want to provide visibility across the entire
stack, meaning it is useful for users to be able to associate, for
example, web vitals on the client, with a span tree on the server. In
order to be able to correlate tracing events from the front- and
backend, it is necessary to share something like a trace ID or similar,
that the telemetry providers can pick up and stitch back together to
create a trace.

### How?

The tracer was extended with a method `getTracePropagationData()` that
returns the propagation data on the currently active OpenTelemetry
context.
We are using `makeGetServerInsertedHTML()` to inject the meta tags into
the HTML head for dynamic requests.
The meta tags are generated through using the newly added
`getTracePropagationData()` method on the tracer.

It is important to mention that **the trace information should only be
propagated for the initial loading of the page, including hard
navigations**. Any subsequent operations should not propagate trace data
from the server to the client, as the client generally is the root of
the trace. The exception is initial pageloads, since while the request
starts on the client, no JS has had the opportunity to run yet, meaning
there is no trace propagation on the client until the server has
responded.

Situations in which we do not want tracing information to be propagated
from the server to the client:
- _Prefetch requests._ Prefetches generally start on the client and are
already instrumented.
- _Any sort of static precomputation, including PPR._ If we include
trace information in static pages, it means that all clients that will
land on the static page will be part of the "precomputation" trace. This
would lead to gigantic traces with a ton of unrelated data that is not
useful. The special case is dev mode where it is likely fine to
propagate trace information, even for static content, since it is
usually not actually static in dev mode.
- _Clientside (soft) navigations._ Navigations start on the client and
are usually already instrumented.
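
The rules in the list above can be condensed into a single predicate. This is a sketch under assumptions, not the code from the PR: `RenderRequest` and its flags are hypothetical names for whatever request state the renderer actually inspects.

```typescript
// Hypothetical description of the request being rendered.
type RenderRequest = {
  isPrefetch: boolean; // request was issued as a prefetch
  isSoftNavigation: boolean; // client-side navigation
  isStaticGeneration: boolean; // static precomputation, including PPR
  isDev: boolean; // running in dev mode
};

// Returns true only for initial page loads and hard navigations.
function shouldPropagateTraceData(req: RenderRequest): boolean {
  // Prefetches and soft navigations start on the client, which already owns the trace.
  if (req.isPrefetch || req.isSoftNavigation) return false;
  // Static output would tie every visitor to the precomputation trace.
  // Dev mode is the exception, since content is not actually static there.
  if (req.isStaticGeneration && !req.isDev) return false;
  return true;
}

const initialDynamicLoad = shouldPropagateTraceData({
  isPrefetch: false,
  isSoftNavigation: false,
  isStaticGeneration: false,
  isDev: false,
});
const staticInProduction = shouldPropagateTraceData({
  isPrefetch: false,
  isSoftNavigation: false,
  isStaticGeneration: true,
  isDev: false,
});
const staticInDev = shouldPropagateTraceData({
  isPrefetch: false,
  isSoftNavigation: false,
  isStaticGeneration: true,
  isDev: true,
});
```

Only the initial dynamic load and the dev-mode static render evaluate to `true`; the production static render does not.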

### Alternatives considered

An implementation that purely lives in user-land could have been
implemented with `useServerInsertedHTML()`, however, that implementation
would be cumbersome for users to set up, since the implementation of
tracing would have to happen in a) the instrumentation hook, b) in a
client-component that is used in a top-level layout.

### Related issues/discussions

- #47660
- #62353 (Could be used as
an alternative to the server-timing header)
- getsentry/sentry-javascript#9571

---------

Co-authored-by: Jiachi Liu <[email protected]>
panteliselef pushed a commit to panteliselef/next.js that referenced this issue May 20, 2024
…ta to the client (vercel#64256)

ForsakenHarmony pushed a commit that referenced this issue Aug 14, 2024
…ta to the client (#64256)

ForsakenHarmony pushed a commit that referenced this issue Aug 15, 2024
…ta to the client (#64256)

ForsakenHarmony pushed a commit that referenced this issue Aug 16, 2024
…ta to the client (#64256)

@Hronom

Hronom commented Aug 27, 2024

Hello, is there any plan to improve the OpenTelemetry integration in 2024? It seems there are still issues where the trace is not fully propagated. Part of the problem is that you can't properly add a header to calls made to other microservices without breaking de-duplication.
It's strange that there is no way to globally override how the de-duplication hash is generated, for example to exclude some headers from it.

Is there any possibility of resolving this fully, without a workaround? Something out of the box? Maybe in Next.js 15?
I would be very happy if someone jumped in and said: hey, you are wrong, it's already well supported in Next.js 14. Or at least that it will be fixed in Next.js 15.

OpenTelemetry is becoming the modern de facto standard for observability, so it's strange that it is not fully supported here.
@jankaifer I would like to see the Next.js developers being proactive here and not relying entirely on the OpenTelemetry maintainers.

Thanks!

@jankaifer
Contributor Author

Hi, I left Vercel in May 2023. The best person to ask about OpenTelemetry support in Next.js is probably @feedthejim or @timneutkens.

@Hronom

Hronom commented Aug 27, 2024

Thanks @jankaifer, sorry for the ping, but I think this is a legacy you're left with =).

@feedthejim or @timneutkens, I'm tagging you to get someone from Vercel to look at this topic and at the comment I left here: #47660 (comment)

Just in case it's off your radar: as I mentioned, OpenTelemetry is becoming the de facto standard for observability, and unfortunately there are still issues when you try to integrate it. What's more, if you have a front-end based on Next.js, it breaks tracing in general, so back-end services in other languages that have good OpenTelemetry support (like Java) suffer because of the Next.js-based front-end =).

So please give it some more priority, thank you!
