Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[opentelemetry] Add OTLP intake E2E system tests #976

Merged
merged 21 commits into from
Apr 25, 2023

Conversation

songy23
Copy link
Member

@songy23 songy23 commented Mar 16, 2023

Description

Add basic E2E integration tests on OpenTelemetry Protocol (OTLP) intake endpoint.

  • JIRA tickets: OTEL-338 and OTEL-401.
  • Usage: ./build.sh java_otel && ./run.sh OTEL_TRACING_E2E

What's covered

  • Add java_otel application container: a Spring boot native HTTP server that implements weblog interfaces and instrumented with OTel-Java SDK + Docker file. For basic tests (GET /) it generates 1 fake publisher span and 1 server span that has a link to the publisher span.
  • Add a test runner that sends requests to the application container. After receiving the response, it retrieves traces from DD backend and validates them. For now it only works with java_otel.
  • OTel spans generated by the test runner and application container are submitted to DD via the DD OTLP intake endpoint.

What's next

  • Add agent ingestion and OTel Collector to OTel tests
  • Verify OTel generated 128 bits trace ID

Workflow

  1. ⚠️⚠️ Create your PR as draft (we're receiving lot of PR, it saves us lot of time) ⚠️⚠️
  2. Follows the style guidelines of this project (See how to easily lint the code)
  3. Work on you PR until the CI passes (if something not related to your task is failing, you can ignore it)
  4. Mark it as ready for review

NOTE: By default in PR only default scenario tests will be launched. Please refer to the documentation to run all scenarios in your PR if needed.

Once your PR is reviewed, you can merge it ! ❤️

@songy23 songy23 force-pushed the yang.song/otlp-intake-test branch 2 times, most recently from 3896647 to b944af9 Compare March 17, 2023 17:14
@songy23 songy23 force-pushed the yang.song/otlp-intake-test branch from b944af9 to f66b76a Compare March 17, 2023 17:27
@songy23 songy23 changed the title [WIP] [opentelemetry] Add OTLP intake E2E system tests [opentelemetry] Add OTLP intake E2E system tests Mar 17, 2023
@songy23 songy23 requested review from dineshg13 and a team March 17, 2023 20:11
@songy23 songy23 marked this pull request as ready for review March 17, 2023 20:11
@songy23 songy23 requested a review from a team as a code owner March 17, 2023 20:11
@songy23
Copy link
Member Author

songy23 commented Mar 17, 2023

/cc @lambrospetrou

# OpenTelemetry tracing end-to-end scenarios
otel_tracing_e2e_w3c = EndToEndScenario(
"OTEL_TRACING_E2E_W3C",
weblog_env={"DD_API_KEY": os.environ.get("DD_API_KEY"), "DD_SITE": os.environ.get("DD_SITE"),},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need these, since they are already set.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's actually needed - we set dd api key and dd site for Agent containers but not Weblog containers.

@cbeauchesne
Copy link
Collaborator

FI, seen with @songy23 : this PR will introduce lot of internal change, let just wait for the another big PR to be closed (#958)

@songy23 songy23 marked this pull request as draft March 22, 2023 18:27
songy23 added 3 commits March 25, 2023 10:34
- Associate OTel trace id with request id
- Remove client side instrumentation
- Remove trace context propagation
- Support OTLP intake in proxy
Copy link
Member Author

@songy23 songy23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Holding off this PR to wait for #958.

# OpenTelemetry tracing end-to-end scenarios
otel_tracing_e2e_w3c = EndToEndScenario(
"OTEL_TRACING_E2E_W3C",
weblog_env={"DD_API_KEY": os.environ.get("DD_API_KEY"), "DD_SITE": os.environ.get("DD_SITE"),},
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's actually needed - we set dd api key and dd site for Agent containers but not Weblog containers.

@songy23
Copy link
Member Author

songy23 commented Apr 14, 2023

Chatted with @cbeauchesne and we agreed to submit this PR first, so posted for review again and removed do-not-merge

@cbeauchesne cbeauchesne force-pushed the yang.song/otlp-intake-test branch from 0a8da8f to 230216c Compare April 20, 2023 14:49
@cbeauchesne cbeauchesne force-pushed the yang.song/otlp-intake-test branch from 1cafa88 to 80d089f Compare April 20, 2023 15:47
@cbeauchesne
Copy link
Collaborator

@songy23 the adjustments needed on interfaces/scenarios are done.

It does not work because the weblog crashes with this issue :

Exception in thread "main" java.lang.reflect.InvocationTargetException
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at org.springframework.boot.loader.MainMethodRunner.run(MainMethodRunner.java:49)
	at org.springframework.boot.loader.Launcher.launch(Launcher.java:95)
	at org.springframework.boot.loader.Launcher.launch(Launcher.java:58)
	at org.springframework.boot.loader.JarLauncher.main(JarLauncher.java:65)
Caused by: java.lang.NullPointerException: Parameter specified as non-null is null: method okhttp3.Headers$Builder.add, parameter value
	at okhttp3.Headers$Builder.add(Headers.kt)
	at io.opentelemetry.exporter.internal.okhttp.OkHttpExporterBuilder.addHeader(OkHttpExporterBuilder.java:85)
	at io.opentelemetry.exporter.otlp.http.trace.OtlpHttpSpanExporterBuilder.addHeader(OtlpHttpSpanExporterBuilder.java:80)
	at com.datadoghq.springbootnative.App.main(App.java:30)
	... 8 more

@songy23
Copy link
Member Author

songy23 commented Apr 20, 2023

I got a different error with the latest changes:

Failed to export spans. The request could not be executed. Full error message: runner: Name or service not known

Looking

@songy23
Copy link
Member Author

songy23 commented Apr 21, 2023

@cbeauchesne I was able to run the otel test with my latest changes:

% ./run.sh OTEL_TRACING_E2E
============================================== test context ===============================================
Scenario: OTEL_TRACING_E2E
Logs folder: ./logs_otel_tracing_e2e
=========================================== test session starts ===========================================
collected 349 items / 348 deselected / 1 selected                                                         
Executing weblog warmup...
----------------------------------------------- tests setup -----------------------------------------------

tests/otel_tracing_e2e/test_e2e.py .

-------------------------------------- Wait for agent interface (5s) --------------------------------------
---------------------------------- Wait for Weblog stdout interface (0s) ----------------------------------
---------------------------- Wait for .Net tracer-managed logs interface (0s) -----------------------------

tests/otel_tracing_e2e/test_e2e.py .                                                                [100%]

----- generated xml file: /Users/yang.song/Github/system-tests/logs_otel_tracing_e2e/reportJunit.xml ------
==================================== 1 passed, 348 deselected in 9.05s ====================================

Could you give it another try when you get a chance? (Note you need to set DD_API_KEY and DD_APPLICATION_KEY because this is an E2E test)

@cbeauchesne
Copy link
Collaborator

strange, I have the issue again. I've added the job in the CI, we'll see if it's only my local setup or not.

@songy23
Copy link
Member Author

songy23 commented Apr 24, 2023

NVM this should be fixed in c143cd4

CI still failed (https://github.com/DataDog/system-tests/actions/runs/4788383870/jobs/8514919306?pr=976) because of

ocker.errors.ImageNotFound: 404 Client Error for http+docker://localhost/v1.41/images/create?tag=latest&fromImage=system_tests%2Fweblog: Not Found ("pull access denied for system_tests/weblog, repository does not exist or may require 'docker login': denied: requested access to the resource is denied")

Do I need to publish the new otel weblog image somewhere?

@songy23
Copy link
Member Author

songy23 commented Apr 24, 2023

Interestingly sometimes in CI super().__init__ in OpenTelemetryScenario isn't executed 80d089f#diff-3faa2d19131dbfc2dc8b0bbe64b4dbacb1229e5fcb77659b8eb724a2b5dcf708R420 and thus we need to copy some logic to self.__init__. Not sure if this is a known issue since EndToEndScenario does the same.

@songy23 songy23 force-pushed the yang.song/otlp-intake-test branch from e2a9635 to d42fdbd Compare April 24, 2023 21:59
@songy23
Copy link
Member Author

songy23 commented Apr 24, 2023

@cbeauchesne test-the-tests-open-telemetry passed: https://github.com/DataDog/system-tests/actions/runs/4791447134/jobs/8521939498?pr=976

@songy23
Copy link
Member Author

songy23 commented Apr 25, 2023

Thanks for the review!

@songy23 songy23 merged commit ccd292a into main Apr 25, 2023
@songy23 songy23 deleted the yang.song/otlp-intake-test branch April 25, 2023 16:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants