Add support for metrics from the DataDog Agent #18278

Closed
dsobolev-nr opened this issue Feb 3, 2023 · 18 comments

Comments

@dsobolev-nr

Component(s)

receiver/datadog

Is your feature request related to a problem? Please describe.

Customers of DataDog who wish to use OpenTelemetry on their data run into a problem with the current receiver: it only supports traces, not metrics or logs.

Describe the solution you'd like

Customers of DataDog should be able to point any agent at an OpenTelemetry collector and process the metrics data emitted.

Describe alternatives you've considered

No response

Additional context

No response

@dsobolev-nr dsobolev-nr added the enhancement and needs triage labels Feb 3, 2023
@github-actions
Contributor

github-actions bot commented Feb 3, 2023

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@jpkrohling
Member

@dsobolev-nr, I'm sure @boostchicken will be happy to review PRs implementing this.

@jpkrohling jpkrohling removed the needs triage label Feb 6, 2023
@boostchicken
Member

Yeah, absolutely. I don't know Datadog's format for metrics or logs. It doesn't use the trace agent format; I believe it goes directly to the API, like in datadogexporter.

@boostchicken
Member

boostchicken commented Feb 7, 2023

@github-actions
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Apr 10, 2023
@mvadu

mvadu commented Apr 25, 2023

> https://docs.datadoghq.com/api/latest/logs/#send-logs
>
> https://docs.datadoghq.com/api/latest/metrics/#submit-metrics
>
> I believe these are the APIs it would need to support

Those endpoints are more for sending metrics directly from code to DataDog HQ, not for how the trace agent (dd-agent.jar, for example) itself sends them to the Datadog agent. I think https://docs.datadoghq.com/tracing/metrics/metrics_namespace/ makes more sense from a trace metrics standpoint.

@dsobolev-nr if your original use case is related to custom metrics generated from your code using the DataDog SDK, then it probably uses the DogStatsD method https://docs.datadoghq.com/metrics/custom_metrics/dogstatsd_metrics_submission/ and we have been successful in replacing DogStatsD with the statsd exporter and translating custom metrics into Prometheus metrics.

@github-actions github-actions bot removed the Stale label May 26, 2023
@github-actions
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Jul 26, 2023
@boostchicken
Member

boostchicken commented Aug 8, 2023

This is a totally different Datadog agent. The current receiver implements the Trace Agent API; metrics, I believe, go directly to Datadog's SaaS API.

https://docs.datadoghq.com/api/latest/metrics/#submit-metrics

That is the API we need to create a receiver for. In theory it's not a ton of work. There are a lot of other APIs for querying and so on that we won't need, since we don't store the data.

@MovieStoreGuy @jpkrohling

If that approach sounds good, I could whip something up pretty quickly.
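
As a rough sketch of what receiving that kind of payload involves, the snippet below decodes a JSON body in the shape of the v1 series submission format (a `series` array whose entries carry `metric`, `points` as `[timestamp, value]` pairs, an optional `type`, and `tags`) into flat data points. It is written in Python for brevity. `DataPoint` and `decode_v1_series` are hypothetical names for this example, and fields such as `host` and `interval` are ignored.

```python
import json
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataPoint:
    """One decoded sample; a stand-in for whatever the pipeline uses internally."""
    name: str
    kind: str
    timestamp: int
    value: float
    tags: Dict[str, str] = field(default_factory=dict)


def decode_v1_series(body: bytes) -> List[DataPoint]:
    """Flatten a v1-style series payload into one DataPoint per sample."""
    payload = json.loads(body)
    points: List[DataPoint] = []
    for series in payload.get("series", []):
        # Tags arrive as "key:value" strings; a bare tag gets an empty value.
        tags: Dict[str, str] = {}
        for tag in series.get("tags") or []:
            key, _, value = tag.partition(":")
            tags[key] = value
        for timestamp, value in series.get("points", []):
            points.append(
                DataPoint(
                    name=series["metric"],
                    kind=series.get("type", "gauge"),  # assumed default for this sketch
                    timestamp=int(timestamp),
                    value=float(value),
                    tags=dict(tags),
                )
            )
    return points
```

Translating each `DataPoint` into an OTel metric would be the next step; that part depends on the collector's data model and is left out here.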

@github-actions github-actions bot removed the Stale label Aug 9, 2023
@ledor473
Contributor

ledor473 commented Aug 14, 2023

Based on the Vector implementation, it seems that datadog-agent relies on at least 3 APIs (one of which doesn't seem to be documented) to submit metrics to the SaaS:

/api/beta/sketches
/api/v1/series
/api/v2/series
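
For illustration, the three paths above could be routed by a small HTTP intake like the sketch below. It is written in Python for brevity and uses only the standard library. The payload labels, the `classify` and `serve` names, and the 202 response code are assumptions made for this example, not something taken from datadog-agent, Vector, or the receiver.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Optional

# Paths as listed in the comment above; the labels on the right are illustrative.
METRIC_ENDPOINTS = {
    "/api/v1/series": "series-v1",
    "/api/v2/series": "series-v2",
    "/api/beta/sketches": "sketches",
}


def classify(path: str) -> Optional[str]:
    """Return the payload kind for a request path, ignoring any query string."""
    return METRIC_ENDPOINTS.get(path.split("?", 1)[0])


class IntakeHandler(BaseHTTPRequestHandler):
    """Acknowledges POSTs on the known paths without decoding them."""

    def do_POST(self) -> None:
        kind = classify(self.path)
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)  # drain the body; a real receiver would decode it
        self.send_response(202 if kind else 404)
        self.end_headers()


def serve(host: str = "127.0.0.1", port: int = 8126) -> None:
    """Run the intake until interrupted; the port is an arbitrary choice."""
    HTTPServer((host, port), IntakeHandler).serve_forever()
```

Calling `serve()` starts the listener; each endpoint would then need its own decoder, since the payload formats differ.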

@boostchicken
Member

boostchicken commented Aug 15, 2023

> Based on the Vector implementation, it seems that datadog-agent relies on at least 3 APIs (one of which doesn't seem to be documented) to submit metrics to the SaaS:
>
> /api/beta/sketches
> /api/v1/series
> /api/v2/series

Good research bud, thanks!

Can you link the Vector repo? They asked me a bunch about the receiver when I first sent the PR a while ago, and they are very OSS friendly, so maybe we can borrow some of their code.

That being said, what do we want to implement here? Do we just want to take in their format for all of those endpoints and translate it to OTel for use with any processor / exporter? That's quite a bit of work; happy to get it rolling once we agree on the requirements.

@jpkrohling @MovieStoreGuy thoughts?

@boostchicken
Member

boostchicken commented Aug 15, 2023

Regarding attaching metrics to traces, I believe that is a specific feature that shows metrics on the trace flame graph for correlation / single pane of glass, and it does not really mean metric ingestion at large; at least that was the case last time I used it. Feel free to correct me. Your statsd approach probably works well, but DogStatsD does have some protocol changes you might run into. Should we look at enhancing statsdreceiver to support DogStatsD officially?

> https://docs.datadoghq.com/api/latest/logs/#send-logs
> https://docs.datadoghq.com/api/latest/metrics/#submit-metrics
> I believe these are the APIs it would need to support
>
> Those endpoints are more for sending metrics directly from code to DataDog HQ, not for how the trace agent (dd-agent.jar, for example) itself sends them to the Datadog agent. I think https://docs.datadoghq.com/tracing/metrics/metrics_namespace/ makes more sense from a trace metrics standpoint.
>
> @dsobolev-nr if your original use case is related to custom metrics generated from your code using the DataDog SDK, then it probably uses the DogStatsD method https://docs.datadoghq.com/metrics/custom_metrics/dogstatsd_metrics_submission/ and we have been successful in replacing DogStatsD with the statsd exporter and translating custom metrics into Prometheus metrics.
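
To make the protocol difference concrete: DogStatsD datagrams follow the StatsD `name:value|type|@sample_rate` layout and add a `|#tag:value,...` section for tags. The parser below is a minimal Python sketch of that tag extension only. `parse_dogstatsd` is a hypothetical helper, it handles a single numeric value per line, and it skips other DogStatsD additions such as events and service checks.

```python
from typing import Any, Dict


def parse_dogstatsd(line: str) -> Dict[str, Any]:
    """Parse one DogStatsD metric line into a dictionary."""
    name_value, *sections = line.strip().split("|")
    name, _, raw_value = name_value.partition(":")
    if not name or not raw_value or not sections:
        raise ValueError("expected name:value|type, got %r" % line)

    metric: Dict[str, Any] = {
        "name": name,
        "value": float(raw_value),
        "type": sections[0],  # e.g. "c" counter, "g" gauge, "ms" timer
        "sample_rate": 1.0,
        "tags": {},
    }
    for section in sections[1:]:
        if section.startswith("@"):
            # Sample rate is plain StatsD.
            metric["sample_rate"] = float(section[1:])
        elif section.startswith("#"):
            # The tag section is the DogStatsD extension.
            for tag in section[1:].split(","):
                key, _, value = tag.partition(":")
                metric["tags"][key] = value
        # Any other section is ignored in this sketch.
    return metric
```

A plain StatsD parser that does not expect the `#` section is where the tags would typically be lost or rejected.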

@ledor473
Contributor

My understanding of the issue comes mainly from this part:

> Customers of DataDog should be able to point any agent at an OpenTelemetry collector and process the metrics data emitted.

To me, this means that I could point a datadog-agent to an OpenTelemetry Collector rather than to Vector to process the metrics before being sent to Datadog SaaS. Something like this:

+---------------+       +--------------+       +------------------+                     
| datadog-agent |-------|   OTel Col   |-------| Datadog SaaS API |                         
+---------------+       +--------------+       +------------------+                     

> That being said what do we want to implement here? We just want to take in their format for all of those endpoints and translate to OTel for use with any processor / exporter? Quite a bit of work, happy to get it rolling once we agree on the requirements.

I believe that statement is correct, but again I'm not the reporter.

Some extra context: either of these Datadog guides explains how to achieve that "interception" between the datadog-agent and the SaaS:

@mvadu

mvadu commented Aug 16, 2023

> To me, this means that I could point a datadog-agent to an OpenTelemetry Collector rather than to Vector to process the metrics before being sent to Datadog SaaS.

In my use case, we have some legacy code instrumented with the Datadog way of tracing, with a few custom metrics. We want to keep that generation code as is and be able to send those metrics to a non-Datadog SaaS (for cost reasons). So getting the metrics translated to a Prometheus standard, similar to the DataDog receiver work done for traces, would be cool.

@ledor473
Contributor

> Can you link the vector repo?

@boostchicken

I believe the main function where the HTTP endpoint gets created is here: https://github.com/vectordotdev/vector/blob/master/src/sources/datadog_agent/metrics.rs#L37C15-L53

It's in Rust, so I don't think we can reuse it easily, but the logic to decode the message is in that metrics.rs file as well.

@jpkrohling
Member

I'm all for implementing those endpoints, perhaps using Vector as the reference implementation.

@tejaskokje-mw

Another reference implementation is by Victoria Metrics here

Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Nov 27, 2023
Contributor

This issue has been closed as inactive because it has been stale for 120 days with no activity.

jpkrohling added a commit that referenced this issue Jul 3, 2024
**Description:**
This PR adds the initial structure required to add support for metrics
in the Datadog receiver. This is the first of several PRs which will add
support for v1 and v2 series endpoints, service checks, as well as
sketches.

The full version of the code can be found in the
`cedwards/datadog-metrics-receiver-full` branch, or in Grafana Alloy:
https://github.com/grafana/alloy/tree/main/internal/etc/datadogreceiver

**Link to tracking Issue:** 
#18278 

**Testing:** 
Unit tests have been added. More thorough tests will be included in
follow-up PRs as the remaining functionality is added.

**Documentation:** 
Updated README

---------

Co-authored-by: Juraci Paixão Kröhling <[email protected]>
jpkrohling pushed a commit that referenced this issue Jul 8, 2024
**Description**:
This PR is a follow-up to #33631, extending the existing tags
translation structure. This will be required for the follow-up PRs
adding support for v1 and v2 series endpoints, service checks, as well
as sketches.

The full version of the code can be found in the
`cedwards/datadog-metrics-receiver-full` branch, or in Grafana Alloy:
https://github.com/grafana/alloy/tree/main/internal/etc/datadogreceiver

**Link to tracking Issue:**
#18278

**Testing**:
Unit tests have been added. More thorough tests will be included in
follow-up PRs as the remaining functionality is added.

**Notes**:
- Adding `[chore]` to the title of the PR because
https://github.com/grafana/opentelemetry-collector-contrib/blob/ab4d726aaaa07aad702ff3b312a8e261f2b38021/.chloggen/datadogreceiver_metrics.yaml#L1-L27
already exists.

---------

Signed-off-by: Jesus Vazquez <[email protected]>
jpkrohling pushed a commit that referenced this issue Jul 17, 2024
**Description:**
This PR adds support for V1 series, as well as batches the metrics by
resource, scope, and datapoint attributes. The batching code will also
be required for future PRs which will add support for v2 series
endpoints, service checks, and sketches.
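
One way to picture the batching step is grouping incoming points under a key built from their resource, scope, and attributes, so that each group can become one metric stream. The Python sketch below assumes points arrive as plain dictionaries with hypothetical `resource`, `scope`, `attributes`, `timestamp`, and `value` fields; it illustrates the idea and is not the receiver's actual Go code.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def batch_key(point: dict) -> Tuple:
    """Build a hashable key from resource, scope, and sorted attributes."""
    return (
        tuple(sorted(point["resource"].items())),
        point["scope"],
        tuple(sorted(point["attributes"].items())),
    )


def batch_points(points: List[dict]) -> Dict[Tuple, List[Tuple[int, float]]]:
    """Group (timestamp, value) samples that share the same batch key."""
    batches: Dict[Tuple, List[Tuple[int, float]]] = defaultdict(list)
    for point in points:
        batches[batch_key(point)].append((point["timestamp"], point["value"]))
    return dict(batches)
```

Sorting the attribute items makes the key independent of the order in which tags were sent.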

Follow up of #33631 and #33922.

The full version of the code can be found in the
`cedwards/datadog-metrics-receiver-full` branch, or in Grafana Alloy:
https://github.com/grafana/alloy/tree/main/internal/etc/datadogreceiver

**Link to tracking Issue:** 
#18278 

**Testing:** 
Unit tests, as well as an end-to-end test, have been added.

---------

Co-authored-by: Federico Torres <[email protected]>
jpkrohling pushed a commit that referenced this issue Jul 19, 2024
…ackage (#34160)

**Description:**
This PR is a follow-up to #33957. It refactors the Datadog receiver
files to remove internal methods and structures from the public API and
into an internal directory.

**Link to tracking Issue:** 
#18278 

**Testing:** 
This is a refactor, so no new unit tests have been added.
jpkrohling pushed a commit that referenced this issue Aug 13, 2024
**Description:**
This PR adds support for Datadog V2 series.

Follow up of #33631 and #33957.

The full version of the code can be found in the
`cedwards/datadog-metrics-receiver-full` branch, or in Grafana Alloy:
https://github.com/grafana/alloy/tree/main/internal/etc/datadogreceiver

**Link to tracking Issue:** 
#18278 

**Testing:** 
Unit tests, as well as an end-to-end test, have been added.
jpkrohling added a commit that referenced this issue Sep 9, 2024
**Description:**
This PR adds support for Datadog Service Checks.

Follow up of #33631, #33957 and #34180.

The full version of the code can be found in the
`cedwards/datadog-metrics-receiver-full` branch, or in Grafana Alloy:
https://github.com/grafana/alloy/tree/main/internal/etc/datadogreceiver

**Link to tracking Issue:**
#18278

**Testing:**
Unit tests, as well as an end-to-end test, have been added.

---------

Signed-off-by: alexgreenbank <[email protected]>
Co-authored-by: Carrie Edwards <[email protected]>
Co-authored-by: Juraci Paixão Kröhling <[email protected]>
jpkrohling pushed a commit that referenced this issue Sep 9, 2024
**Description:**
This PR adds support for translating Datadog sketches into Exponential
Histograms.
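
As a hedged illustration of the kind of re-bucketing this involves (not the code in this PR), the Python sketch below maps log-spaced sketch bins onto OTel exponential histogram bucket indices. It assumes a DDSketch-style layout in which bin `k` covers `(gamma**(k-1), gamma**k]` and is represented by `2 * gamma**k / (gamma + 1)`, and it uses the exponential histogram rule that bucket `i` covers `(base**i, base**(i+1)]` with `base = 2**(2**-scale)`. The function names are hypothetical, and the Datadog Agent's own key mapping and offsets are ignored.

```python
import math
from typing import Dict


def exp_histogram_index(value: float, scale: int) -> int:
    """Index of the exponential-histogram bucket that contains a positive value."""
    if value <= 0:
        raise ValueError("only positive values map to a bucket index")
    return math.ceil(math.log2(value) * 2.0 ** scale) - 1


def rebucket(bins: Dict[int, int], gamma: float, scale: int) -> Dict[int, int]:
    """Move counts from log-spaced sketch bins into exponential-histogram buckets."""
    out: Dict[int, int] = {}
    for key, count in bins.items():
        # Use the bin's representative value to pick the target bucket.
        representative = 2.0 * gamma ** key / (gamma + 1.0)
        index = exp_histogram_index(representative, scale)
        out[index] = out.get(index, 0) + count
    return out
```

Assigning a whole bin to one target bucket is the simplest choice; splitting counts across overlapping buckets would be more faithful but is beyond this sketch.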

Follow up of #33631, #33957 and #34180.

The full version of the code can be found in the
`cedwards/datadog-metrics-receiver-full` branch, or in Grafana Alloy:
https://github.com/grafana/alloy/tree/main/internal/etc/datadogreceiver

**Link to tracking Issue:** 
#18278 

**Testing:** 
Unit tests, as well as an end-to-end test, have been added.

---------

Signed-off-by: Federico Torres <[email protected]>
Signed-off-by: György Krajcsovits <[email protected]>
Co-authored-by: Federico Torres <[email protected]>
Co-authored-by: György Krajcsovits <[email protected]>
f7o pushed a commit to f7o/opentelemetry-collector-contrib that referenced this issue Sep 12, 2024
…34180)

f7o pushed a commit to f7o/opentelemetry-collector-contrib that referenced this issue Sep 12, 2024
…etry#34474)

f7o pushed a commit to f7o/opentelemetry-collector-contrib that referenced this issue Sep 12, 2024
jpkrohling pushed a commit that referenced this issue Oct 2, 2024
**Description:**
This PR updates the stability level of metrics support in the Datadog
receiver to alpha.

**Link to tracking Issue:**
#18278

**Testing:**
N/A

**Documentation:**
README updated
jriguera pushed a commit to springernature/opentelemetry-collector-contrib that referenced this issue Oct 4, 2024
…etry#34474)

jriguera pushed a commit to springernature/opentelemetry-collector-contrib that referenced this issue Oct 4, 2024
jriguera pushed a commit to springernature/opentelemetry-collector-contrib that referenced this issue Oct 4, 2024
…metry#35536)
