Add new JVM runtime environment metrics #3352

Closed
64 changes: 63 additions & 1 deletion semantic_conventions/metrics/process-runtime-jvm-metrics.yaml
@@ -1,7 +1,7 @@
groups:
- id: attributes.process.runtime.jvm.memory
type: attribute_group
brief: "Describes JVM memory metric attributes."
brief: "Describes JVM memory metric attributes. "
attributes:
- id: type
type:
@@ -25,6 +25,18 @@ groups:
Pool names are generally obtained via
[MemoryPoolMXBean#getName()](https://docs.oracle.com/en/java/javase/11/docs/api/java.management/java/lang/management/MemoryPoolMXBean.html#getName()).

- id: attributes.process.runtime.jvm.network
type: attribute_group
brief: "Describes JVM network IO metric attributes."
attributes:
- ref: thread.id
requirement_level: opt_in
- id: network.direction
type: string
requirement_level: recommended
brief: Read or write.
examples: [ "read", "write" ]

- id: metric.process.runtime.jvm.memory.usage
type: metric
metric_name: process.runtime.jvm.memory.usage
@@ -183,3 +195,53 @@ groups:
brief: "Number of buffers in the pool."
instrument: updowncounter
unit: "{buffer}"

- id: metric.process.runtime.jvm.cpu.monitor.time
type: metric
metric_name: process.runtime.jvm.cpu.monitor.time
brief: "Time monitor was used by a thread. Only available in JDK 17+."
instrument: histogram
unit: "s"
attributes:
- ref: thread.id
requirement_level: opt_in
- id: class
type: string
requirement_level: opt_in
brief: Class of the monitor.
examples: [ "java.lang.Object" ]
- id: state
type: string
requirement_level: recommended
brief: Action taken at monitor.
examples: [ "blocked", "wait" ]

- id: metric.process.runtime.jvm.cpu.context_switch
type: metric
metric_name: process.runtime.jvm.context_switches
brief: "Number of context switches per second. Only available in JDK 17+."
instrument: updowncounter
unit: "Hz"

- id: metric.process.runtime.jvm.memory.allocation
type: metric
metric_name: process.runtime.jvm.memory.allocation
brief: "Size of object allocated by thread. Only available in JDK 17+."
Member

Contributor Author

I think that's a little bit different. ThreadMXBean returns the cumulative allocation per thread, while the JFR event ObjectAllocationSample describes a single allocation instance (sampled to reduce overhead; sampling only happens on the TLAB slow path). But now that I think about it, it might be more useful to know the total allocation per thread rather than have statistical data on allocation sizes per thread. Additionally, the statistical data would be skewed because sampling is only done on the slow path, when a new TLAB is required or an allocation won't fit into a TLAB (this is because the event's purpose is to show where the allocations are happening, not how big they are).

Member


I think(?) this could be implemented in Java 8 using https://docs.oracle.com/javase/8/docs/jre/api/management/extension/com/sun/management/ThreadMXBean.html#getThreadAllocatedBytes-long:A-

That would be cool.

> the JFR event ObjectAllocationSample describes a single allocation instance (sampled to reduce overhead. Sampling only happens on the TLAB slow path).

If we continue to report this in JFR, we'll want to somehow communicate to users that these allocations are sampled.

> this is because the events purpose is to show where the allocations are happening, not how big they are

Presumably for building out a profile?
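
For reference, the cumulative per-thread approach discussed above can be sketched as follows. This is a minimal illustration, assuming a HotSpot/OpenJDK runtime where the platform `ThreadMXBean` can be cast to `com.sun.management.ThreadMXBean`; the class name `ThreadAllocationSketch` and the method `allocatedBytesByThread` are hypothetical and not part of this PR.

```java
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

final class ThreadAllocationSketch {

    /**
     * Returns cumulative allocated bytes keyed by managed thread ID.
     * Returns an empty map when the runtime does not expose the extension.
     */
    static Map<Long, Long> allocatedBytesByThread() {
        Map<Long, Long> result = new LinkedHashMap<>();
        java.lang.management.ThreadMXBean base = ManagementFactory.getThreadMXBean();
        // Assumption: only HotSpot-derived JVMs implement the com.sun.management extension.
        if (!(base instanceof com.sun.management.ThreadMXBean)) {
            return result;
        }
        com.sun.management.ThreadMXBean bean = (com.sun.management.ThreadMXBean) base;
        if (!bean.isThreadAllocatedMemorySupported() || !bean.isThreadAllocatedMemoryEnabled()) {
            return result;
        }
        long[] ids = bean.getAllThreadIds();
        long[] bytes = bean.getThreadAllocatedBytes(ids);
        for (int i = 0; i < ids.length; i++) {
            // A value of -1 means the thread is no longer alive; skip it.
            if (bytes[i] >= 0) {
                result.put(ids[i], bytes[i]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        allocatedBytesByThread().forEach((id, allocated) ->
            System.out.println("thread.id=" + id + " allocated=" + allocated + " By"));
    }
}
```

Because the values are running totals rather than per-allocation sizes, they would map more naturally onto a counter-style instrument than onto the histogram proposed in the YAML.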

Contributor Author


> Presumably for building out a profile?

Yup, you can generate flame graphs from the stack traces and other useful things like that.

> If we continue to report this in JFR

I think that we should not report allocations with JFR because the purpose of those events is actually a little different than what we want to use them for. Also, the current implementation (jdk.ObjectAllocationInNewTLAB and jdk.ObjectAllocationOutsideTLAB) would result in too high an overhead for people to use in production. Those events are turned off by default in both monitoring and profiling JFR configurations. This is because they aren't throttled like jdk.ObjectAllocationSample is.
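
For contrast, here is a hedged sketch of what consuming the throttled `jdk.ObjectAllocationSample` event looks like with JFR event streaming. It assumes JDK 16 or later (where this event exists) and that the event exposes a `weight` field in bytes; the class name `AllocationSampleSketch` and the method `newStream` are hypothetical. The summed weights are a statistical estimate of allocation pressure, not an exact byte count.

```java
import java.util.concurrent.atomic.AtomicLong;
import jdk.jfr.consumer.RecordingStream;

final class AllocationSampleSketch {

    /** Builds a stream that adds each sampled event's weight to the given total. */
    static RecordingStream newStream(AtomicLong sampledWeight) {
        RecordingStream stream = new RecordingStream();
        // Throttled event; the unthrottled jdk.ObjectAllocationInNewTLAB and
        // jdk.ObjectAllocationOutsideTLAB events are deliberately not enabled here.
        stream.enable("jdk.ObjectAllocationSample");
        stream.onEvent("jdk.ObjectAllocationSample",
            event -> sampledWeight.addAndGet(event.getLong("weight")));
        return stream;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicLong sampledWeight = new AtomicLong();
        try (RecordingStream stream = newStream(sampledWeight)) {
            stream.startAsync();
            // Allocate enough to trigger TLAB refills so that some samples are emitted.
            byte[][] junk = new byte[1024][];
            for (int i = 0; i < junk.length; i++) {
                junk[i] = new byte[64 * 1024];
            }
            Thread.sleep(2000); // give the stream time to flush events
        }
        System.out.println("sampled allocation weight (estimate): " + sampledWeight.get() + " By");
    }
}
```

Any metric built this way would need its description to state that the values come from samples, which is the concern raised earlier in the thread.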

instrument: histogram
unit: "By"

- id: metric.process.runtime.jvm.network.io
type: metric
metric_name: process.runtime.jvm.network.io
brief: "Bytes read/written by thread. Only available in JDK 17+."
extends: attributes.process.runtime.jvm.network
instrument: histogram
unit: "By"

- id: metric.process.runtime.jvm.network.time
type: metric
metric_name: process.runtime.jvm.network.time
brief: "Duration of network IO operation by thread. Only available in JDK 17+."
extends: attributes.process.runtime.jvm.network
instrument: histogram
unit: "s"
@@ -36,6 +36,12 @@ semantic conventions when instrumenting runtime environments.
* [Metric: `process.runtime.jvm.buffer.usage`](#metric-processruntimejvmbufferusage)
* [Metric: `process.runtime.jvm.buffer.limit`](#metric-processruntimejvmbufferlimit)
* [Metric: `process.runtime.jvm.buffer.count`](#metric-processruntimejvmbuffercount)
* [Metric: `process.runtime.jvm.cpu.monitor.time`](#metric-processruntimejvmcpumonitortime)
* [Metric: `process.runtime.jvm.cpu.context_switch`](#metric-processruntimejvmcpucontext_switch)
* [Metric: `process.runtime.jvm.memory.allocation`](#metric-processruntimejvmmemoryallocation)
* [Metric: `process.runtime.jvm.network.io`](#metric-processruntimejvmnetworkio)
* [Metric: `process.runtime.jvm.network.time`](#metric-processruntimejvmnetworktime)

<!-- tocstop -->

@@ -377,3 +383,78 @@ This metric is [recommended](../metric-requirement-level.md#recommended).

**[1]:** Pool names are generally obtained via [BufferPoolMXBean#getName()](https://docs.oracle.com/en/java/javase/11/docs/api/java.management/java/lang/management/BufferPoolMXBean.html#getName()).
<!-- endsemconv -->

### Metric: `process.runtime.jvm.cpu.monitor.time`

This metric is [recommended](../metric-requirement-level.md#recommended). Only available with JDK 17+.

<!-- semconv metric.process.runtime.jvm.cpu.monitor.time(metric_table) -->
| Name | Instrument Type | Unit (UCUM) | Description |
| -------- | --------------- | ----------- | -------------- |
| `process.runtime.jvm.cpu.monitor.time` | Histogram | `s` | Time monitor was used by a thread. Only available in JDK 17+. |
<!-- endsemconv -->

<!-- semconv metric.process.runtime.jvm.cpu.monitor.time(full) -->
| Attribute | Type | Description | Examples | Requirement Level |
|---|---|---|---|---|
| `class` | string | Class of the monitor. | `java.lang.Object` | Opt-In |
| `state` | string | Action taken at monitor. | `blocked`; `wait` | Recommended |
| [`thread.id`](../../trace/semantic_conventions/span-general.md) | int | Current "managed" thread ID (as opposed to OS thread ID). | `42` | Opt-In |
<!-- endsemconv -->

### Metric: `process.runtime.jvm.cpu.context_switch`

This metric is [recommended](../metric-requirement-level.md#recommended). Only available with JDK 17+.

<!-- semconv metric.process.runtime.jvm.cpu.context_switch(metric_table) -->
| Name | Instrument Type | Unit (UCUM) | Description |
| -------- | --------------- | ----------- | -------------- |
| `process.runtime.jvm.context_switches` | UpDownCounter | `Hz` | Number of context switches per second. Only available in JDK 17+. |
<!-- endsemconv -->



### Metric: `process.runtime.jvm.memory.allocation`

This metric is [recommended](../metric-requirement-level.md#recommended). Only available with JDK 17+.

<!-- semconv metric.process.runtime.jvm.memory.allocation(metric_table) -->
| Name | Instrument Type | Unit (UCUM) | Description |
| -------- | --------------- | ----------- | -------------- |
| `process.runtime.jvm.memory.allocation` | Histogram | `By` | Size of object allocated by thread. Only available in JDK 17+. |
<!-- endsemconv -->

### Metric: `process.runtime.jvm.network.io`

This metric is [recommended](../metric-requirement-level.md#recommended). Only available with JDK 17+.

<!-- semconv metric.process.runtime.jvm.network.io(metric_table) -->
| Name | Instrument Type | Unit (UCUM) | Description |
| -------- | --------------- | ----------- | -------------- |
| `process.runtime.jvm.network.io` | Histogram | `By` | Bytes read/written by thread. Only available in JDK 17+. |
<!-- endsemconv -->

<!-- semconv metric.process.runtime.jvm.network.io(full) -->
| Attribute | Type | Description | Examples | Requirement Level |
|---|---|---|---|---|
| `network.direction` | string | Read or write. | `read`; `write` | Recommended |
| [`thread.id`](../../trace/semantic_conventions/span-general.md) | int | Current "managed" thread ID (as opposed to OS thread ID). | `42` | Opt-In |
<!-- endsemconv -->

### Metric: `process.runtime.jvm.network.time`

This metric is [recommended](../metric-requirement-level.md#recommended). Only available with JDK 17+.

<!-- semconv metric.process.runtime.jvm.network.time(metric_table) -->
| Name | Instrument Type | Unit (UCUM) | Description |
| -------- | --------------- | ----------- | -------------- |
| `process.runtime.jvm.network.time` | Histogram | `s` | Duration of network IO operation by thread. Only available in JDK 17+. |
<!-- endsemconv -->

<!-- semconv metric.process.runtime.jvm.network.time(full) -->
| Attribute | Type | Description | Examples | Requirement Level |
|---|---|---|---|---|
| `network.direction` | string | Read or write. | `read`; `write` | Recommended |
| [`thread.id`](../../trace/semantic_conventions/span-general.md) | int | Current "managed" thread ID (as opposed to OS thread ID). | `42` | Opt-In |
<!-- endsemconv -->