Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

model: introduce separate types for JSON encoding #35

Merged
merged 3 commits into from
May 11, 2023
Merged

Conversation

axw
Copy link
Member

@axw axw commented May 11, 2023

Introduce an internal package with types for encoding to Elasticsearch document JSON structure, remove mapStr conversion. The conversion between these types is manual, just like the old "fields" methods that are now removed.

There is a structural change to the metricset type. Rather than adding fields directly to the top of the output, we now nest metrics under metricset.samples. This will need an accompanying change to the metric data stream ingest pipelines.

The JSON encoding methods are generated by fastjson. I've added a simple benchmark, and compared against main:

goos: linux
goarch: amd64
pkg: github.com/elastic/apm-data/model
cpu: AMD Ryzen 7 PRO 5850U with Radeon Graphics
                   │ /tmp/old.txt │            /tmp/new.txt            │
                   │    sec/op    │   sec/op     vs base               │
MarshalFastJSON-16   20.396µ ± 4%   6.278µ ± 3%  -69.22% (p=0.002 n=6)

                   │ /tmp/old.txt  │            /tmp/new.txt
│
                   │     B/op      │     B/op      vs base
│
MarshalFastJSON-16   12.332Ki ± 0%   1.844Ki ± 0%  -85.05% (p=0.002 n=6)

                   │ /tmp/old.txt │           /tmp/new.txt            │
                   │  allocs/op   │ allocs/op   vs base               │
MarshalFastJSON-16    116.00 ± 0%   31.00 ± 0%  -73.28% (p=0.002 n=6)
``

@axw axw force-pushed the modeljson branch 4 times, most recently from df0ce79 to bd3181f Compare May 11, 2023 05:03
Introduce an internal package with types for encoding
to Elasticsearch document JSON structure, remove mapStr
conversion. The conversion between these types is manual,
just like the old "fields" methods that are now removed.

The JSON encoding methods are generated by fastjson.
I've added a simple benchmark, and compared against main:

```
goos: linux
goarch: amd64
pkg: github.com/elastic/apm-data/model
cpu: AMD Ryzen 7 PRO 5850U with Radeon Graphics
                   │ /tmp/old.txt │            /tmp/new.txt            │
                   │    sec/op    │   sec/op     vs base               │
MarshalFastJSON-16   20.396µ ± 4%   6.278µ ± 3%  -69.22% (p=0.002 n=6)

                   │ /tmp/old.txt  │            /tmp/new.txt
│
                   │     B/op      │     B/op      vs base
│
MarshalFastJSON-16   12.332Ki ± 0%   1.844Ki ± 0%  -85.05% (p=0.002 n=6)

                   │ /tmp/old.txt │           /tmp/new.txt            │
                   │  allocs/op   │ allocs/op   vs base               │
MarshalFastJSON-16    116.00 ± 0%   31.00 ± 0%  -73.28% (p=0.002 n=6)
``
@axw axw marked this pull request as ready for review May 11, 2023 06:15
@axw axw requested a review from a team May 11, 2023 06:15
axw added 2 commits May 11, 2023 14:43
Revert to flattened data_stream.* fields, and defer
stringification of netip.Addr to the JSON encoding,
where we can append to the encoded JSON directly.
Copy link
Contributor

@simitt simitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this approach looks reasonable, just left some minor comments

@simitt
Copy link
Contributor

simitt commented May 11, 2023

There is a structural change to the metricset type. Rather than adding fields directly to the top of the output, we now nest metrics under metricset.samples. This will need an accompanying change to the metric data stream ingest pipelines.

Do I understand this correct that you are planning on adding an ingest pipeline to the apm package, which ensures that the metrics are set to top level again and not cause any changes in the ES data streams?

@axw
Copy link
Member Author

axw commented May 11, 2023

Do I understand this correct that you are planning on adding an ingest pipeline to the apm package, which ensures that the metrics are set to top level again and not cause any changes in the ES data streams?

Correct, already done in elastic/apm-server#10792

@axw
Copy link
Member Author

axw commented May 11, 2023

Though we might want to merge the apmpackage change before updating apm-data, and wait for it to get picked up by Kibana to avoid breaking things temporarily.

Copy link
Contributor

@simitt simitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes LGTM

@axw
Copy link
Member Author

axw commented May 11, 2023

I'll merge this and then split the apmpackage changes out of the apm-server PR.

@axw axw merged commit caaa44e into elastic:main May 11, 2023
@axw axw deleted the modeljson branch May 11, 2023 09:55
SylvainJuge pushed a commit to SylvainJuge/apm-data that referenced this pull request Jun 2, 2023
* model: introduce separate types for JSON encoding

Introduce an internal package with types for encoding
to Elasticsearch document JSON structure, remove mapStr
conversion. The conversion between these types is manual,
just like the old "fields" methods that are now removed.

The JSON encoding methods are generated by fastjson.
I've added a simple benchmark, and compared against main:

```
goos: linux
goarch: amd64
pkg: github.com/elastic/apm-data/model
cpu: AMD Ryzen 7 PRO 5850U with Radeon Graphics
                   │ /tmp/old.txt │            /tmp/new.txt            │
                   │    sec/op    │   sec/op     vs base               │
MarshalFastJSON-16   20.396µ ± 4%   6.278µ ± 3%  -69.22% (p=0.002 n=6)

                   │ /tmp/old.txt  │            /tmp/new.txt
│
                   │     B/op      │     B/op      vs base
│
MarshalFastJSON-16   12.332Ki ± 0%   1.844Ki ± 0%  -85.05% (p=0.002 n=6)

                   │ /tmp/old.txt │           /tmp/new.txt            │
                   │  allocs/op   │ allocs/op   vs base               │
MarshalFastJSON-16    116.00 ± 0%   31.00 ± 0%  -73.28% (p=0.002 n=6)
``

* Revert to flattened data_stream.*; use netip.Addr

Revert to flattened data_stream.* fields, and defer
stringification of netip.Addr to the JSON encoding,
where we can append to the encoded JSON directly.

* Fix field alignment
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants