internal/testrunner/runners/pipeline: output documents with fields in a normalized order #644

Merged Jan 13, 2022 (1 commit)

@efd6 efd6 commented Jan 13, 2022

The unstable order of fields output by ES commonly results in unrelated diff noise in changes to packages. This change normalises the order of fields lexically to reduce that problem over time. There will be an initial increase in diff noise as each package is brought into the canonical order.

For example, with the Sophos XG test set:

Before:

{
    "expected": [
        {
            "server": {
                "port": 0,
                "bytes": 0
            },
            "log": {
                "level": "informational"
            },
            "destination": {
                "port": 0,
                "user": {
                    "email": "[email protected]"
                },
                "bytes": 0
            },
            "source": {
                "port": 0,
                "user": {
                    "email": "[email protected]"
                },
                "bytes": 0,
                "domain": "elasticuser.com"
            },
            "tags": [
                "preserve_original_event"
            ],
            "network": {
                "transport": "TCP"
            },
            "observer": {
                "product": "XG",
                "serial_number": "1234567890123456",
                "type": "firewall",
                "vendor": "Sophos"
            },
            "@timestamp": "2020-05-18T14:38:48.000Z",
            "ecs": {
                "version": "1.12.0"
            },
            "related": {
                "hosts": [
                    "testhost.local"
                ]
            },
            "sophos": {
                "xg": {
                    "fw_rule_id": "0",
                    ...

After:

{
    "expected": [
        {
            "@timestamp": "2020-05-18T14:38:48.000Z",
            "client": {
                "bytes": 0,
                "port": 0
            },
            "destination": {
                "bytes": 0,
                "port": 0,
                "user": {
                    "email": "[email protected]"
                }
            },
            "ecs": {
                "version": "1.12.0"
            },
            "event": {
                "action": "Allowed",
                "category": [
                    "network"
                ],
                "code": "041101618035",
                "ingested": "2022-01-13T06:37:18.072532500Z",
                "kind": "event",
                "original": "device=\"SFW\" date=2020-05-18 time=14:38:48 timezone=\"CEST\" device_name=\"XG230\" device_id=1234567890123456 log_id=041101618035 log_type=\"Anti-Spam\" log_component=\"SMTP\" log_subtype=\"Allowed\" status=\"\" priority=Information fw_rule_id=0 user_name=\"\" av_policy_name=\"None\" from_email_address=\"[email protected]\" to_email_address=\"[email protected]\" email_subject=\"*ALERT* Sophos XG Firewall\" mailid=\"qkW2Y6-LxBk6U-vH-1590055245\" mailsize=19728 spamaction=\"QUEUED\" reason=\"Email has been accepted by Device and queued for scanning.\" src_domainname=\"elasticuser.com\" dst_domainname=\"\" src_ip=\"\" src_country_code=\"\" dst_ip=\"\" dst_country_code=\"\" protocol=\"TCP\" src_port=0 dst_port=0 sent_bytes=0 recv_bytes=0 quarantine_reason=\"Other\"",
                "outcome": "success",
                "severity": 6,
                "type": [
                    "allowed",
                    "connection"
                ]
            },
            "host": {
                "name": "testhost.local"
            },
            "log": {
                "level": "informational"
            },
            "network": {
                "transport": "TCP"
            },
            "observer": {
                "product": "XG",
                "serial_number": "1234567890123456",
                "type": "firewall",
                "vendor": "Sophos"
            },
            ...
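
The key order in the After document is what plain byte-wise string sorting would produce. A minimal sketch, assuming the top-level keys from the Before excerpt and a hypothetical helper named `sortedKeys` (not part of the change itself):

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys is a hypothetical helper for illustration only. It returns a
// sorted copy of keys, using Go's byte-wise string comparison.
func sortedKeys(keys []string) []string {
	out := append([]string(nil), keys...) // copy so the input is not mutated
	sort.Strings(out)
	return out
}

func main() {
	// Top-level keys in the order they appear in the Before excerpt.
	before := []string{
		"server", "log", "destination", "source", "tags", "network",
		"observer", "@timestamp", "ecs", "related", "sophos",
	}
	for _, k := range sortedKeys(before) {
		fmt.Println(k)
	}
}
```

Because '@' (0x40) compares lower than any lowercase ASCII letter, "@timestamp" moves to the top, which matches the After excerpt. Uppercase keys would likewise sort ahead of lowercase ones under this comparison.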

Please take a look.

@efd6 efd6 added the enhancement, Team:Elastic-Agent, and Team:Security-External Integrations labels Jan 13, 2022
@elasticmachine
Collaborator

Pinging @elastic/elastic-agent (Team:Elastic-Agent)

@elasticmachine
Collaborator

elasticmachine commented Jan 13, 2022

💚 Build Succeeded

Build stats

  • Start Time: 2022-01-13T09:23:21.473+0000

  • Duration: 27 min 42 sec

  • Commit: 06955fa

Test stats 🧪

Test Results
Failed 0
Passed 459
Skipped 0
Total 459

@mtojek mtojek requested a review from a team January 13, 2022 07:39
	if err != nil {
		return nil, errors.Wrap(err, "marshalling test result definition failed")
	}
	return body, nil
}

// normalizeFields ensures that field order remains consistent independent
// of field order returned by ES to minimize diff noise during changes.
func normalizeFields(msg []byte, err error) ([]byte, error) {
Contributor

It's a bit tangled that marshalTestResultDefinition has to call json.Marshal first and then a few other json.* operations. What do you think about normalizing testResultDefinition instead?

Contributor Author

@efd6 efd6 Jan 13, 2022

I had a look at that. Making the original unmarshal operation target a []interface{} in testResultDefinition would have that effect, but there is a significant amount of code that depends on the presence of []byte JSON during the tests. I'm not wildly happy with this approach, but it is the least invasive. If it were in a hotter section, I think it would likely be worth disentangling, but that's not the case.

It's worth noting that the initial json.Marshal call does very little work, essentially just pasting the slice of messages between {"expected":[ and ]} with comma separators. The only real work is done by the unmarshal/remarshal, which is essentially the same as constructing an AST for the JSON and then formatting it canonically.
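
The unmarshal/remarshal round trip described here can be sketched as follows. This is an illustration under assumptions, not the code from the change: the helper name `normalizeJSON`, the four-space indent, and the choices to preserve number literals and disable HTML escaping are all mine. It relies on the documented behaviour of encoding/json, which writes map keys in sorted order.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// normalizeJSON is a hypothetical helper (not the PR's implementation).
// It decodes a JSON document into generic Go values and re-encodes it.
// Objects become map[string]interface{}, and encoding/json emits map keys
// in sorted order, so the output field order is stable.
func normalizeJSON(msg []byte) ([]byte, error) {
	dec := json.NewDecoder(bytes.NewReader(msg))
	dec.UseNumber() // keep number literals as written instead of float64
	var v interface{}
	if err := dec.Decode(&v); err != nil {
		return nil, err
	}
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	enc.SetEscapeHTML(false) // leave <, > and & unescaped
	enc.SetIndent("", "    ")
	if err := enc.Encode(v); err != nil {
		return nil, err
	}
	// Encode appends a trailing newline; drop it.
	return bytes.TrimRight(buf.Bytes(), "\n"), nil
}

func main() {
	a := []byte(`{"server":{"port":0,"bytes":0},"@timestamp":"2020-05-18T14:38:48.000Z"}`)
	b := []byte(`{"@timestamp":"2020-05-18T14:38:48.000Z","server":{"bytes":0,"port":0}}`)
	na, err := normalizeJSON(a)
	if err != nil {
		panic(err)
	}
	nb, err := normalizeJSON(b)
	if err != nil {
		panic(err)
	}
	// Both inputs hold the same fields in different orders, so the
	// normalized bytes should be identical.
	fmt.Println(bytes.Equal(na, nb))
	fmt.Printf("%s\n", na)
}
```

Array element order is left untouched by this round trip; only object keys are reordered.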

Contributor

I see the complexity now. Maybe just put json.Marshal inside normalizeFields and rename it to marshalNormalized?

Contributor Author

Done

@efd6 efd6 force-pushed the normalisedfields branch from 45942b1 to 06955fa Compare January 13, 2022 09:22
@mtojek mtojek merged commit 8e6f142 into elastic:master Jan 13, 2022
@efd6 efd6 deleted the normalisedfields branch January 13, 2022 10:00

3 participants