
apm-data: Use representative count as event.success_count if available #119995

Merged 12 commits into main from fix-representative-count on Feb 20, 2025

Conversation

@rubvs (Contributor) commented Jan 10, 2025:

Closes elastic/apm-data#388

This pipeline change copies the representative count calculated in APM Data (see Code) into the event.success_count metric. The default representative count is 1.

If an event (which can only be a transaction or a span in this context) fails for some reason, the event.success_count metric is set to zero. To my understanding, we are not capturing any data related to failed events, nor do I see any usage of something like an event.failure_count metric. Please correct me if I'm wrong.
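For context, the core of the change boils down to the three processors below, extracted from the full simulation in the Test section: success defaults event.success_count to 1, failure forces it to 0, and the script overrides the default with the representative count when one is present.

{
  "set": {
    "if": "ctx.event?.outcome == 'failure'",
    "field": "event.success_count",
    "value": 0
  }
},
{
  "set": {
    "if": "ctx.event?.outcome == 'success'",
    "field": "event.success_count",
    "value": 1
  }
},
{
  "script": {
    "if": "ctx.event?.outcome == 'success' && ctx[ctx.processor?.event]?.representative_count != null",
    "source": "ctx.event.success_count = ctx[ctx.processor?.event].representative_count;"
  }
}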

Along with the unit test provided, the pipeline change was also tested manually in the Kibana Console:

Test

  1. Create a cloud deployment with any version.
  2. In Dev Tools, paste the pipeline simulation below and run it.
  3. Verify that the representative_count value is the same as that of event.success_count.
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "description": "Test APM pipeline",
    "processors": [
      {
        "set": {
          "if": "ctx.transaction?.type != null",
          "field": "processor.event",
          "value": "transaction"
        }
      },
      {
        "set": {
          "field": "processor.event",
          "value": "span",
          "override": false
        }
      },
      {
        "script": {
          "if": "ctx[ctx.processor.event]?.duration == null",
          "source": """
            def eventDuration = ctx.event?.duration ?: 0;
            def rootObjectName = ctx.processor?.event;
            def rootObject = ctx[rootObjectName];
            if (rootObject == null) {
              rootObject = [:];
              ctx[rootObjectName] = rootObject;
            }
            rootObject.duration = ["us": (long)(eventDuration/1000)];
          """
        }
      },
      {
        "remove": {
          "field": ["event.duration"],
          "ignore_failure": true,
          "ignore_missing": true
        }
      },
      {
        "set": {
          "if": "ctx.event?.outcome == 'failure'",
          "field": "event.success_count",
          "value": 0
        }
      },
      {
        "set": {
          "if": "ctx.event?.outcome == 'success'",
          "field": "event.success_count",
          "value": 1
        }
      },
      {
        "script": {
          "if": "ctx.event?.outcome == 'success' && ctx[ctx.processor?.event]?.representative_count != null",
          "source": "ctx.event.success_count = ctx[ctx.processor?.event].representative_count;"
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "processor": {
          "event": "transaction"
        },
        "transaction": {
          "representative_count": 2,
          "type": "request",
          "duration": {
            "us": 1234567
          }
        },
        "event": {
          "duration": 1234567000,
          "outcome": "success"
        }
      }
    },
    {
      "_source": {
        "processor": {
          "event": "span"
        },
        "span": {
          "representative_count": 3,
          "type": "request",
          "duration": {
            "us": 1234567
          }
        },
        "event": {
          "duration": 1234567000,
          "outcome": "success"
        }
      }
    },
    {
      "_source": {
        "processor": {
            "event": "transaction"
        },
        "transaction": {
          "representative_count": 2,
          "type": "request",
          "duration": {
            "us": 1234567
          }
        },
        "event": {
          "duration": 1234567000,
          "outcome": "failure"
        }
      }
    },
    {
      "_source": {
        "processor": {
          "event": "span"
        },
        "span": {
          "representative_count": 3,
          "type": "request",
          "duration": {
            "us": 1234567
          }
        },
        "event": {
          "duration": 1234567000,
          "outcome": "failure"
        }
      }
    },
    {
      "_source": {
        "processor": {
          "event": "span"
        },
        "span": {
          "representative_count": null,
          "type": "request",
          "duration": {
            "us": 1234567
          }
        },
        "event": {
          "duration": 1234567000,
          "outcome": "success"
        }
      }
    },
    {
      "_source": {
        "processor": {
          "event": "transaction"
        },
        "span": {
          "type": "request",
          "duration": {
            "us": 1234567
          }
        },
        "event": {
          "duration": 1234567000,
          "outcome": "success"
        }
      }
    }
  ]
}
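Tracing the processors above by hand, the expected event.success_count for the six simulated documents (in order) is:

  1. transaction, success, representative_count 2 → 2
  2. span, success, representative_count 3 → 3
  3. transaction, failure → 0
  4. span, failure → 0
  5. span, success, representative_count null → 1 (the default for success)
  6. processor.event is transaction but no transaction object with a representative_count exists → 1 (the override condition does not match)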

@joegallo added the >enhancement, Team:Data Management, and external-contributor labels on Jan 10, 2025
@rubvs requested review from lahsivjar and carsonip on January 10, 2025
@rubvs marked this pull request as ready for review on February 19, 2025
@rubvs requested a review from a team as a code owner on February 19, 2025
@elasticsearchmachine added the needs:triage label and removed the Team:Data Management label on Feb 19, 2025
@rubvs added the Team:obs-ds-intake-services label on Feb 19, 2025
@elasticsearchmachine removed the Team:obs-ds-intake-services label on Feb 19, 2025
@rubvs requested a review from carsonip on February 19, 2025
@rubvs added the Team:Data Management label on Feb 19, 2025
@elasticsearchmachine removed the Team:Data Management label on Feb 19, 2025
@rubvs requested a review from joegallo on February 19, 2025
@joegallo added the Team:Data Management label and removed the needs:triage label on Feb 19, 2025
@carsonip self-requested a review on February 20, 2025
@rubvs changed the title from "Add representative count to apm-data plugin" to "apm-data: Use representative count as event.success_count if available" on Feb 20, 2025
@rubvs added the >bug label and removed the >enhancement label on Feb 20, 2025
@elasticsearchmachine (Collaborator) commented:

Hi @rubvs, I've updated the changelog YAML for you.

@carsonip (Member) left a comment:

Thanks for covering the case I missed and also adding the corresponding test case.

@simitt (Contributor) left a comment:

LGTM once the changelog is fixed.

@rubvs force-pushed the fix-representative-count branch from d1cd1c6 to 3af38c3 on February 20, 2025 17:52
@rubvs merged commit 171a3b9 into main on Feb 20, 2025 (18 checks passed)
@rubvs deleted the fix-representative-count branch on February 20, 2025 19:45
@elasticsearchmachine (Collaborator) commented:

💔 Backport failed

Branch: 8.x
Result: Commit could not be cherry-picked due to conflicts

You can use sqren/backport to manually backport by running: backport --upstream elastic/elasticsearch --pr 119995

@dakrone (Member) commented Feb 20, 2025:

This PR doesn't appear to have bumped the version. Was the version bumped elsewhere so that it will still be installed, or do we need a separate PR so that the version is incremented and the new pipeline installed?

@rubvs (Contributor, Author) commented Feb 20, 2025:

@dakrone I believe we will need to do this manually with a new PR. But I'm not sure; this is my first merge into ES. How can I check if it was bumped elsewhere?

@dakrone (Member) commented Feb 20, 2025:

You can check the git blame for this line:

# "version" holds the version of the templates and ingest pipelines installed
# by xpack-plugin apm-data. This must be increased whenever an existing template or
# pipeline is changed, in order for it to be updated on Elasticsearch upgrade.
version: 12

(which is also the value you would need to increment).

It appears it hasn't been bumped since November of last year, so incrementing it would be necessary (just change it from 12 -> 13).
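In diff form, the change to that resources YAML would be a one-liner (a minimal sketch of the version bump described above):

-version: 12
+version: 13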

@rubvs (Contributor, Author) commented Feb 20, 2025:

Thanks. I'm also busy fixing the merge issue by cherry-picking my commit, but I don't see a v8.19 tag in the ES repo?

@dakrone (Member) commented Feb 20, 2025:

That's because v8.19 hasn't been released yet. If you're trying to backport, you should backport to the 8.x branch.
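For reference, a minimal sketch of the manual route (assumed setup: a local clone with elastic/elasticsearch as the upstream remote; 171a3b9 is the commit merged above, and the branch name is illustrative):

git fetch upstream 8.x
git checkout -b backport-119995-8.x upstream/8.x
# Cherry-pick the merged commit; resolve any conflicts, then run
# `git cherry-pick --continue`. If 171a3b9 is a merge commit rather
# than a squashed commit, add `-m 1`.
git cherry-pick 171a3b9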

afoucret pushed a commit to afoucret/elasticsearch that referenced this pull request Feb 21, 2025
@rubvs (Contributor, Author) commented Feb 21, 2025:

💚 All backports created successfully

Branch: 8.x

Questions? Please refer to the Backport tool documentation.

Labels: auto-backport (automatically create backport pull requests when merged), backport pending, >bug, :Data Management/Ingest Node (execution or management of ingest pipelines, including GeoIP), external-contributor (pull request authored by a developer outside the Elasticsearch team), Team:Data Management (meta label for data/management team), v8.19.0, v9.1.0

Development: Successfully merging this pull request may close the issue "event.success_count metric should use representative count" (elastic/apm-data#388).

8 participants