Make elasticsearch/ml_job metricset work for Stack Monitoring without xpack.enabled flag #21043

@sayden sayden commented Sep 9, 2020

WIP

@sayden sayden added Metricbeat Metricbeat Feature:Stack Monitoring Team:Services (Deprecated) Label for the former Integrations-Services team labels Sep 9, 2020
@sayden sayden self-assigned this Sep 9, 2020
@elasticmachine

Pinging @elastic/stack-monitoring (Stack monitoring)

@elasticmachine

Pinging @elastic/integrations-services (Team:Services)

@botelastic botelastic bot added needs_team Indicates that the issue/PR needs a Team:* label and removed needs_team Indicates that the issue/PR needs a Team:* label labels Sep 9, 2020
@elasticmachine

elasticmachine commented Sep 9, 2020

💚 Build Succeeded

Build stats

  • Build Cause: Pull request #21043 updated

  • Start Time: 2020-12-14T17:44:45.342+0000

  • Duration: 54 min 34 sec

Test stats 🧪

Test Results

  • Failed: 0

  • Passed: 2312

  • Skipped: 529

  • Total: 2841

Steps errors 2

Terraform Apply on x-pack/metricbeat/module/aws
  • Took 0 min 15 sec
Terraform Apply on x-pack/metricbeat/module/aws
  • Took 0 min 15 sec

💚 Flaky test report

Tests succeeded.

@chrisronline

When running this, I am seeing this error:

2020-09-16T14:14:27.128-0400	WARN	[elasticsearch]	elasticsearch/client.go:407	Cannot index event publisher.Event{Content:beat.Event{Timestamp:time.Time{wall:0xbfd0b36081e8ff98, ext:80088391016, loc:(*time.Location)(0x840ef00)}, Meta:null, Fields:{"agent":{"ephemeral_id":"2be51d36-81dc-4a75-ad07-8fc4d3e28984","id":"5ca7fc45-dad9-4ed5-afa0-ff17fd83a2e1","name":"cbr-mbp.lan","type":"metricbeat","version":"8.0.0"},"ecs":{"version":"1.5.0"},"elasticsearch":{"cluster":{"id":"PChw572_S52efCun1M0eNg","name":"elasticsearch"},"ml":{"job":{"data_counts":{"bucket":{"count":1,"empty":{"count":0},"sparse":{"count":0}},"field":{"processed":{"count":2}},"input":{"bytes":83,"field":{"count":2}},"invalid_date":{"count":0},"last_data_time":1600180598576,"missing_field":{"count":0},"out_of_order":{"timestamp":{"count":0}},"record":{"earliest":{"ms":1599586175000},"input":{"count":2},"latest":{"ms":1599586775000},"processed":{"count":2}}},"id":"test","model_size":{"bucket_allocation_failures":{"count":0},"log_time":{"ms":1600180598860},"model":{"bytes":160412},"result_type":"model_size_stats","timestamp":{"ms":1599585300000},"total":{"field":{"by":{"count":3},"over":{"count":0},"partition":{"count":2}}}},"state":"closed"}}},"event":{"dataset":"elasticsearch.ml.job","duration":3117145,"module":"elasticsearch"},"host":{"architecture":"x86_64","hostname":"cbr-mbp.lan","id":"0E799C52-1A23-58FB-8293-749EDE93385A","ip":["fe80::aede:48ff:fe00:1122","fe80::b7f4:f2fd:ae7d:1173","fe80::931b:e13c:3a6b:d6bf","fe80::2838:c952:b649:8a60","fe80::17bf:f7b:6891:c7b2","fe80::47fd:cdd0:5199:d5b8","fe80::4556:e4ec:5cdc:803a","fe80::8697:d998:409e:f8","fe80::1ca6:6d6d:8bc8:f59f","fe80::180b:f101:eb29:6312","192.168.86.227","fe80::99da:f30e:209a:647e","fe80::4306:9d18:5860:23ca"],"mac":["ac:de:48:00:11:22","3a:f9:d3:be:e5:4f","38:f9:d3:be:e5:4f","82:41:7e:c3:9c:00","82:41:7e:c3:9c:01","82:41:7e:c3:9c:05","82:41:7e:c3:9c:04","82:41:7e:c3:9c:01","0a:f9:d3:be:e5:4f","5e:d8:74:3d:04:03","5e:d8:74:3d:04:03","68:5b:35:cc:cb:1e"],"name":"cbr-mbp.lan","os":{"build":"19G2021","family":"darwin","kernel":"19.6.0","name":"Mac OS X","platform":"darwin","version":"10.15.6"}},"metricset":{"name":"ml_job","period":10000},"service":{"address":"localhost:9200","name":"elasticsearch","type":"elasticsearch"}}, Private:interface {}(nil), TimeSeries:true}, Flags:0x0, Cache:publisher.EventCache{m:common.MapStr(nil)}} (status=400): {"type":"mapper_parsing_exception","reason":"failed to parse field [elasticsearch.ml.job.model_size.result_type] of type [long] in document with id 'zwUfmHQBX6NiqvKh-8jd'. Preview of field's value: 'model_size_stats'","caused_by":{"type":"illegal_argument_exception","reason":"For input string: \"model_size_stats\""}}
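
The rejection in this log is a type conflict: the index maps elasticsearch.ml.job.model_size.result_type as long, but the event carries the string model_size_stats. The following is a minimal Python sketch of that kind of check, not anything from Metricbeat or Elasticsearch. The `EXPECTED_TYPES` table, the `flatten` and `find_type_conflicts` helpers, and the trimmed `event` are all hypothetical, and the check is deliberately stricter than Elasticsearch, which can coerce numeric strings.

```python
from typing import Any, Dict, List

# Hypothetical excerpt of an index mapping: dotted field path -> declared type.
EXPECTED_TYPES: Dict[str, str] = {
    "elasticsearch.ml.job.model_size.result_type": "long",
    "elasticsearch.ml.job.model_size.model.bytes": "long",
    "elasticsearch.ml.job.state": "keyword",
}


def flatten(doc: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Turn a nested document into a {dotted.path: leaf_value} dict."""
    flat: Dict[str, Any] = {}
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def find_type_conflicts(doc: Dict[str, Any], expected: Dict[str, str]) -> List[str]:
    """Return the dotted paths whose value cannot be stored as the declared type."""
    conflicts: List[str] = []
    for path, value in flatten(doc).items():
        declared = expected.get(path)
        if declared == "long":
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, int):
                conflicts.append(path)
        elif declared == "keyword" and not isinstance(value, str):
            conflicts.append(path)
    return conflicts


# Trimmed copy of the relevant part of the event from the log.
event = {
    "elasticsearch": {
        "ml": {
            "job": {
                "model_size": {
                    "result_type": "model_size_stats",
                    "model": {"bytes": 160412},
                },
                "state": "closed",
            }
        }
    }
}

conflicts = find_type_conflicts(event, EXPECTED_TYPES)
```

Run against the trimmed event, only the result_type path is reported, which matches the field named in the mapper_parsing_exception.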

@chrisronline

This is now running successfully, thanks @sayden!

I'm seeing an issue though: the field in the .monitoring-es-* indices is job_stats.job_id, but the path in metricbeat-* is job.stats.job_id.
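
To illustrate why the difference between job_stats.job_id and job.stats.job_id matters to a reader of the data, here is a small Python sketch of a dotted-path lookup. The `get_path` helper and both sample documents are hypothetical and heavily simplified; only the two field paths come from the comment above.

```python
from typing import Any, Optional


def get_path(doc: dict, dotted: str) -> Optional[Any]:
    """Follow a dotted path through nested dicts; return None if any key is missing."""
    current: Any = doc
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


# Simplified stand-ins for a .monitoring-es-* document and a metricbeat-* document.
legacy_doc = {"job_stats": {"job_id": "test"}}
metricbeat_doc = {"job": {"stats": {"job_id": "test"}}}

legacy_value = get_path(legacy_doc, "job_stats.job_id")
metricbeat_value = get_path(metricbeat_doc, "job_stats.job_id")
```

The same path that resolves in the legacy-shaped document returns None for the Metricbeat-shaped one, so any consumer that hard-codes job_stats.job_id sees no job ID there.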

@sayden

sayden commented Oct 22, 2020

Ok! Fixed it! Sorry, it's easy for me to miss one of those fields; sometimes they all look the same 😄

@chrisronline

I found something else here too.

POST metricbeat-*/_search?filter_path=aggregations.types.buckets
{
  "size": 0,
  "aggs": {
    "types": {
      "terms": {
        "field": "metricset.name",
        "size": 10
      },
      "aggs": {
        "top": {
          "top_hits": {
            "size": 1,
            "sort": [
              {
                "@timestamp": "desc"
              }
            ],
            "_source": "@timestamp"
          }
        }
      }
    }
  }
}

->

{
  "aggregations" : {
    "types" : {
      "buckets" : [
        {
          "key" : "node_stats",
          "doc_count" : 120,
          "top" : {
            "hits" : {
              "total" : {
                "value" : 120,
                "relation" : "eq"
              },
              "max_score" : null,
              "hits" : [
                {
                  "_index" : "metricbeat-8.0.0-2020.10.23-000001",
                  "_id" : "QHuSVnUBHO_lL4jX4_2T",
                  "_score" : null,
                  "_source" : {
                    "@timestamp" : "2020-10-23T17:47:47.505Z"
                  },
                  "sort" : [
                    1603475267505
                  ]
                }
              ]
            }
          }
        },
        {
          "key" : "ml_job",
          "doc_count" : 51,
          "top" : {
            "hits" : {
              "total" : {
                "value" : 51,
                "relation" : "eq"
              },
              "max_score" : null,
              "hits" : [
                {
                  "_index" : "metricbeat-8.0.0-2020.10.23-000001",
                  "_id" : "PnuSVnUBHO_lL4jX4_2T",
                  "_score" : null,
                  "_source" : {
                    "@timestamp" : "2020-10-23T17:47:47.500Z"
                  },
                  "sort" : [
                    1603475267500
                  ]
                }
              ]
            }
          }
        }
      ]
    }
  }
}
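
For reading a response of this shape, a minimal Python sketch follows. It pulls the key, document count, and latest @timestamp out of each terms bucket. The `latest_per_metricset` helper is hypothetical, and `response` is a trimmed copy of the output above that assumes every bucket carries at least one top hit.

```python
from typing import Dict, Tuple


def latest_per_metricset(response: dict) -> Dict[str, Tuple[int, str]]:
    """Map each bucket key to (doc_count, @timestamp of its single top hit)."""
    summary: Dict[str, Tuple[int, str]] = {}
    for bucket in response["aggregations"]["types"]["buckets"]:
        top_hit = bucket["top"]["hits"]["hits"][0]
        summary[bucket["key"]] = (bucket["doc_count"], top_hit["_source"]["@timestamp"])
    return summary


# Trimmed copy of the response: only the fields the helper reads are kept.
response = {
    "aggregations": {
        "types": {
            "buckets": [
                {
                    "key": "node_stats",
                    "doc_count": 120,
                    "top": {"hits": {"hits": [
                        {"_source": {"@timestamp": "2020-10-23T17:47:47.505Z"}}
                    ]}},
                },
                {
                    "key": "ml_job",
                    "doc_count": 51,
                    "top": {"hits": {"hits": [
                        {"_source": {"@timestamp": "2020-10-23T17:47:47.500Z"}}
                    ]}},
                },
            ]
        }
    }
}

summary = latest_per_metricset(response)
```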

The metricset.name (and type) is ml_job but the query we run in the UI is looking for job_stats as the type:

{
  "index": ".monitoring-es-6-*,.monitoring-es-7-*,metricbeat-*",
  "size": 10000,
  "ignoreUnavailable": true,
  "filterPath": [
    "hits.hits._source.job_stats.job_id",
    "hits.hits._source.job_stats.state",
    "hits.hits._source.job_stats.data_counts.processed_record_count",
    "hits.hits._source.job_stats.model_size_stats.model_bytes",
    "hits.hits._source.job_stats.forecasts_stats.total",
    "hits.hits._source.job_stats.node.id",
    "hits.hits._source.job_stats.node.name"
  ],
  "body": {
    "sort": { "timestamp": { "order": "desc", "unmapped_type": "long" } },
    "collapse": { "field": "job_stats.job_id" },
    "query": {
      "bool": {
        "filter": [
          {
            "bool": {
              "should": [
                { "term": { "type": "job_stats" } },
                { "term": { "metricset.name": "job_stats" } }
              ]
            }
          },
          { "term": { "cluster_uuid": "KJYGWSvQQmqmG9f_X8TXRg" } },
          {
            "range": {
              "timestamp": {
                "format": "epoch_millis",
                "gte": 1603471642570,
                "lte": 1603475242570
              }
            }
          }
        ]
      }
    }
  }
}
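
The type filter in this query is a bool with only should clauses, so a document has to satisfy at least one of the two term clauses. A minimal Python sketch of that matching logic follows. The `matches_should_terms` helper is hypothetical, and documents are modelled as flat dicts keyed by dotted field name purely for illustration.

```python
from typing import Any, Dict, List

# The two term clauses copied from the UI query above.
SHOULD_CLAUSES: List[Dict[str, Dict[str, Any]]] = [
    {"term": {"type": "job_stats"}},
    {"term": {"metricset.name": "job_stats"}},
]


def matches_should_terms(doc: Dict[str, Any], should: List[Dict[str, Dict[str, Any]]]) -> bool:
    """Return True if the document satisfies at least one term clause."""
    for clause in should:
        for field, wanted in clause["term"].items():
            if doc.get(field) == wanted:
                return True
    return False


# Simplified documents: one shaped like the legacy collection, one like this PR's output.
legacy_doc = {"type": "job_stats"}
metricbeat_doc = {"type": "ml_job", "metricset.name": "ml_job"}

legacy_matches = matches_should_terms(legacy_doc, SHOULD_CLAUSES)
metricbeat_matches = matches_should_terms(metricbeat_doc, SHOULD_CLAUSES)
```

Under these assumptions the legacy-shaped document passes the filter and the ml_job document does not, which is the mismatch described above.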

It looks like the job_stats type is still coming through in the .monitoring-es-* indices with this PR:

POST .monitoring-es-*/_search?filter_path=aggregations.types.buckets
{
  "size": 0,
  "aggs": {
    "types": {
      "terms": {
        "field": "type",
        "size": 10
      }
    }
  }
}

->

{
  "aggregations" : {
    "types" : {
      "buckets" : [
        {
          "key" : "index_stats",
          "doc_count" : 107105
        },
        {
          "key" : "cluster_stats",
          "doc_count" : 9116
        },
        {
          "key" : "enrich_coordinator_stats",
          "doc_count" : 9116
        },
        {
          "key" : "index_recovery",
          "doc_count" : 9116
        },
        {
          "key" : "indices_stats",
          "doc_count" : 9116
        },
        {
          "key" : "node_stats",
          "doc_count" : 8986
        },
        {
          "key" : "ccr_auto_follow_stats",
          "doc_count" : 8933
        },
        {
          "key" : "shards",
          "doc_count" : 1276
        },
        {
          "key" : "job_stats",
          "doc_count" : 62
        }
      ]
    }
  }
}

…csearch/ml-xpack_flag

# Conflicts:
#	metricbeat/docs/fields.asciidoc
#	metricbeat/module/elasticsearch/_meta/fields.yml
#	metricbeat/module/elasticsearch/fields.go
…csearch/ml-xpack_flag

# Conflicts:
#	metricbeat/module/elasticsearch/fields.go
@sayden sayden requested a review from ycombinator December 9, 2020 15:15

@ycombinator ycombinator left a comment

Left a comment about a couple of breaking changes and a question about TODOs in field documentation.

"id": "3LbUkLkURz--FR-YO0wLNA",
"name": "es1"
"id": "8l_zoGznQRmtoX9iSC-goA",
"name": "docker-cluster"
},
"ml": {
"job": {
"data_counts": {

One thing that is confusing me is that in this data.json, the field path is elasticsearch.ml.job.data_counts.*. But if I look at fields.yml, I'm seeing elasticsearch.ml.job.data.counts.*. Shouldn't the two match up or, failing that, shouldn't there be aliases from all the elasticsearch.ml.job.data_counts.* fields to the corresponding elasticsearch.ml.job.data.counts.* fields so there are no breaking changes?
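
As a rough illustration of the alias option, the Python sketch below builds Elasticsearch alias field definitions (`"type": "alias"` with a `"path"`) that point each old data_counts path at its data.counts counterpart. The `build_aliases` helper and the leaf names passed to it are hypothetical; the real leaf fields would have to come from fields.yml. The result is a flat lookup keyed by dotted name, not a ready-to-apply mapping.

```python
from typing import Dict, Iterable

OLD_PREFIX = "elasticsearch.ml.job.data_counts"
NEW_PREFIX = "elasticsearch.ml.job.data.counts"


def build_aliases(leaf_paths: Iterable[str]) -> Dict[str, Dict[str, str]]:
    """Map each old dotted field name to an alias definition targeting the new name."""
    return {
        f"{OLD_PREFIX}.{leaf}": {"type": "alias", "path": f"{NEW_PREFIX}.{leaf}"}
        for leaf in leaf_paths
    }


# Hypothetical leaf names, used only to show the shape of the output.
aliases = build_aliases(["bucket.count", "invalid_date.count"])
```

With aliases like these in place, a query against the old data_counts path would resolve to the field stored under data.counts.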

@sayden

Yes, you're right. I want a pair of eyes like yours to spot such things 😄

@ycombinator ycombinator left a comment

LGTM.

@sayden sayden merged commit 19f2f7b into elastic:feature-stack-monitoring-mb-ecs Dec 15, 2020
leweafan pushed a commit to leweafan/beats that referenced this pull request Apr 28, 2023