Use Kibana time series visualizations for benchmark results
With this commit, night_rally uses Kibana for visualizing nightly and
release benchmark results. We also add an administration tool, e.g. for
adding annotations.

While we add support for these new visualizations now, we are in a
transition period and will still populate and use the current charts for
the time being. After a grace period, we will switch to using Kibana
exclusively.

Relates elastic#23
danielmitterdorfer committed May 16, 2017
1 parent 04b863b commit a49a2a0
Showing 6 changed files with 535 additions and 44 deletions.
50 changes: 27 additions & 23 deletions README.md
@@ -23,27 +23,32 @@ Now you can invoke night_rally regularly with the startup script `night_rally.sh`

#### Add an annotation

To add an annotation, use the admin tool. First, find the correct trial timestamp by issuing `python3 admin.py list races --environment=nightly`; you will need it in the commands below. Here are examples for common cases:

* Add an annotation for all charts for a specific nightly benchmark trial: `python3 admin.py add annotation --environment=nightly --trial-timestamp=20170502T220213Z --message="Just a test annotation"`
* Add an annotation for all charts of one track for a specific nightly benchmark trial: `python3 admin.py add annotation --environment=nightly --trial-timestamp=20170502T220213Z --track=geonames --message="Just a test annotation for geonames"`
* Add an annotation for a specific chart of one track for a specific nightly benchmark trial: `python3 admin.py add annotation --environment=nightly --trial-timestamp=20170502T220213Z --track=geonames --chart=io --message="Just a test annotation"`

For more details, please issue `python3 admin.py add annotation --help`.

**Note:** The admin tool also supports a dry-run mode for all commands that would change the data store. Just append `--dry-run`.

**Note:** The new annotation will show up immediately.
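
Under the hood, the admin tool stores each annotation as a document in the `rally-annotations` index. The document shape below mirrors what `admin.py` indexes; the values are illustrative:

```json
{
  "environment": "nightly",
  "trial-timestamp": "20170502T220213Z",
  "track": "geonames",
  "chart": "io",
  "message": "Just a test annotation"
}
```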

#### Remove an annotation

If you have made an error, you can also remove specific annotations by id.

1. Issue `python3 admin.py list annotations --environment=nightly` and find the right annotation. Note that only the 20 most recent annotations are shown; you can show more by specifying `--limit=NUMBER`.
2. Suppose the id of the annotation that we want to delete is `AVwM0jAA-dI09MVLDV39`. Then issue `python3 admin.py delete annotation --id=AVwM0jAA-dI09MVLDV39`.
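
The `--id` parameter also accepts multiple ids separated by commas, and `delete annotation` supports the same dry-run mode as `add annotation`. A combined invocation might look like this (the second id is illustrative):

```
python3 admin.py delete annotation --dry-run --id=AVwM0jAA-dI09MVLDV39,AVwM0jBB-dI09MVLDV40
```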

For more details, please issue `python3 admin.py delete annotation --help`.

**Note:** The admin tool also supports a dry-run mode for all commands that would change the data store. Just append `--dry-run`.

#### Add a new track

The following steps are necessary to add a new track:

1. Copy a directory in `external/pages` and adjust the names accordingly.
2. Adjust the menu structure in all other files (if this happens more often, we should think about using a template engine for that...)
@@ -54,14 +59,13 @@ If you're finished, please submit a PR. After the PR is merged, the new track wi

#### Run a release benchmark

Suppose we want to publish a new release benchmark of the Elasticsearch release `5.3.1` on our benchmark page. To do that, start a new [macrobenchmark build](https://elasticsearch-ci.elastic.co/view/All/job/elastic+elasticsearch+master+macrobenchmark-periodic/) with the following parameters:

1. Replace "5.3.0" with "5.3.1" in the `versions` array in each `index.html` in `external/pages`. Commit and push your changes (commit message convention: "Update comparison charts to 5.3.1")
2. On the benchmark machine, issue the following command:
* MODE: release
* RELEASE: 5.3.1
* TARGET_HOST: Just use the default value

The results will show up automatically as soon as the build is finished.

#### Run an ad-hoc benchmark

@@ -73,5 +77,5 @@ Suppose we want to publish the results of the commit hash `66202dc` in the Elast
2. On the benchmark machine, issue the following command:

```
night_rally.sh --target-host=target-551504.benchmark.hetzner-dc17.elasticnet.co:39200 --mode=adhoc --revision=66202dc --release="Lucene 7"
```
289 changes: 289 additions & 0 deletions admin.py
@@ -0,0 +1,289 @@
import os
import sys
import argparse
import client
# non-standard! requires setup.py!!
import tabulate


def list_races(es, args):
    limit = args.limit
    environment = args.environment
    track = args.track

    if args.track:
        print("Listing %d most recent races for track %s in environment %s.\n" % (limit, track, environment))
        query = {
            "query": {
                "bool": {
                    "filter": [
                        {
                            "term": {
                                "environment": environment
                            }
                        },
                        {
                            "term": {
                                "track": track
                            }
                        }
                    ]
                }
            }
        }
    else:
        print("Listing %d most recent races in environment %s.\n" % (limit, environment))
        query = {
            "query": {
                "term": {
                    "environment": environment
                }
            }
        }

    query["sort"] = [
        {
            "trial-timestamp": "desc"
        },
        {
            "track": "asc"
        },
        {
            "challenge": "asc"
        }
    ]

    result = es.search(index="rally-races-*", body=query, size=limit)
    races = []
    for hit in result["hits"]["hits"]:
        src = hit["_source"]
        races.append([src["trial-timestamp"], src["track"], src["challenge"], src["car"],
                      src["cluster"]["distribution-version"], src["user-tag"]])
    if races:
        print(tabulate.tabulate(races, headers=["Race Timestamp", "Track", "Challenge", "Car", "Version", "User Tag"]))
    else:
        print("No results")


def list_annotations(es, args):
    limit = args.limit
    environment = args.environment
    track = args.track
    if track:
        print("Listing %d most recent annotations in environment %s for track %s.\n" % (limit, environment, track))
        query = {
            "query": {
                "bool": {
                    "filter": [
                        {
                            "term": {
                                "environment": environment
                            }
                        },
                        {
                            "term": {
                                "track": track
                            }
                        }
                    ]
                }
            }
        }
    else:
        print("Listing %d most recent annotations in environment %s.\n" % (limit, environment))
        query = {
            "query": {
                "term": {
                    "environment": environment
                }
            }
        }
    query["sort"] = [
        {
            "trial-timestamp": "desc"
        },
        {
            "track": "asc"
        },
        {
            "chart": "asc"
        }
    ]

    result = es.search(index="rally-annotations", body=query, size=limit)
    annotations = []
    for hit in result["hits"]["hits"]:
        src = hit["_source"]
        annotations.append([hit["_id"], src["trial-timestamp"], src.get("track", ""), src.get("chart", ""), src["message"]])
    if annotations:
        print(tabulate.tabulate(annotations, headers=["Annotation Id", "Timestamp", "Track", "Chart", "Message"]))
    else:
        print("No results")


def add_annotation(es, args):
    environment = args.environment
    trial_timestamp = args.trial_timestamp
    track = args.track
    chart = args.chart
    message = args.message
    dry_run = args.dry_run

    if dry_run:
        print("Would add annotation with message [%s] for environment=[%s], trial timestamp=[%s], track=[%s], chart=[%s]" %
              (message, environment, trial_timestamp, track, chart))
    else:
        # create the annotations index with its mapping on first use
        if not es.indices.exists(index="rally-annotations"):
            with open("%s/resources/annotation-mapping.json" % os.path.dirname(os.path.realpath(__file__)), "rt") as f:
                body = f.read()
            es.indices.create(index="rally-annotations", body=body)
        es.index(index="rally-annotations", doc_type="type", body={
            "environment": environment,
            "trial-timestamp": trial_timestamp,
            "track": track,
            "chart": chart,
            "message": message
        })


def delete_annotation(es, args):
    import elasticsearch
    annotations = args.id.split(",")
    if args.dry_run:
        if len(annotations) == 1:
            print("Would delete annotation with id [%s]." % annotations[0])
        else:
            print("Would delete %d annotations: %s." % (len(annotations), annotations))
    else:
        for annotation_id in annotations:
            try:
                es.delete(index="rally-annotations", doc_type="type", id=annotation_id)
                print("Successfully deleted [%s]." % annotation_id)
            except elasticsearch.TransportError as e:
                if e.status_code == 404:
                    print("Did not find [%s]." % annotation_id)
                else:
                    raise


def arg_parser():
    parser = argparse.ArgumentParser(description="Admin tool for Elasticsearch benchmarks",
                                     formatter_class=argparse.RawDescriptionHelpFormatter)

    subparsers = parser.add_subparsers(
        title="subcommands",
        dest="subcommand",
        help="")

    # list races --limit=20
    # list annotations --limit=20
    list_parser = subparsers.add_parser("list", help="List configuration options")
    list_parser.add_argument(
        "configuration",
        metavar="configuration",
        help="What the admin tool should list. Possible values are: races, annotations",
        choices=["races", "annotations"])

    list_parser.add_argument(
        "--limit",
        help="Limit the number of search results (default: 20).",
        type=int,
        default=20,
    )
    list_parser.add_argument(
        "--environment",
        help="Show only records from this environment",
        required=True
    )
    list_parser.add_argument(
        "--track",
        help="Show only records from this track",
        default=None
    )

    # If no track is given, the annotation applies to all tracks. "chart" identifies the target graph;
    # if no chart is given, the annotation applies to all charts of the track.
    #
    # add [annotation] --environment=nightly --trial-timestamp --track --chart --message
    add_parser = subparsers.add_parser("add", help="Add records")
    add_parser.add_argument(
        "configuration",
        metavar="configuration",
        help="",
        choices=["annotation"])
    add_parser.add_argument(
        "--dry-run",
        help="Just show what would be done but do not apply the operation.",
        default=False,
        action="store_true"
    )
    add_parser.add_argument(
        "--environment",
        help="Environment (default: nightly)",
        default="nightly"
    )
    add_parser.add_argument(
        "--trial-timestamp",
        help="Trial timestamp"
    )
    add_parser.add_argument(
        "--track",
        help="Track. If none given, applies to all tracks",
        default=None
    )
    add_parser.add_argument(
        "--chart",
        help="Chart to target. If none given, applies to all charts.",
        choices=['query', 'script', 'stats', 'indexing', 'gc', 'index_times', 'merge_times', 'segment_count', 'segment_memory', 'io'],
        default=None
    )
    add_parser.add_argument(
        "--message",
        help="Annotation message",
        required=True
    )

    delete_parser = subparsers.add_parser("delete", help="Delete records")
    delete_parser.add_argument(
        "configuration",
        metavar="configuration",
        help="",
        choices=["annotation"])
    delete_parser.add_argument(
        "--dry-run",
        help="Just show what would be done but do not apply the operation.",
        default=False,
        action="store_true"
    )
    delete_parser.add_argument(
        "--id",
        help="Id of the annotation to delete. Separate multiple ids with a comma.",
        required=True
    )
    return parser


def main():
    parser = arg_parser()
    es = client.create_client()

    args = parser.parse_args()
    if args.subcommand == "list":
        if args.configuration == "races":
            list_races(es, args)
        elif args.configuration == "annotations":
            list_annotations(es, args)
        else:
            print("Do not know how to list [%s]" % args.configuration, file=sys.stderr)
            exit(1)
    elif args.subcommand == "add" and args.configuration == "annotation":
        add_annotation(es, args)
    elif args.subcommand == "delete" and args.configuration == "annotation":
        delete_annotation(es, args)
    else:
        parser.print_help(file=sys.stderr)
        exit(1)


if __name__ == '__main__':
    main()
31 changes: 31 additions & 0 deletions client.py
@@ -0,0 +1,31 @@
def create_client():
    import configparser
    import os
    # non-standard! requires setup.py!!
    import elasticsearch
    import certifi

    def load():
        config = configparser.ConfigParser(interpolation=configparser.ExtendedInterpolation())
        config.read("%s/resources/rally-template.ini" % os.path.dirname(os.path.realpath(__file__)))
        return config

    complete_cfg = load()
    cfg = complete_cfg["reporting"]
    if cfg["datastore.secure"] == "True":
        secure = True
    elif cfg["datastore.secure"] == "False":
        secure = False
    else:
        raise ValueError("Setting [datastore.secure] is neither [True] nor [False] but [%s]" % cfg["datastore.secure"])
    hosts = [
        {
            "host": cfg["datastore.host"],
            "port": cfg["datastore.port"],
            "use_ssl": secure
        }
    ]
    http_auth = (cfg["datastore.user"], cfg["datastore.password"]) if secure else None
    certs = certifi.where() if secure else None

    return elasticsearch.Elasticsearch(hosts=hosts, http_auth=http_auth, ca_certs=certs)
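
For reference, `create_client()` reads its connection settings from the `[reporting]` section of `resources/rally-template.ini`. A minimal sketch of that section, with placeholder values, might look like this:

```ini
[reporting]
datastore.secure = True
datastore.host = benchmarks.example.org
datastore.port = 9243
datastore.user = admin
datastore.password = changeme
```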
