Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Saving tournament report [Closes #249] #250

Merged
merged 8 commits into from
Apr 4, 2017

Conversation

keyur9
Copy link
Contributor

@keyur9 keyur9 commented Mar 8, 2017

Added parameters in compare parsers to enable saving tournament report in csv or md format.

compare_parser.add_argument(
        "--report-format",
        help="define the output format for the command line report (default: markdown).",
        choices=["markdown", "csv"],
        default="markdown")
compare_parser.add_argument(
        "--report-file",
        help="write the command line report also to the provided file",
        default="")

@danielmitterdorfer
Copy link
Member

Thanks for the PR @keyur9. I had a quick look. It's a good start but you also need to use these parameters, i.e. ComparisonReporter (in reporter.py) needs to write the report file (see https://github.com/elastic/rally/blob/master/esrally/reporter.py#L242 how it is done in the SummaryReporter that is used for normal reports).

@keyur9
Copy link
Contributor Author

keyur9 commented Mar 8, 2017

Thank you for the feedback @danielmitterdorfer. As suggested, have updated the ComparisonReporter (wrote the report file similar to how its done in the SummaryReporter) in reporter.py.

@danielmitterdorfer
Copy link
Member

Thanks for the follow-up. Could you please refactor it in a way that both reporters call just one implementation? I'd just implement write_single_report as a function and have then both reporter implementations call it.

@keyur9
Copy link
Contributor Author

keyur9 commented Mar 9, 2017

Thank you for the quick response. Sure, I'll work on refactoring and complete it before end of the week.

@keyur9
Copy link
Contributor Author

keyur9 commented Mar 12, 2017

Hello @danielmitterdorfer, as suggested I've implemented write_single_report as a function and call the same function from both reporter. I'd appreciate your feedback / suggestions if there is any way I could refactor the code.

Copy link
Member

@danielmitterdorfer danielmitterdorfer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for iterating @keyur9. I left a couple of comments. Can you please also run your changes so it once shows a summary report and once a comparison report so you can see that both implementations still work?

@@ -34,6 +34,26 @@ def print_internal(message):
def print_header(message):
print_internal(console.format.bold(message))

def write_single_report(self, report_file, headers, data, write_header=True, show_also_in_console=True):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self is an (implicit) reference to the current object. As you refactored it to be a function you need to remove self here and also move all related helper functions (like format_as_markdown, format_as_csv).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: There should be 2 blank lines.

@@ -34,6 +34,26 @@ def print_internal(message):
def print_header(message):
print_internal(console.format.bold(message))

def write_single_report(self, report_file, headers, data, write_header=True, show_also_in_console=True):
report_format = self._config.opts("reporting", "format")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to pass the report format as a parameter instead of using the config here.

# ensure that the parent folder already exists when we try to write the file...
rio.ensure_dir(rio.dirname(normalized_report_file))
with open(normalized_report_file, mode="a+", encoding="UTF-8") as f:
f.writelines(formatter(headers, data, write_header))

class Stats:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: There should be 2 blank lines.

@keyur9
Copy link
Contributor Author

keyur9 commented Mar 15, 2017

Hello @danielmitterdorfer , Thank you for all the feedback. I've fixed all those issue and both implementations are working as expected. However, I faced one issue while generating Tournament report file. The output of difference between baseline and contender is shown in color in console and the diff method returns the difference in color. But when writing the color output in file it is getting converted to something like this \[31;1m+1.49723\[0m. So, I've changed the code that will not print the colored output in console as it will give appropriate value in report file. Would love to hear your thoughts on it.
Attached output Console colored: ON and Console colored: OFF

@danielmitterdorfer
Copy link
Member

danielmitterdorfer commented Mar 27, 2017

@keyur9 Thanks for the update. I'd really love to keep the colors in the command line output. How about you generate the report twice, once for console consumption (with the ANSI escape sequence) and once for file output (without the ANSI escape sequences)? Generating the report is quite fast and I wouldn't bother doing it twice.

Something like this:

class ComparisonReporter:
    def report(self, r1, r2):
        # ...
        metric_table_plain = self.metrics_table(baseline_stats, contender_stats, plain=True)
        metric_table_rich = self.metrics_table(baseline_stats, contender_stats, plain=False)
        # writes metric_table_rich to console, writes metric_table_plain to file
        self.write_report(metric_table_plain, metric_table_rich)

    def metrics_table(self, baseline_stats, contender_stats, plain):
        self.plain=plain
        metrics_table = []
        # ...
        return metrics_table

    def diff(self, baseline, contender, treat_increase_as_improvement, formatter=lambda x: x):
        diff = formatter(contender - baseline)
        if self.plain:
            color_greater = lambda x: x
            color_smaller = lambda x: x
            color_neutral = lambda x: x
        elif treat_increase_as_improvement:
            color_greater = console.format.green
            color_smaller = console.format.red
            color_neutral = console.format.neutral
        else:
            color_greater = console.format.red
            color_smaller = console.format.green
            color_neutral = console.format.neutral

        if diff > 0:
            return color_greater("+%.5f" % diff)
        elif diff < 0:
            return color_smaller("%.5f" % diff)
        else:
            # tabulate needs this to align all values correctly
            return color_neutral("%.5f" % diff)

You just need to add a new parameter to the function that actually writes the report data and write metric_table_plain to the output file and metric_table_rich to console. For the summary reporter, you can just provide the same metric table twice (as there are no ANSI escape sequences). Apart from that your change looks good. What do you think?

@keyur9
Copy link
Contributor Author

keyur9 commented Mar 30, 2017

Hello @danielmitterdorfer, Thanks for the feedback. Generating the report twice didn't took long and have made the changes. Now we would be able to see the console output in color and save the tournament report in md and csv file.

@danielmitterdorfer
Copy link
Member

Thanks for following-up @keyur9! I'll try to review it later today.

@danielmitterdorfer
Copy link
Member

I did a few tests and unfortunately, a few of them failed. Can you please have a look @keyur9?

Command:

esrally --distribution-version=5.3.0 --test-mode --report-format=csv --laps=2 --report-file=/Users/daniel/Desktop/report.csv

Trace:

Traceback (most recent call last):
  File "/Users/daniel/Projects/rally/esrally/racecontrol.py", line 303, in run
    pipeline(cfg)
  File "/Users/daniel/Projects/rally/esrally/racecontrol.py", line 42, in __call__
    self.target(cfg)
  File "/Users/daniel/Projects/rally/esrally/racecontrol.py", line 244, in from_distribution
    return race(Benchmark(cfg, distribution=True))
  File "/Users/daniel/Projects/rally/esrally/racecontrol.py", line 216, in race
    benchmark.teardown(cancelled)
  File "/Users/daniel/Projects/rally/esrally/racecontrol.py", line 160, in teardown
    reporter.summarize(self.race_store, self.metrics_store, self.cfg, self.track)
  File "/Users/daniel/Projects/rally/esrally/reporter.py", line 15, in summarize
    SummaryReporter(race_store, metrics_store, cfg, lap).report(track)
  File "/Users/daniel/Projects/rally/esrally/reporter.py", line 267, in report
    self.write_report(metrics_table, meta_info_table)
  File "/Users/daniel/Projects/rally/esrally/reporter.py", line 276, in write_report
    write_single_report("%s.meta" % report_file, headers=["Name", "Value"], data=meta_info_table, show_also_in_console=False)
TypeError: write_single_report() missing 2 required positional arguments: 'report_format' and 'cwd'

Command (obviously you need to choose two different timestamp depending on the output of esrally list races):

esrally compare --baseline=20170331T081942Z --contender=20170331T082445Z
  • Expected result: The report is printed once on the console using colors. No file is written.
  • Actual result: The report is printed twice: Once without colors, once with colors

Command:

esrally compare --baseline=20170331T081942Z --contender=20170331T082445Z --report-format=csv --report-file=/Users/daniel/Desktop/comparison.csv
  • Expected result: The report is printed once on the console using colors. The file /Users/daniel/Desktop/comparison.csv is written and contains the same data but without any ANSI escape sequences.
  • Actual result: The report is printed twice: Once without colors, once with colors. Also the file contents are written twice, once with ANSI escape sequences and once without.

@keyur9
Copy link
Contributor Author

keyur9 commented Mar 31, 2017

Hello @danielmitterdorfer. Thank you for the feedback. I have fixed the issues and now report is printed once in console and written once to file without ANSI escape sequence.

Now the output for the command -

esrally compare --baseline=20170331T081942Z --contender=20170331T082445Z
  • Expected result: The report is printed once on the console using colors. No file is written.
  • Actual result: The report is printed once on the console using colors. No file is written.
esrally compare --baseline=20170331T081942Z --contender=20170331T082445Z --report-format=csv --report-file=/Users/daniel/Desktop/comparison.csv
  • Expected result: The report is printed once on the console using colors. The file /Users/daniel/Desktop/comparison.csv is written and contains the same data but without any ANSI escape sequences.
  • Actual result: The report is printed once on the console using colors. The file /Users/daniel/Desktop/comparison.csv is written and contains the same data but without any ANSI escape sequences.

Please let me know if you're fine with changes.

@danielmitterdorfer
Copy link
Member

danielmitterdorfer commented Apr 4, 2017

@keyur9 We're almost there. The only test case that failed for me was:

esrally --distribution-version=5.3.0 --test-mode --report-format=csv --laps=2 --report-file=/Users/daniel/Desktop/report.csv

Rally writes not only the actual report but also a second file containing the git commit hash of the Elasticsearch version that it has benchmarked. When Rally attempts to write it with your modified version, we get (trace is shortened):

  File "/Users/daniel/Projects/rally/esrally/reporter.py", line 276, in write_report
    write_single_report("%s.meta" % report_file, headers=["Name", "Value"], data=meta_info_table, data_rich = meta_info_table, show_also_in_console=False)
TypeError: write_single_report() got an unexpected keyword argument 'data'

The keyword argument is now data_plain instead of data and you also forgot to add the new arguments report_format and cwd. If you change line 276 instead to

write_single_report("%s.meta" % report_file, report_format, cwd, headers=["Name", "Value"], data_plain=meta_info_table, data_rich=meta_info_table, show_also_in_console=False)

everything works fine.

I think this is the last change to get this PR in.

@keyur9
Copy link
Contributor Author

keyur9 commented Apr 4, 2017

Hello @danielmitterdorfer, Thank you for the feedback. I added both the missing parameters and changed the keyword to data_plain.

Please let me know if you're fine with those changes.

Copy link
Member

@danielmitterdorfer danielmitterdorfer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for following up @keyur9! The changes look good to me. I'll merge them soon. The feature will be released with the next release 0.5.3.

@danielmitterdorfer danielmitterdorfer merged commit fcf066d into elastic:master Apr 4, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants