Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Add --stats flag and implement basic stat tracking (closes #411) #799

Closed
wants to merge 1 commit into from
Closed

Conversation

balajisivaraman
Copy link
Contributor

@balajisivaraman balajisivaraman commented Feb 13, 2018

Creating this PR for review purposes and as a WIP thread for adding the --stats flag to ripgrep.

  • Implement tracking for basic stats (no. of matches, files with matches, files searched, and time taken)
  • Ensure --stats is ignored when --files, --files-with-matches or --files-without-match are passed
  • Implement tracking of bytes searched for --no-mmap searches using the counting_reader trick. This will require changing the worker's run method to return an Optional bytes_searched attribute along with the existing match_count. We also need to ensure this happens only when --stats is passed.

Other thoughts:

  • Have used the default Duration implementation from std as did not want to pull in any other dependencies such as time or chrono, and this served the purpose very well.

  • Have defaulted the time to always display in seconds since this is what ag does as well. rg also prints only 3 numbers after the decimal point, as opposed to ag's 6.

    The only reasoning behind this is that Rust only allows us to control the digits printed after the decimal point, not overall. If we run a very long search, we don't want to print something like 2751.648931 seconds. So have chosen the nicer option of 3 digits after the decimal point.

    I think ag prints 6 digits overall, with the number of digits printed after the decimal point depending on the overall width of the string. (This is the behaviour I've observed; have not verified this with the code.)

@balajisivaraman
Copy link
Contributor Author

I'll close this PR as the current implementation is simply wrong. When I looked at the code for this originally, I overlooked the fact that the searchers return matching line count and not the individual match count, which is what we want for the --stats flag. My apologies for that.

I'll work on the proper implementation, which I expect to be a bit more involving than this one. In the meantime, I don't want this clogging up the PR space here.

@balajisivaraman balajisivaraman deleted the fix_411 branch February 20, 2018 04:27
Copy link
Owner

@BurntSushi BurntSushi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@balajisivaraman Thanks for working on this! I know you closed this, but you might consider that it is possible to get this merged without much additional work. For example, if you change "match count" to "matched line count" (or similar), then that would be correct and still useful IMO. :) I defer to you! Regardless, I left some small amount of feedback.

the time taken for the entire search to complete.

Note that this flag has no effect if --files, --files-with-matches or
--files-without-match is passed.");
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you put the "); on its own line please?


let arg = RGArg::switch("stats")
.help(SHORT).long_help(LONG);

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this blank line please.

@@ -373,6 +395,15 @@ fn eprint_nothing_searched() {
Try running again with --debug.");
}

fn print_stats(match_count: u64, paths_searched: u64, paths_matched: u64, time_elapsed: Duration) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All lines of source code in ripgrep should be wrapped to 79 columns inclusive (excepting URLs or other things that are inconvenient to wrap). Please see examples of other function declarations for formatting tips.

@balajisivaraman balajisivaraman restored the fix_411 branch February 20, 2018 14:16
@balajisivaraman
Copy link
Contributor Author

@BurntSushi, I wouldn't mind getting this PR merged at all. :-)

The only reason why I closed this was I'm going to begin work on the full implementation of it with bytes searched and individual count, so I thought we wouldn't want this anyway.

But seeing as that is more involving change with multiple moving parts, it might take some time. So it makes sense to go ahead with this partial implementation in the meanwhile.

I'll update the PR to reflect your comments. Thanks! :-)

@BurntSushi
Copy link
Owner

@balajisivaraman This was great work, thanks so much! I merged this in 00520b3 with no real changes other than fixing up style (lines should be limited to 79 columns inclusive) and resolving conflicts. I also tweaked the wording of output slightly to indicate that total matches actually corresponds to the total number of matched lines. (So we leave the door open to emit another statistic that reports the total matches.)

@balajisivaraman balajisivaraman deleted the fix_411 branch March 10, 2018 16:05
@balajisivaraman
Copy link
Contributor Author

@BurntSushi, Thank you for merging this. And my apologies for not updating this PR. I was waiting for the other PR (match_line_count) to get merged so I could update it properly and completely forgot about it. Thanks again. 😄

@BurntSushi
Copy link
Owner

BurntSushi commented Mar 10, 2018

@balajisivaraman No worries! Sometimes my work style isn't amenable to waiting too long, and is susceptible to short bursts of maintenance activity. :-) Context switches are extremely expensive for me, so I figured that I would just amortize them by processing all of your PRs, which all were somewhat related!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants