-
Notifications
You must be signed in to change notification settings - Fork 83
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add buffersize argument to rdfind #178
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stretch goal:
I am thinking of what the optimal buffersize is. It has been 19 years since I wrote rdfind, and hardware has changed quite a bit since then.
It would be good to have data on
- a fast typical desktop (amd64 cpu, fast ssd)
- a smaller system (raspberry pi3, sd card and 32 bit?)
- a server (pretty similar to the desktop I assume, but storage is remote)
also, what is good for small files and what is good for large files?
Fileinfo.cc
Outdated
@@ -80,7 +81,7 @@ Fileinfo::fillwithbytes(enum readtobuffermode filltype, | |||
if (checksumtype != Checksum::checksumtypes::NOTSET) { | |||
Checksum chk(checksumtype); | |||
|
|||
char buffer[4096]; | |||
char buffer[buffersize]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a variable length array - not standard C++.
I think it makes sense to only allocate it once. This is C++17, so we can't use std::span.
Please make a std::vector outside the loop, sized to buffersize and pass it by reference into this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like having this on the stack because of ease and also if this becomes multi-threaded sometime somewhere in the future.
I atleast changed to std::vector, but c++ is not my stong side so please review carefully :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its kind of weird that the variable length array passes all the do_quality_checks.sh
when its not in the c++ standard :)
also, the title of the MR and the commit do not need to say it is rdfind, that is pretty implicit 😃 |
@@ -65,6 +65,10 @@ usage() | |||
"device and inode\n" | |||
<< " -checksum md5 |(sha1)| sha256 | sha512\n" | |||
<< " checksum type\n" | |||
<< " -buffersize N (N=4096,0<N<1000KB)\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kB
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Form my understanding kB means 1000Bytes and KB means 1024Bytes, and in this case Im referring to 1024Bytes.
But i guess just saying this can restart the vim/emacs war again :)
forgot: please add a test |
15c308f
to
3f6efc9
Compare
Most should be fixed |
3f6efc9
to
d3ea8a8
Compare
had some error in sh script. |
d3ea8a8
to
e22a83e
Compare
add the argument to change buffer size of checksum calculation, may increase speed on some checksum algorithms and disk types. Signed-off-by: Robert Marklund <[email protected]>
Signed-off-by: Robert Marklund <[email protected]>
Signed-off-by: Robert Marklund <[email protected]>
d91b21c
to
c0e9606
Compare
I removed "add shellcheck script and workflow", from this patchset and will move that to new PR that fixes all issues aswell |
Please review again to see if the changes are sufficient. |
I am sorry - I had made review comments but forgot to actually send them! On top of that, I missed I had to approve the CI runs. I worked a bit on this to get this over the finish line. Most of the changes are things you could not reasonably know. Such things take long time to discuss over the internet, had it been at work we would just had a discussion and get it done quickly. The differences:
I hope you are ok with the changes, I made them as fixups to your commits so you are still the author. Please let me know if you want something else! see the branch here: #180 |
closing, in favour of #180 |
add the argument to change buffer size of checksum calculation, may increase speed on some checksum algorithms and disk types.