-
Notifications
You must be signed in to change notification settings - Fork 407
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RandomGeoSampler: fix performance regression #1968
RandomGeoSampler: fix performance regression #1968
Conversation
height = (bounds.maxy - bounds.miny - t_size[0]) // res | ||
# May be negative if bounding box is smaller than patch size | ||
width = (bounds.maxx - bounds.minx - t_size[1]) / res | ||
height = (bounds.maxy - bounds.miny - t_size[0]) / res |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the width/height in pixel units. It no longer needs to be an integer, float is fine too. We cast to integer elsewhere.
|
||
minx = bounds.minx | ||
miny = bounds.miny | ||
|
||
# random.randrange crashes for inputs <= 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a dead comment, we no longer use random.randrange
and the input can no longer be negative
|
||
minx = bounds.minx | ||
miny = bounds.miny | ||
|
||
# random.randrange crashes for inputs <= 0 | ||
if width > 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No longer need to guard for negative numbers, it doesn't matter if the sample starts outside the bounds of the image for bounding boxes smaller than the patch size.
if height > 0: | ||
miny += torch.rand(1).item() * height * res | ||
# Use an integer multiple of res to avoid resampling | ||
minx += int(torch.rand(1).item() * width) * res |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the only real important line
Evaluation using #1972:
A 60% speedup for RandomGeoSampler when data is preprocessed (same CRS, res, TAP, COG). |
Fixes a bug I introduced all the way back in #477. Thanks @yichiac for noticing this!
We want to try to sample from a pixel-aligned grid whenever possible to avoid resampling in the average case. With the changes in this PR, we should revert back to the behavior we documented in our paper, where preprocessing greatly improves sampling performance. GridGeoSampler was not affected by this bug.
I still want to create an I/O benchmarking script similar to what we used in our original paper to ensure that this indeed works as advertised, so will keep as a draft for now.