-
Notifications
You must be signed in to change notification settings - Fork 7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[proto] Small optim for perspective op on images #6907
[proto] Small optim for perspective op on images #6907
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @vfdev-5 just 2 questions. Looks good overall.
base_grid[..., 1].copy_(y_grid) | ||
base_grid[..., 2].fill_(1) | ||
|
||
rescaled_theta1 = theta1.transpose(1, 2) / torch.tensor([0.5 * ow, 0.5 * oh], dtype=dtype, device=device) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we do in-place division?
numer_points = torch.matmul(points, theta1.T) | ||
denom_points = torch.matmul(points, theta2.T) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to understand, previously when we measured this specific change for bboxes it didn't improve the speed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I measured all changes together: concat+single matmul + inplace + aminmax
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, it improved the speed for bounding boxes, but the PR didn't measure the impact on images. Since bounding box tensors are usually a lot smaller than images, the perf regression for images was not noticed there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There were no perf regression at all as it was using previous implementation. For bboxes shape=(1000, 4)
"concat+single matmul" only trick does not bring any speed up on cpu. While working on images (this PR) I see that same trick brings even a slowdown.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK if you measured it now and you know it doesn't slow us down it's fine by me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, feel free to merge after testing the in-place division.
Summary: * [proto] small optim for perspective op on images, reverted concat trick on bboxes * revert unrelated changes * PR review updates * PR review change Reviewed By: NicolasHug Differential Revision: D41265184 fbshipit-source-id: 12073a164180b2ed392dd455106f6411bab9a317
cc @datumbox @bjuncek @pmeier