Commit
- Add a few more test cases.
- Avoid memory allocations in the benchmark. This ensures that we are benchmarking just the copy / transpose and not memory allocation.
- Check the outputs once after the final iteration of each sub-benchmark. This makes it quicker to spot when an optimization produces wrong results.
- Add reference transpose implementation comparison to benchmark. The reference transpose roughly matches the non-blocked copy code path in rten-tensor. It thus shows the effect of the blocking copy optimizations.
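The benchmark structure described in this commit message can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the actual rten-tensor benchmark: the function names `transpose_reference`, `transpose_blocked`, and `bench_transpose`, the row-major `f32` layout, and the block size of 32 are all assumptions made for the example. It preallocates the output buffers outside the timed loops, compares a naive transpose against a blocked one, and checks the outputs once after the final iteration.

```rust
use std::time::{Duration, Instant};

/// Naive transpose of a row-major `rows x cols` matrix into `dst`.
/// Hypothetical stand-in for a non-blocked "reference" copy path.
fn transpose_reference(src: &[f32], dst: &mut [f32], rows: usize, cols: usize) {
    assert_eq!(src.len(), rows * cols);
    assert_eq!(dst.len(), rows * cols);
    for r in 0..rows {
        for c in 0..cols {
            dst[c * rows + r] = src[r * cols + c];
        }
    }
}

/// Blocked transpose: walks the matrix in `block x block` tiles so that
/// reads and writes within a tile stay close together in memory.
fn transpose_blocked(src: &[f32], dst: &mut [f32], rows: usize, cols: usize, block: usize) {
    assert!(block > 0);
    assert_eq!(src.len(), rows * cols);
    assert_eq!(dst.len(), rows * cols);
    for r0 in (0..rows).step_by(block) {
        for c0 in (0..cols).step_by(block) {
            // Clamp the tile at the matrix edges.
            let r1 = (r0 + block).min(rows);
            let c1 = (c0 + block).min(cols);
            for r in r0..r1 {
                for c in c0..c1 {
                    dst[c * rows + r] = src[r * cols + c];
                }
            }
        }
    }
}

/// Times `iters` runs of each implementation and returns
/// `(reference_time, blocked_time)`.
fn bench_transpose(rows: usize, cols: usize, iters: usize) -> (Duration, Duration) {
    let src: Vec<f32> = (0..rows * cols).map(|i| i as f32).collect();

    // Allocate outputs once, outside the timed loops, so that only the
    // copy / transpose is measured.
    let mut ref_out = vec![0.0f32; rows * cols];
    let mut blocked_out = vec![0.0f32; rows * cols];

    let start = Instant::now();
    for _ in 0..iters {
        transpose_reference(&src, &mut ref_out, rows, cols);
    }
    let ref_time = start.elapsed();

    let start = Instant::now();
    for _ in 0..iters {
        transpose_blocked(&src, &mut blocked_out, rows, cols, 32);
    }
    let blocked_time = start.elapsed();

    // Check the outputs once, after the final iteration.
    assert_eq!(ref_out, blocked_out, "blocked transpose produced wrong output");

    (ref_time, blocked_time)
}
```

Calling `bench_transpose(1024, 1024, 100)` from a `main` function and printing both durations would give a rough comparison. A real benchmark would typically also guard against the optimizer eliding work, for example with `std::hint::black_box`.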
1 parent ea55853, commit e82cf19
Showing 1 changed file with 61 additions and 10 deletions.