Replace instances of np.trace(np.dot(... when computing the trace inner product or Frobenius norm #430
Comments
This hurts me a bit in my soul. I'm generally supportive of this. I used `einsum` a bunch to accelerate traces of matrix products when re-implementing the germ selection algorithm, and indeed found it to be much faster.

One thing I did find back then when profiling: even though you can do the trace of the product in one fell swoop with `einsum`, it was faster to stop a bit short, compute the diagonal entries, and then sum those with `np.sum` (no idea why this was/is the case). You can see an example of where I used that pattern here: pyGSTi/pygsti/algorithms/germselection.py Line 3473 in ee21585
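For concreteness, here is a minimal sketch of the three patterns being compared; the matrices, sizes, and variable names are illustrative placeholders, not code from pyGSTi:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((500, 500))
B = rng.standard_normal((500, 500))

# Naive pattern: forms the full 500 x 500 product just to read off its trace.
t_naive = np.trace(np.dot(A, B))

# One-shot einsum: contracts straight down to the scalar trace(A @ B).
t_einsum = np.einsum('ij,ji->', A, B)

# "Stop short" variant: compute only the diagonal of A @ B, then sum it.
t_diag_sum = np.einsum('ij,ji->i', A, B).sum()

assert np.allclose(t_naive, t_einsum) and np.allclose(t_naive, t_diag_sum)
```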
Another observation I made at that time: `einsum` was much faster at performing this trace/diagonal calculation subroutine when acting on products of inputs of the form A and A^T than on general pairs of matrices A and B. If I recall correctly, I concluded this was related to some optimization that must have been happening under the hood due to the fact that A^T was a view into A (you're better equipped to understand exactly what would be happening here, but my guess at the time was better cache behavior). I concluded it was view-related in part by checking what happened if I instead passed in a copy of A^T, and found that the performance fell back in line with general pairs of matrices. I mention all of this because the trace(A @ A^T) pattern appears in a bunch of the cost functions we use in experiment design (and probably other places), due to its relation to something called A-optimal experiment design.

One last incoherent nugget extracted from the deeper confines of my memory: I have found that the aforementioned performance boost from using `einsum` for diagonals/traces of products of matrices of the form A and A^T is big enough that it in some instances justified doing some otherwise weird-looking additional calculations. For example, in germ selection there was one particular hotspot in the code related to evaluating an expression of the form trace(A @ C @ A^T), with A having shape (N, r), C having shape (r, r), and N >> r. You can do this with an `einsum` one-liner, but I found it significantly advantageous to first take a Cholesky decomposition (even with the additional cost that imposed) and then fold the square roots of C into A before doing the `einsum` (really just one factor, since we know we'll just be taking a transpose). Thanks for coming to my TED talk.
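To make that last trick concrete, here is a hedged sketch of folding a Cholesky factor of C into A before the `einsum`. It assumes C is symmetric positive definite (which Cholesky requires), and all names and shapes are illustrative rather than taken from germ selection:

```python
import numpy as np

rng = np.random.default_rng(0)
N, r = 10_000, 20
A = rng.standard_normal((N, r))
# Build a symmetric positive definite C so the Cholesky factorization exists.
M = rng.standard_normal((r, r))
C = M @ M.T + r * np.eye(r)

# Direct one-liner: trace(A @ C @ A.T) without forming the N x N product.
t_direct = np.einsum('ij,jk,ik->', A, C, A)

# Cholesky variant: fold one factor of C = L @ L.T into A, then take a
# squared Frobenius norm, since trace(A @ C @ A.T) = ||A @ L||_F^2.
L = np.linalg.cholesky(C)
AL = A @ L
t_chol = np.einsum('ij,ij->', AL, AL)

assert np.allclose(t_direct, t_chol)
```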
Whoops, just read your PR related to this. vdot is cool too.
I'm generally for this, and would suggest we add comments to the affected lines where it seems appropriate, since while `einsum` and other options are faster, they tend to be less readable. One of the frustrations of Python is that in many circumstances you can have either speed or readability, but not both at the same time :)
@coreyostrove and @enielse, I like vdot because it's fast, consistent (about how it always handles conjugation in the first argument), and readable (once you know what it's useful for).

@coreyostrove, regarding your observation about `einsum` on pairs of the form A and A^T: there is indeed something going on with A^T being a view into A rather than a copy. As for the trace(A @ C @ A^T) hotspot: the benefits of Cholesky here don't surprise me! It's cool that you figured this out :)
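For reference, a quick illustration of the `np.vdot` conjugation convention mentioned above, using random placeholder matrices (this is a sketch, not pyGSTi code):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 50)) + 1j * rng.standard_normal((50, 50))
B = rng.standard_normal((50, 50)) + 1j * rng.standard_normal((50, 50))

# np.vdot flattens both arguments and conjugates the FIRST one, so for
# matrices it returns trace(A^dagger @ B) without forming any matrix product.
assert np.allclose(np.vdot(A, B), np.trace(A.conj().T @ B))
```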
Closed with merged PR :)
(On the latest commit of develop, at time of writing. pyGSTi 0.9.12.)
The standard inner product on the space of real $m \times n$ matrices is $(A, B) \mapsto \text{trace}(A^T B)$. There are two candidates for the standard inner product on complex matrices: either $(A, B) \mapsto \text{trace}(A^\dagger B)$ or $(A, B) \mapsto \text{trace}(B^\dagger A)$, which differ by a complex conjugate.
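A one-line check of that conjugation relationship, using random placeholder matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

# The two complex conventions agree up to a complex conjugate:
# trace(B^dagger @ A) == conj(trace(A^dagger @ B)).
assert np.allclose(np.trace(B.conj().T @ A),
                   np.conj(np.trace(A.conj().T @ B)))
```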
Several places in pyGSTi require computing these inner products, and explicitly compute a matrix-matrix product before taking a trace. Some places even do this as a way to compute (squared) Frobenius norms. The easy way to find these instances is to search for the text `_np.trace(_np.dot(` (I prepend with underscores since that's pyGSTi's import convention, but there are hits without the underscores as well).

We should replace these instances with a more efficient and consistent pattern. One way to do this is with `np.einsum`, along with a manual conjugation of one of the matrices. I suggest `np.vdot`, since that will handle conjugation consistently for us (see the sketch below).

Thoughts, @coreyostrove, @sserita?
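A hedged before/after sketch of the proposed replacement; the matrices here are random placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 100)) + 1j * rng.standard_normal((100, 100))
B = rng.standard_normal((100, 100)) + 1j * rng.standard_normal((100, 100))

# Before: an explicit O(n^3) matrix product, just to read off the trace.
ip_old = np.trace(np.dot(A.conj().T, B))
# After: O(n^2), with vdot handling the conjugation convention for us.
ip_new = np.vdot(A, B)
assert np.allclose(ip_old, ip_new)

# The same replacement handles squared Frobenius norms.
fro_old = np.trace(np.dot(A.conj().T, A)).real
fro_new = np.vdot(A, A).real  # equals np.linalg.norm(A) ** 2
assert np.allclose(fro_old, fro_new)
```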