Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Add allocation free matrix multiplication and solving #4595

Closed
wants to merge 1 commit into from

Conversation

andreasnoack
Copy link
Member

Here are the allocation free versions of the dense matrix algebra. I have tried to make all the versions use the BLAS way of updating. It is a major rewrite of matmul so this should wait for after 2.0, I think. The generic_matmatmul! is a little fancy so I am not sure I got it right, so please review it.

This was motivated by some work on iterative methods for solving linear systems where it was useful to have a common name and API for the allocation free multiplication. I hope that find an API such that you are free to define a type and A_mul_B! and then you can start solving linear systems and eigensystems efficiently.

@timholy
Copy link
Member

timholy commented Oct 20, 2013

I'm a little bit confused here...there's a lot of work on allocation-free multiplication, but we already had that. What's different? (What do you mean by "the BLAS way of updating"?)

@andreasnoack
Copy link
Member Author

The idea was to have functions of the type y<-αAx+βy which is what I meant with BLAS type updating. That is the major change. Besides that I tried do a little cleanup in the code.

@dmbates
Copy link
Member

dmbates commented Oct 21, 2013

@andreasnoackjensen There is a problem in the logic of using y <- αAx+βy for general cases. The result is incorrect if A is not square. I just encountered a situation using

Version 0.2.0-rc1+81
"Commit 7f498f5 2013-10-21 04:26:31 UTC"

where the result of multiplying a 40 by 60 sparse matrix by a vector of length 60 was a vector of length 60, not 40 as it should be.

You will need to back off the y <- αAx+βy form for non-square A

I would say this is an urgent fix as the current version produces incorrect answers. Let me know if I can help.

@StefanKarpinski
Copy link
Member

@dmbates, is this just in the pull request or on master? Do you have the exact code that produced the error?

@andreasnoack
Copy link
Member Author

I don't think we need to back off y <- αAx+βy. The problem is that for * I determine the size of y from x which is incorrect. We should use A.m. I'll fix it.

@timholy
Copy link
Member

timholy commented Oct 21, 2013

I agree that there isn't a fundamental problem.

I recognize it's like axpy, but does LAPACK actually implement y <- αAx+βy? Or are you implementing it in Julia?

@andreasnoack
Copy link
Member Author

I have psuhed a fix to master.

@timholy BLAS has xgemv and xgemm which do y <- αAx+βy and C<-αAB+βC and they are called for all the BlasFloat cases. For the generic_matxxxmul! I have tried to mimic that behaviour.

@timholy
Copy link
Member

timholy commented Oct 21, 2013

Didn't know that, thanks for explaining. This makes more sense to me now. I was worried that we were basically forcing extra operations on users in circumstances where they might most often choose α=1 and β=0. (Not that it would matter greatly since you have an O(N^2) or O(N^3) operation which will dominate anyway.)

@dmbates
Copy link
Member

dmbates commented Oct 21, 2013

@StefanKarpinski I should have made it clear that the problem was on master but @andreasnoackjensen has fixed it now.

@dmbates
Copy link
Member

dmbates commented Oct 21, 2013

@andreasnoackjensen I suggest changing the 1 and 0 in

(*){TA,S,Tx}(A::SparseMatrixCSC{TA,S}, x::AbstractVector{Tx}) = A_mul_B!(1, A, x, 0, zeros(promote_type(TA,Tx), A.m))

to one(Tx) and zero(Tx). The compiler may be smart enough to promote them from Int values but it doesn't hurt to generate the correct type if you already know what it should be.

@andreasnoack
Copy link
Member Author

@timholy For the BlasFloat case, the α=1 and β=0 have always been there but hidden. For the other types, the change adds some extra operations, but I think it is worth it because in many situations you can actually avoid a temporary array and, as you mention, for the situations where α=1 and β=0 the costs are not big.

@dmbates I have now changed the functions to use one and zero instead of 0 and 1. There were not any tests for sparse-dense multiplication which explains why my error got through so I have added a few tests.

@andreasnoack
Copy link
Member Author

Just to clarify the 2.0 considerations related to this request. This should not break anything exported or documented as far as I can see, but it breaks the API for allocation free multiplication which is not exported and documented.

@ViralBShah
Copy link
Member

On a different note, I really do not like the proliferation of these ! versions of the mul and div operators. It just feels inelegant, and this will get better when @JeffBezanson improves allocation and GC. How would you feel about moving all these operators into a package - say NumericExtensions?

@StefanKarpinski
Copy link
Member

That doesn't really make much sense. The non-mutating versions need to exist in Base since A.' * B translates into At_mul_B(A,B) and it only makes sense to define At_mul_B in terms of At_mul_B! as in

At_mul_B(A,B) = At_mul_B!(allocate_output(),A,B)

@andreasnoack
Copy link
Member Author

Most of the code would need to be copied to the package. I have no idea about the things @JeffBezanson is planning on allocation and gc, but my guess is that the function bodies would need to be something like this request. Then the header could be changed to something more elegant than A_mul_B! later. But again, my imagination might be limited so if this functionality is better done in another way, I am totally fine with that.

@andreasnoack andreasnoack deleted the anj/matmul branch November 19, 2013 05:55
@andreasnoack andreasnoack restored the anj/matmul branch November 27, 2013 16:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants