Bug in pinv? #214

sglyon · 2015-05-14T16:46:01Z

Suppose we had this matrix:

S = [494.1013106366904 46142.52662766732 13.948592561126539 -0.5761953808074929
 46142.52662766732 4.592191854163587e6 1176.2963747371257 -42.84536709656223
 13.948592561126539 1176.2963747371257 0.4603766442896071 -0.02261134866942
 -0.5761953808074929 -42.84536709656223 -0.02261134866942 0.001308271770670055]

Then consider the following code from v0.4 (master branch, 5-13-15):

julia> S = [494.1013106366904 46142.52662766732 13.948592561126539 -0.5761953808074929
        46142.52662766732 4.592191854163587e6 1176.2963747371257 -42.84536709656223
        13.948592561126539 1176.2963747371257 0.4603766442896071 -0.02261134866942
        -0.5761953808074929 -42.84536709656223 -0.02261134866942 0.001308271770670055]
4x4 Array{Float64,2}:
   494.101     46142.5          13.9486      -0.576195
 46142.5           4.59219e6  1176.3        -42.8454
    13.9486     1176.3           0.460377    -0.0226113
    -0.576195    -42.8454       -0.0226113    0.00130827

julia> versioninfo()
Julia Version 0.4.0-dev+4817
Commit c790369* (2015-05-13 14:00 UTC)
Platform Info:
  System: Darwin (x86_64-apple-darwin14.3.0)
  CPU: Intel(R) Core(TM) i7-3635QM CPU @ 2.40GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Sandybridge)
  LAPACK: libopenblas
  LIBM: libopenlibm
  LLVM: libLLVM-3.3

julia> pinv(S)
4x4 Array{Float64,2}:
    5.27356    -0.0261815     -138.437      -927.494
   -0.0261815   0.000132159      0.673668      4.44041
 -138.437       0.673668      3733.89      25625.4
 -927.494       4.44041      25625.4           1.80587e5

Now on the latest 0.3.8 release:

julia> S = [494.1013106366904 46142.52662766732 13.948592561126539 -0.5761953808074929
        46142.52662766732 4.592191854163587e6 1176.2963747371257 -42.84536709656223
        13.948592561126539 1176.2963747371257 0.4603766442896071 -0.02261134866942
        -0.5761953808074929 -42.84536709656223 -0.02261134866942 0.001308271770670055]
4x4 Array{Float64,2}:
   494.101     46142.5          13.9486      -0.576195
 46142.5           4.59219e6  1176.3        -42.8454
    13.9486     1176.3           0.460377    -0.0226113
    -0.576195    -42.8454       -0.0226113    0.00130827

julia> versioninfo()
Julia Version 0.3.8
Commit 79599ad* (2015-04-30 23:40 UTC)
Platform Info:
  System: Darwin (x86_64-apple-darwin14.0.0)
  CPU: Intel(R) Core(TM) i7-3635QM CPU @ 2.40GHz
  WORD_SIZE: 64
  BLAS: libopenblas (DYNAMIC_ARCH NO_AFFINITY Sandybridge)
  LAPACK: libopenblas
  LIBM: libopenlibm
  LLVM: libLLVM-3.3

julia> pinv(S)
4x4 Array{Float64,2}:
  0.500222    -0.00332121   -6.62205      0.94274
 -0.00332121   2.26771e-5    0.0423813   -0.00603419
 -6.62205      0.0423813    93.8221     -13.3545
  0.94274     -0.00603419  -13.3545       1.90086

Is there a reason why pinv gives such a different answer for the exact same input matrix?

The text was updated successfully, but these errors were encountered:

sglyon · 2015-05-14T17:07:28Z

For what it is worth, I tested against scipy.linalg.pinv and scipy.linalg.pinv2 and it looks like the pinv2 function there gives the same answer as what we get on 0.3.8 whereas the pinv function gives a similar result to what we are getting on 0.4dev:

In [13]: import numpy as np

In [14]: import scipy.linalg as la

In [15]: S = np.array([[494.1013106366904, 46142.52662766732, 13.948592561126539, -0.5761953808074929],
[46142.52662766732, 4.592191854163587e6, 1176.2963747371257, -42.84536709656223],
[13.948592561126539, 1176.2963747371257, 0.4603766442896071, -0.02261134866942],
[-0.5761953808074929, -42.84536709656223, -0.02261134866942, 0.001308271770670055],
])

In [16]: np.set_printoptions(precision=4)

In [17]: la.pinv(S)
Out[17]:
array([[  5.2736e+00,  -2.6182e-02,  -1.3844e+02,  -9.2749e+02],
       [ -2.6182e-02,   1.3216e-04,   6.7367e-01,   4.4404e+00],
       [ -1.3844e+02,   6.7367e-01,   3.7339e+03,   2.5625e+04],
       [ -9.2749e+02,   4.4404e+00,   2.5625e+04,   1.8059e+05]])

In [18]: la.pinv2(S)
Out[18]:
array([[  5.0022e-01,  -3.3212e-03,  -6.6220e+00,   9.4274e-01],
       [ -3.3212e-03,   2.2677e-05,   4.2381e-02,  -6.0342e-03],
       [ -6.6220e+00,   4.2381e-02,   9.3822e+01,  -1.3354e+01],
       [  9.4274e-01,  -6.0342e-03,  -1.3354e+01,   1.9009e+00]])

the docs for these routines say that scipy.lingalg.pinv uses a least squares solver where scipy.linalg.pinv2 uses the SVD.

jiahao · 2015-05-14T17:13:16Z

The change is intentional. The default inverse condition number threshold was changed in 0.4; see Issue JuliaLang/julia#8859.

julia> pinv(S, 1e-9) #On 0.4, basically reproduces the 0.3.8 answer
4x4 Array{Float64,2}:
  0.500222    -0.00332121   -6.62205      0.94274   
 -0.00332121   2.26771e-5    0.0423813   -0.00603419
 -6.62205      0.0423813    93.8221     -13.3545    
  0.94274     -0.00603419  -13.3545       1.90086

I suspect that numpy uses different regularizations in pinv and pinv2 also.

Note that your matrix has a condition number ~1e12

julia> cond(S)
8.461070041485869e11

julia> svdvals(S)
4-element Array{Float64,1}:
  4.59266e6 
 30.6059    
  0.010396  
  5.42798e-6

and so the default in 0.3.8 produced essentially a rank 3 pseudoinverse. The default threshold in 0.4 is more conservative and produces a rank 4 pseudoinverse.

julia> svdvals(pinv(S, 1e-9))
4-element Array{Float64,1}:
 96.1906     
  0.0326735  
  2.17739e-7 
  4.10145e-16

julia> svdvals(pinv(S))
4-element Array{Float64,1}:
  1.8423e5  
 96.1906    
  0.0326735 
  2.17739e-7

sglyon · 2015-05-14T17:19:46Z

Thanks for the edited comment.

Not being a numerical linear algebra expert myself, what does this mean for routines that use pinv that now give very different answers based on the version of Julia that is used? I guess my question is, should I as a user be happy with this intentional change and why?

jiahao · 2015-05-14T17:29:31Z

There is no general answer to your question, unfortunately. It depends on what you want to use the pseudoinverse for.

andreasnoack · 2015-05-14T17:57:35Z

The pseudo-inverse is not continuous when the rank of the matrix changes. The reason your result changes so much is that the rank estimate for your matrix changes between 0.3 and 0.4. To summarize the discussion that @jiahao links to, there exist two kinds of numerical rank deficient matrices: the first kind has a very well determined rank, say k, where the magnitude of the sorted singular drops dramatically at k, the second kind is often related to so called ill-posed problems and where the magnitude of the sorted singular values drops regularly such that the numerical rank cannot be easily be determined, but at the same time spanning a range so wide that the matrix is ill conditioned. The tolerance of the pinv function has to balance these two cases.

sglyon · 2015-05-14T17:59:48Z

OK great, thank you both for the tips.

simonster added the linear algebra label May 14, 2015

sglyon mentioned this issue May 14, 2015

Tests fail on recent 0.4dev gragusa/MomentBasedEstimators.jl#18

Open

KristofferC transferred this issue from JuliaLang/julia Nov 26, 2024

This issue was closed.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bug in pinv? #214

Bug in pinv? #214

sglyon commented May 14, 2015

sglyon commented May 14, 2015

jiahao commented May 14, 2015

sglyon commented May 14, 2015

jiahao commented May 14, 2015

andreasnoack commented May 14, 2015

sglyon commented May 14, 2015

Bug in pinv? #214

Bug in pinv? #214

Comments

sglyon commented May 14, 2015

sglyon commented May 14, 2015

jiahao commented May 14, 2015

sglyon commented May 14, 2015

jiahao commented May 14, 2015

andreasnoack commented May 14, 2015

sglyon commented May 14, 2015