
Refactor sinkhorn and sinkhorn2 #100

Merged
merged 13 commits into from Jun 27, 2021
Conversation

devmotion
Member

@devmotion devmotion commented Jun 8, 2021

I apologize in advance for the number of changes in this PR, but I didn't manage to keep it smaller without leaving the package in a half-broken state.

Basically, this PR unifies sinkhorn, sinkhorn_stabilized, and sinkhorn_stabilized_epsscaling under a single sinkhorn(mu, nu, C, eps, algorithm; kwargs...) syntax (the name sinkhorn seems a bit redundant, see comments below). More details:

  • sinkhorn(mu, nu, C, eps; kwargs...) = sinkhorn(mu, nu, C, eps, SinkhornGibbs(); kwargs...), i.e., without an explicit algorithm the standard Sinkhorn algorithm is still used, as before
  • sinkhorn2 allows specifying an algorithm as well (default: SinkhornGibbs()) and hence supports all Sinkhorn variants (and also supports the additional regularization term for all algorithms)
  • SinkhornEpsilonScaling works both with the regular Sinkhorn algorithm and the stabilized version
  • SinkhornStabilized supports batches of histograms as well (requires NNlib.batched_mul! for performance and GPU support)
  • all methods work on the GPU (with and without epsilon scaling)
  • existing methods and keyword arguments are deprecated, i.e., changes should be non-breaking

The name sinkhorn(...) seems a bit redundant since it is already implied by the names of the algorithms (apart from the default choice). However, I did not want to make too many changes in this PR. One approach would be to define a type for entropically regularized OT problems (this could also be used in the SinkhornSolver struct to group together the source and target marginals, the cost matrix, and the regularization parameter) and then to use the solve approach (which is already used internally, without depending on CommonSolve). One could then, e.g., still use SinkhornGibbs() as the default algorithm for such problems but also write solve(prob, SinkhornStabilized(); kwargs...) etc. One could even keep sinkhorn(mu, nu, C, eps, alg; kwargs...) = solve(EntropicOTProblem(mu, nu, C, eps), alg; kwargs...).
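For reference, a minimal usage sketch of the unified interface described above (the histograms, cost matrix, and regularization value are made up; the function and algorithm names follow the PR description):

```julia
using OptimalTransport

# Made-up histograms and cost matrix:
μ = fill(1/4, 4)                          # source histogram
ν = fill(1/5, 5)                          # target histogram
C = [abs2(i - j) for i in 1:4, j in 1:5]  # cost matrix on made-up supports
ε = 0.05                                  # entropic regularization parameter

# Default algorithm (SinkhornGibbs), exactly as before this PR:
γ = sinkhorn(μ, ν, C, ε)

# Explicit algorithm choice via the new trailing positional argument:
γ_stab = sinkhorn(μ, ν, C, ε, SinkhornStabilized())
γ_eps  = sinkhorn(μ, ν, C, ε, SinkhornEpsilonScaling(SinkhornStabilized()))

# sinkhorn2 accepts the same algorithms and returns the regularized cost:
c = sinkhorn2(μ, ν, C, ε, SinkhornGibbs())
```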

src/utils.jl (outdated, resolved)
test/entropic/sinkhorn_stabilized.jl (outdated, resolved)
test/entropic/sinkhorn_stabilized.jl (outdated, resolved)
devmotion and others added 5 commits June 8, 2021 15:19
@coveralls

coveralls commented Jun 8, 2021

Pull Request Test Coverage Report for Build 928044999

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 300 of 304 (98.68%) changed or added relevant lines in 7 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+1.6%) to 96.257%

Changes missing coverage:

  File                                  Covered Lines  Changed/Added Lines  %
  src/entropic/sinkhorn_barycenter.jl   21             23                   91.3%
  src/entropic/sinkhorn_epsscaling.jl   40             42                   95.24%

Totals:
  Change from base Build 928044156: +1.6%
  Covered Lines: 540
  Relevant Lines: 561

💛 - Coveralls

@davibarreira
Member

davibarreira commented Jun 8, 2021

Great PR! I still have to review the code you submitted, but in terms of design I have a slightly different proposal related to the "problem" and "solver" paradigm. I'd say that the "problem" would always be the original optimal transport problem (i.e., mu to nu with cost C and no regularization), and the "solver" would incorporate the regularization or whatever else it uses. Hence, instead of solve(EntropicOTProblem(mu, nu, C, eps), alg; kwargs...), it would be solve(OTProblem(mu, nu, C), alg; kwargs...). This proposal is grounded more in theoretical consistency than in programming convenience: the idea of using sinkhorn is (in theory) to approximately solve the original OT problem. Hence, although it solves a different problem, I'm arguing that the regularization is still better understood as part of the solver.

@devmotion
Member Author

Interesting, I'll think about your comment. To me the entropically regularized OT problem really is a separate problem and not just something induced by the approximation algorithm. Of course, it is derived from the exact OT problem, but it always seemed to me a separate entity. This is similar to other fields where regularized problems are considered separately, e.g., least squares with Tikhonov regularization, or the Lasso and elastic net, which have specific mathematical properties.

Intuitively, I assume that making the regularization part of the algorithm could lead to problems or inconsistencies when composing different algorithms, such as epsilon scaling with different Sinkhorn algorithms: the regularization would then only be a parameter of the subalgorithm but not of the main algorithm.

@davibarreira
Member

davibarreira commented Jun 8, 2021

I think the approach you described is actually the more standard way. But the reason I tend to think like this in OT is that the regularization is mostly added not to penalize errors somehow, but just to make the original problem solvable. The use of sinkhorn instead of the unregularized problem is usually just for computational reasons (at least in ML, that's the only reason people choose it). If we look at linear regression, regularization is added not to "facilitate" the solution of the original regression but to emphasize a different aspect. So I see a fundamental distinction in the motivation.

But this is just a way of thinking about things. So if it makes the code problematic, I'd say we should stick with your design; I don't see that much of a gain.

@zsteve
Member

zsteve commented Jun 8, 2021

I am also yet to go through the code, but thinking about the above comments, I'd also lean towards what is implemented currently.
The reason I am not too keen on viewing the regularisation as an 'add-on' to unregularised OT is that the scaling algorithms are highly specific to the regulariser, i.e., the choice of regulariser dictates the available algorithms, and there is very little scope for mixing and matching regularisers and algorithms.

I think in the future it might be useful to have a more general solver that can deal with an arbitrary regulariser, using gradient methods such as L-BFGS, similar to the ot.smooth module in POT.

@devmotion
Member Author

I'd also lean towards what is implemented currently.

Actually, none of the problem types or drastic interface changes are implemented in this PR; I just thought it would be worth considering since sinkhorn(..., SinkhornStabilized()) etc. seemed a bit redundant, and already in the internal solver it seemed natural to group together the other arguments (source and target marginals, cost matrix, and regularization parameter).

If there are problem types for entropically regularized OT, quadratically regularized OT, etc. (a single type that allows dispatching on the type of regularization would also be sufficient), then it would be immediately clear that solve(entropic_prob) should use a default algorithm for entropically regularized OT problems whereas solve(quadratic_prob) should use one for quadratically regularized OT problems. In contrast, it is not possible to deduce the type of regularization from the arguments mu, nu, C, and eps alone - either one needs specific function names for each problem type (similar to the current approach, which however also mixes in the different algorithms) or an additional argument that specifies the type of regularization (but in that case it arguably seems simpler and more direct to indicate this with the problem type).

I think in the future it might be useful to have a more general solver that can deal with arbitrary regulariser, using some gradient methods like L-BFGS, similar to the ot.smooth module in POT.

The nice thing is that one could define, for each algorithm, which problem types it can be applied to, and show an error message if the user applies an incorrect algorithm to the provided problem. Or one could disallow it entirely by using a type structure on the algorithms and problems (which seems a bit less flexible, though). Similarly, for some algorithms one could declare that they may be applied to OT problems with an arbitrary regulariser.
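A self-contained, hypothetical sketch of the problem-type dispatch idea discussed here (not part of this PR; all type names, the QuadraticScaling algorithm, and the stub solve methods are made up for illustration):

```julia
# Hypothetical sketch: problem types drive dispatch, so an unsupported
# problem/algorithm combination raises a MethodError instead of failing silently.
abstract type OTProblem end

struct EntropicOTProblem{U,V,M,T} <: OTProblem
    μ::U   # source marginal
    ν::V   # target marginal
    C::M   # cost matrix
    ε::T   # regularization parameter
end

struct QuadraticOTProblem{U,V,M,T} <: OTProblem
    μ::U
    ν::V
    C::M
    ε::T
end

struct SinkhornGibbs end      # entropic algorithm
struct QuadraticScaling end   # hypothetical algorithm for quadratic regularization

# Each algorithm is only defined for the problems it can solve (stubs here):
solve(prob::EntropicOTProblem, ::SinkhornGibbs; kwargs...) = prob
solve(prob::QuadraticOTProblem, ::QuadraticScaling; kwargs...) = prob

# Per-problem default algorithm:
solve(prob::EntropicOTProblem; kwargs...) = solve(prob, SinkhornGibbs(); kwargs...)

# Backwards-compatible entry point:
sinkhorn(μ, ν, C, ε, alg=SinkhornGibbs(); kwargs...) =
    solve(EntropicOTProblem(μ, ν, C, ε), alg; kwargs...)
```

With this structure, solve(prob, QuadraticScaling()) on an EntropicOTProblem throws a MethodError, which is exactly the "error message for an incorrect algorithm" behaviour described above.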

@devmotion
Member Author

What's the status here? Did you have a look at the changes?

function build_solver(
alg::Sinkhorn,
μ::AbstractVecOrMat,
ν::AbstractVecOrMat,
Member

mu and nu here are actually not measures, but just the support, right? Maybe in a future PR we should call this musupp and nusupp.

Member Author

It's not the support, it's the histograms (so also not fully specified measures...). The support is implicitly part of the cost matrix.

Member

You are right! You see, that's why we should specify this. I always get confused, haha.
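A tiny self-contained illustration of this distinction (with made-up numbers): the vectors passed to the solver are the histogram weights, while the support points only enter through the cost matrix.

```julia
# Support points of two discrete measures on the real line
# (only ever used to build the cost matrix):
xsupp = [0.0, 1.0]
ysupp = [0.0, 0.5, 1.0]

# The "histograms" μ and ν are just the weight vectors; they do not
# carry the support points themselves:
μ = [0.5, 0.5]
ν = [0.2, 0.3, 0.5]

# The support appears only implicitly, via the cost matrix:
C = [abs2(x - y) for x in xsupp, y in ysupp]
```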

end

return u, v
return nothing
Member

Is return nothing equivalent to not having a return? Any reason for using one instead of the other?

Member Author

Yep, it's equivalent. BlueStyle wants it 🙂

```
where ``\\lambda`` is the scaling factor and ``k`` the number of scaling steps.
"""
function SinkhornEpsilonScaling(alg::Sinkhorn; factor::Real=1//2, steps::Int=5)
Member

any advantage in using fractions here?

Member Author

It does not promote, which is relevant e.g. if you want to keep Float32 (otherwise the values are promoted to Float64, which leads to terrible performance on GPUs).
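A quick self-contained check of the promotion behaviour mentioned here (the values are illustrative):

```julia
# Rational defaults avoid promoting the user's floating-point type:
ε = 1.0f0              # a Float32 regularization parameter

typeof(ε * (1//2))     # Float32: multiplying by a Rational keeps Float32
typeof(ε * 0.5)        # Float64: a Float64 literal default would promote
```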

```julia
isapprox(sum(G; dims=2), μ; atol=atol, rtol=rtol, norm=x -> norm(x, 1))
sinkhorn(
SinkhornStabilized(; absorb_tol=absorb_tol), μ, ν, C, ε; kwargs...
Member

Is the algorithm the first argument? Isn't it the last? I mean sinkhorn(μ, ν, C, ε, SinkhornStabilized(; absorb_tol=absorb_tol)).

Member Author

Ah yeah, initially I had it as the first argument but then switched it around. Seems I forgot to update the documentation, good catch!

Member Author

Fixed.

K = @. exp(-C / ε)
function sinkhorn(μ, ν, C, ε, alg::Sinkhorn; kwargs...)
# build solver
solver = build_solver(alg, μ, ν, C, ε; kwargs...)
Member

Wouldn't it be better to have the same order of arguments for sinkhorn and build_solver?

Member Author

Probably; again, I assume it's because I initially used a different order in sinkhorn. I'll update it.

Member Author

Fixed.

for alg in (SinkhornGibbs(), SinkhornStabilized())
# compute optimal transport plan and cost
γ = sinkhorn(μ, ν, C, ε, SinkhornEpsilonScaling(alg); maxiter=5_000)
c = sinkhorn2(μ, ν, C, ε, SinkhornEpsilonScaling(alg); maxiter=5_000)
Member

I think we should use cost and leave c to be the OT cost function, while C is the cost matrix.

Member

After reading the rest of the code for the tests, I see that this would require a huge amount of refactoring in the tests, so I think it's better to just leave it be.

Member Author

I think more consistent names are a good idea but I assume it might be better to do this in a separate PR since it might also affect other unrelated parts.

@davibarreira
Member

I don't know much about Sinkhorn other than the regular implementation. So I tried reading the whole code, but I was not very thorough, especially in the unbalanced case or the other modifications, such as epsilon scaling.

@devmotion devmotion closed this Jun 24, 2021
@devmotion devmotion reopened this Jun 24, 2021
@devmotion devmotion requested review from davibarreira and zsteve June 24, 2021 15:07
@devmotion
Member Author

GPU tests pass now (new CUDA + NNlibCUDA versions fix some bugs in CUDA that caused timeouts when instantiating a new project environment, as is done in our tests).

@devmotion devmotion merged commit f03b05c into master Jun 27, 2021
@devmotion devmotion deleted the dw/sinkhorn_refactor branch June 27, 2021 13:38