Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Example: variational problems using autodiff #122

Merged
merged 7 commits into from
Aug 27, 2021
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion docs/make.jl
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@ if haskey(ENV, "GITHUB_ACTIONS")
ENV["JULIA_DEBUG"] = "Documenter"
end

const EXAMPLES = ("basic",)
const EXAMPLES = ("basic", "variational")
const INPUT = joinpath(@__DIR__, "..", "examples")
const OUTPUT = joinpath(@__DIR__, "src", "examples")

Expand Down
6 changes: 6 additions & 0 deletions examples/variational/Manifest.toml
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
# This file is machine-generated - editing it directly is not advised

[[Suppressor]]
git-tree-sha1 = "a819d77f31f83e5792a76081eee1ea6342ab8787"
uuid = "fd094767-a336-5f1f-9728-57cf17d0bbfb"
version = "0.2.0"
2 changes: 2 additions & 0 deletions examples/variational/Project.toml
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
[deps]
Suppressor = "fd094767-a336-5f1f-9728-57cf17d0bbfb"
zsteve marked this conversation as resolved.
Show resolved Hide resolved
154 changes: 154 additions & 0 deletions examples/variational/script.jl
Original file line number Diff line number Diff line change
@@ -0,0 +1,154 @@
# # Variational problems
#
# In this example, we will numerically simulate an entropy-regularised Wasserstein gradient flow
# approximating the Fokker-Planck and porous medium equations.
#
# The connection between Wasserstein gradient flows and (non)-linear PDEs is due to Jordan, Kinderlehrer and Otto [^JKO98].
#
# [^JKO98]: Jordan, Richard, David Kinderlehrer, and Felix Otto. "The variational formulation of the Fokker--Planck equation." SIAM journal on mathematical analysis 29.1 (1998): 1-17.
#
# ## Fokker-Planck equation as a $W_2$ gradient flow
# For a potential function $\Psi$ and noise level $\sigma^2$, the Fokker-Planck equation (FPE) is
# ```math
# \partial_t \rho_t = \nabla \cdot (\rho_t \nabla \Psi) + \frac{\sigma^2}{2} \Delta \rho_t,
# ```
# and we take no-flux (Neumann) boundary conditions.
#
# This describes the evolution of a massless particle undergoing both diffusion (with diffusivity $\sigma^2$) and drift (along potential $\Psi$) according to the stochastic differential equation
# ```math
# dX_t = -\nabla \Psi(X_t) dt + \sigma dB_t.
# ```
# The result of Jordan, Kinderlehrer and Otto (commonly referred to as the JKO theorem) states that
# $\rho$ evolves following the 2-Wasserstein gradient of the Gibbs free energy functional
# ```math
# F(\rho) = \int \Psi d\rho + \int \log(\rho) d\rho.
# ```
#
# ## Proximal schemes for gradient flows
# In an Euclidean space, the gradient flow of a functional $F$ is simply an ordinary differential equation
# ```math
# \dfrac{dx(t)}{dt} = -\nabla F(x(t)).
# ```
# Of course, there is an implicit requirement that $F$ is smooth. A more general formulation of a gradient flow that allows
# $F$ to be non-smooth is
# ```math
# x_{t+\tau} = \operatorname{argmin}_x \frac{1}{2} \| x - x_t \|_2^2 + \tau F(x).
# ```
# As the timestep $\tau$ shrinks, $x_t$ becomes a better and better approximation to the gradient flow of $F$.
#
# ## Wasserstein gradient flow
# In the context of the JKO theorem, we seek $\rho_t$ that is the gradient flow of $F$ with
# respect to the 2-Wasserstein distance. This can be achieved by choosing the $W_2$ metric in the proximal step:
# ```math
# \rho_{t + \tau} = \operatorname{argmin}_{\rho} d_{W_2}^2(\rho_{t}, \rho) + \tau F(\rho).
# ```
# Finally, a numerical scheme for computing this gradient flow can be developed by using the entropic regularisation
# of optimal transport
# ```math
# \rho_{t + \tau} = \operatorname{argmin}_{\rho} \operatorname{OT}_\varepsilon(\rho_{t}, \rho) + \tau F(\rho),
# ```
# where
# ```math
# \operatorname{OT}_\varepsilon(\alpha, \beta) = \min_{\gamma \in \Pi(\alpha, \beta)} \sum_{i,j} \frac{1}{2} \| x_i - x_j \|_2^2 \gamma(x_i, x_j) + \varepsilon \sum_{i, j} \gamma_{ij} \log(\gamma_{ij}).
# ```
# Each step of this problem is a minimisation problem with respect to $\rho$.
# Since we use entropic optimal transport which is differentiable, this can be solved using gradient-based methods.

# ## Problem setup
#
using OptimalTransport
using StatsBase, Distances
zsteve marked this conversation as resolved.
Show resolved Hide resolved
using ReverseDiff, Optim, LinearAlgebra
using Plots
using BenchmarkTools
zsteve marked this conversation as resolved.
Show resolved Hide resolved
using LogExpFunctions
using LaTeXStrings
using Suppressor

# Here, we set up the computational domain that we work on - we discretize the interval $[-1, 1]$.
# The natural boundary conditions to use will be Neumann (zero flux), see e.g. [^Santam2017]
#
# [^Santam2017]: Santambrogio, Filippo. "{Euclidean, metric, and Wasserstein} gradient flows: an overview." Bulletin of Mathematical Sciences 7.1 (2017): 87-154.

support = LinRange(-1, 1, 64)
zsteve marked this conversation as resolved.
Show resolved Hide resolved
C = pairwise(SqEuclidean(), support');
zsteve marked this conversation as resolved.
Show resolved Hide resolved

# Now we set up various functionals that we will use.
#
# Generalised entropy (Equation (4.4) of [^Peyre2015]): for $m = 1$ this is just the "regular" entropy, and $m = 2$ this is squared $L_2$.
#
# [^Peyre2015]: Peyré, Gabriel. "Entropic approximation of Wasserstein gradient flows." SIAM Journal on Imaging Sciences 8.4 (2015): 2323-2351.
function E(ρ; m = 1)
zsteve marked this conversation as resolved.
Show resolved Hide resolved
if m == 1
return sum(xlogx.(ρ)) - sum(ρ)
elseif m > 1
return dot(ρ, @. (ρ^(m-1) - m)/(m-1))
zsteve marked this conversation as resolved.
Show resolved Hide resolved
end
end;
zsteve marked this conversation as resolved.
Show resolved Hide resolved

# Now define $V(x)$ to be a potential energy function that has two potential wells at $x = ± 0.5$.
V(x) = 10*(x-0.5)^2*(x+0.5)^2;
plot(support, V.(support); color = "black", label = "Scalar potential")
zsteve marked this conversation as resolved.
Show resolved Hide resolved

# Having defined $V$, this induces a potential energy functional $\Psi$ on probability distributions $\rho$:
# ```math
# \Psi(\rho) = \int V(x) \rho(x) dx = \langle \psi, \rho \rangle .
# ```
Ψ = V.(support);

# Define the time step $\tau$ and entropic regularisation level $\varepsilon$, and form the associated Gibbs kernel $K = e^{-C/\varepsilon}$.
τ = 0.05
ε = 0.01
K = @. exp(-C/ε)
zsteve marked this conversation as resolved.
Show resolved Hide resolved

# We define the (non-smooth) initial condition $\rho_0$ in terms of step functions.
H(x) = x > 0
ρ0 = @. H(support + 0.25) - H(support - 0.25)
ρ0 = ρ0/sum(ρ0)
plot(support, ρ0; label = "Initial condition ρ0", color = "blue")
zsteve marked this conversation as resolved.
Show resolved Hide resolved

# `G_fpe` is the objective function for the proximal step
# ```math
# G_\mathrm{fp}(\rho) = \operatorname{OT}_\varepsilon(\rho_{t}, \rho) + \tau F(\rho),
# ```
# and we seek to minimise in $\rho$.
G_fpe(ρ, ρ0, τ, ε, C) = sinkhorn2(ρ, ρ0, C, ε; regularization = true, maxiter = 250) + τ*( dot(Ψ, ρ) + E(ρ) );
zsteve marked this conversation as resolved.
Show resolved Hide resolved

# `step` solves the proximal problem to produce $\rho_{t + \tau}$ from $\rho_t$.
function step(ρ0, τ, ε, C, G)
opt = optimize(u -> G(softmax(u), ρ0, τ, ε, C), ones(size(ρ0)), LBFGS(), Optim.Options(iterations = 50, g_tol = 1e-6); autodiff = :forward)
zsteve marked this conversation as resolved.
Show resolved Hide resolved
return softmax(Optim.minimizer(opt))
end
# Now we simulate `N = 10` iterates of the gradient flow and plot the result.
N = 10
zsteve marked this conversation as resolved.
Show resolved Hide resolved
ρ = similar(ρ0, size(ρ0, 1), N)
ρ[:, 1] = ρ0
@suppress begin
for i = 2:N
ρ[:, i] = step(ρ[:, i-1], τ, ε, C, G_fpe)
zsteve marked this conversation as resolved.
Show resolved Hide resolved
end
end
colors = range(colorant"red", stop = colorant"blue", length = N)
plot(support, ρ, title = L"F(\rho) = \langle \psi, \rho \rangle + \langle \rho, \log(\rho) \rangle"; palette = colors, legend = nothing)
zsteve marked this conversation as resolved.
Show resolved Hide resolved

# ## Porous medium equation
#
# The porous medium equation (PME) is the nonlinear PDE
# ```math
# \partial_t \rho = \nabla \cdot (\rho \nabla \Psi) + \Delta \rho^m,
# ```
# again with Neumann boundary conditions. The value of $m$ in the PME corresponds to picking $m$ in the generalised entropy functional.
# Now, we will solve the PME with $m = 2$ as a Wasserstein gradient flow.
#
G_pme(ρ, ρ0, τ, ε, C) = sinkhorn2(ρ, ρ0, C, ε; regularization = true, maxiter = 250) + τ*(dot(Ψ, ρ) + E(ρ; m = 2));
zsteve marked this conversation as resolved.
Show resolved Hide resolved

# set up as previously
N = 10
zsteve marked this conversation as resolved.
Show resolved Hide resolved
ρ = similar(ρ0, size(ρ0, 1), N)
ρ[:, 1] = ρ0
@suppress begin
for i = 2:N
ρ[:, i] = step(ρ[:, i-1], τ, ε, C, G_pme)
zsteve marked this conversation as resolved.
Show resolved Hide resolved
end
end
plot(support, ρ, title = L"F(\rho) = \langle \psi, \rho \rangle + \langle \rho, \rho - 1\rangle"; palette = colors, legend = nothing)
zsteve marked this conversation as resolved.
Show resolved Hide resolved