Commit

documentation
msainsburydale committed Feb 17, 2025
1 parent b0f97fc commit b77f329
Showing 6 changed files with 308 additions and 287 deletions.
2 changes: 1 addition & 1 deletion docs/src/API/architectures.md
@@ -1,6 +1,6 @@
# Architectures

-In principle, any [`Flux`](https://fluxml.ai/Flux.jl/stable/) model can be used to construct the neural network. To integrate it into the workflow, one need only define a method that transforms $K$-dimensional vectors of data sets into matrices with $K$ columns, where the number of rows corresponds to the dimensionality of the output spaces listed in the [Overview](@ref).
+In principle, any [`Flux`](https://fluxml.ai/Flux.jl/stable/) model can be used to construct the neural network (see the [Gridded data](@ref) example). To integrate it into the workflow, one need only define a method that transforms $K$-dimensional vectors of data sets into matrices with $K$ columns, where the number of rows corresponds to the dimensionality of the output spaces listed in the [Overview](@ref).

## Modules

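As a concrete illustration of the requirement described in the changed paragraph above, here is a minimal, hypothetical sketch for unstructured (vector-valued) data; the wrapper type, dimensions, and layer widths are illustrative and not taken from the package, with only `stackarrays` and the vector-to-matrix convention drawn from the documentation:

```julia
using NeuralEstimators, Flux

# Hypothetical wrapper around an arbitrary Flux model
struct MyNetwork{C <: Chain}
    chain::C
end

# Required method: map a K-dimensional vector of data sets to a matrix with K columns,
# whose number of rows equals the dimensionality of the relevant output space
function (net::MyNetwork)(Z::AbstractVector)
    net.chain(stackarrays(Z))   # concatenate the K data sets along their last dimension, then apply the model
end

d, p = 10, 2                                              # illustrative data and parameter dimensions
net = MyNetwork(Chain(Dense(d, 64, relu), Dense(64, p)))  # illustrative MLP
Z = [rand32(d, 1) for _ in 1:100]                         # K = 100 data sets, each a d × 1 matrix
net(Z)                                                    # p × 100 matrix, as required
```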
36 changes: 27 additions & 9 deletions docs/src/workflow/examples.md
@@ -198,7 +198,7 @@ function simulate(parameters::Parameters, m = 1)
end
```

-A possible architecture is as follows. Note that deeper architectures that employ residual connections (see [ResidualBlock](@ref)) often lead to improved performance, and certain pooling layers (e.g., [GlobalMeanPool](https://fluxml.ai/Flux.jl/stable/reference/models/layers/#Flux.GlobalMeanPool)) allow the neural network to accommodate grids of varying dimension; for further discussion and an illustration, see [Sainsbury-Dale et al. (2025, Sec. S3, S4)](https://doi.org/10.48550/arXiv.2501.04330).
+A possible neural-network architecture is as follows. Note that deeper architectures that employ residual connections (see [ResidualBlock](@ref)) often lead to improved performance, and certain pooling layers (e.g., [GlobalMeanPool](https://fluxml.ai/Flux.jl/stable/reference/models/layers/#Flux.GlobalMeanPool)) allow the neural network to accommodate grids of varying dimension; for further discussion and an illustration, see [Sainsbury-Dale et al. (2025, Sec. S3, S4)](https://doi.org/10.48550/arXiv.2501.04330).

```julia
# Inner network
@@ -217,6 +217,25 @@ A possible architecture is as follows. Note that deeper architectures that employ…
network = DeepSet(ψ, ϕ)
```

+Above, we embedded our CNN within the DeepSets framework to accommodate scenarios involving replicated spatial data (e.g., when fitting models for spatial extremes). However, as noted in Step 4 of the [Overview](@ref), the package allows users to define the neural network using any Flux model. Since this example does not include independent replicates, the following CNN model is equivalent to the DeepSets architecture used above:
+
+```julia
+struct CNN{T <: Chain}
+    chain::T
+end
+function (cnn::CNN)(Z)
+    cnn.chain(stackarrays(Z))
+end
+network = CNN(Chain(
+    Conv((3, 3), 1 => 32, relu),
+    MaxPool((2, 2)),
+    Conv((3, 3), 32 => 64, relu),
+    MaxPool((2, 2)),
+    Flux.flatten,
+    Dense(256, 64, relu),
+    Dense(64, 1)
+))
+```
+
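As a quick, hedged check of the interface just described (assuming `NeuralEstimators` and `Flux` are loaded, that `network` is the CNN defined above, and that the grids are 16×16, which is consistent with `Dense(256, 64)` after two convolution and pooling stages), the wrapped model maps a vector of data sets to a matrix with one column per data set:

```julia
K = 5                                    # illustrative number of data sets
Z = [rand32(16, 16, 1, 1) for _ in 1:K]  # each data set: a 16×16 grid, 1 channel, 1 replicate
network(Z)                               # 1×K matrix: one output column per data set
```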
Next, we initialise a point estimator and a posterior credible-interval estimator:

```julia
@@ -227,12 +246,11 @@ q̂ = IntervalEstimator(network)
Now we train the estimators, here using fixed parameter instances to avoid repeated Cholesky factorisations (see [Storing expensive intermediate objects for data simulation](@ref) and [On-the-fly and just-in-time simulation](@ref) for further discussion):

```julia
-K = 10000 # number of training parameter vectors
-m = 1 # number of independent replicates in each data set
+K = 10000 # number of training parameter vectors
θ_train = sample(K)
-θ_val = sample(K ÷ 10)
-θ̂ = train(θ̂, θ_train, θ_val, simulate, m = m)
-q̂ = train(q̂, θ_train, θ_val, simulate, m = m)
+θ_val = sample(K ÷ 10)
+θ̂ = train(θ̂, θ_train, θ_val, simulate)
+q̂ = train(q̂, θ_train, θ_val, simulate)
```

Once the estimators have been trained, we assess them using empirical simulation-based methods:
@@ -253,10 +271,10 @@ plot(assessment)
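The assessment code itself is collapsed in this diff (the hunk context above shows only `plot(assessment)`). As a rough, hedged guide, this step typically looks something like the following; the test-set size is illustrative and the call signatures are assumed from the package's assessment interface rather than copied from the collapsed lines:

```julia
θ_test = sample(1000)                    # test parameter vectors
Z_test = simulate(θ_test)                # corresponding simulated data sets
assessment = assess(θ̂, θ_test, Z_test)   # compare point estimates with the true parameters
bias(assessment)                         # empirical bias
rmse(assessment)                         # empirical root-mean-squared error
plot(assessment)                         # estimates versus true values
```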
Finally, we can apply our estimators to observed data:

```julia
-θ = sample(1) # true parameter
+θ = Parameters(Matrix([0.1]')) # true parameter
Z = simulate(θ) # "observed" data
-estimate(θ̂, Z) # point estimate
-interval(q̂, Z) # 95% marginal posterior credible intervals
+estimate(θ̂, Z) # point estimate: 0.11
+interval(q̂, Z) # 95% marginal posterior credible interval: [0.08, 0.16]
```

Note that missing data (e.g., due to cloud cover) can be accommodated using the [missing-data methods](@ref "Missing data") implemented in the package.
2 changes: 1 addition & 1 deletion docs/src/workflow/overview.md
@@ -11,7 +11,7 @@ Neural inferential methods have marked practical appeal, as their implementation…
* For neural posterior estimators, the neural network is a mapping $\mathcal{Z}\to\mathcal{K}$, where $\mathcal{K}$ denotes the space of the approximate-distribution parameters $\boldsymbol{\kappa}$.
* For neural ratio estimators, the neural network is a mapping $\mathcal{Z}\times\Theta\to\mathbb{R}$.

-Any [Flux](https://fluxml.ai/Flux.jl/stable/) model can be used to construct the neural network. To integrate it into the workflow, one need only define a method that transforms $K$-dimensional vectors of data (see Step 2 above) into matrices with $K$ columns, where the number of rows corresponds to the dimensionality of the output spaces listed above. The type [`DeepSet`](@ref) serves as a convenient wrapper for embedding standard neural networks (e.g., MLPs, CNNs, GNNs) in a framework for making inference with an arbitrary number of independent replicates, and it comes with pre-defined methods for handling the transformations from a $K$-dimensional vector of data to a matrix output.
+Any [Flux](https://fluxml.ai/Flux.jl/stable/) model can be used to construct the neural network. To integrate it into the workflow, one need only define a method that transforms $K$-dimensional vectors of data (see Step 2 above) into matrices with $K$ columns, where the number of rows corresponds to the dimensionality of the output spaces listed above (see the [Gridded data](@ref) example). The type [`DeepSet`](@ref) serves as a convenient wrapper for embedding standard neural networks (e.g., MLPs, CNNs, GNNs) in a framework for making inference with an arbitrary number of independent replicates, and it comes with pre-defined methods for handling the transformations from a $K$-dimensional vector of data to a matrix output.
1. Wrap the neural network (and possibly the approximate distribution) in a [subtype of `NeuralEstimator`](@ref "Estimators") corresponding to the intended inferential method:
* For neural Bayes estimators under general, user-defined loss functions, use [`PointEstimator`](@ref);
* For neural posterior estimators, use [`PosteriorEstimator`](@ref);
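To make the `DeepSet` wrapper described in this file's changed paragraph concrete, here is a hedged sketch with illustrative dimensions; the inner network ψ, outer network ϕ, and data below are hypothetical, and only the `DeepSet(ψ, ϕ)` constructor (with its default elementwise-mean aggregation) is taken from the package:

```julia
using NeuralEstimators, Flux

d, p, w = 5, 2, 32                               # data dimension, parameter dimension, layer width
ψ = Chain(Dense(d, w, relu), Dense(w, w, relu))  # inner network, applied to each replicate
ϕ = Chain(Dense(w, w, relu), Dense(w, p))        # outer network, applied to the aggregated summary
ds = DeepSet(ψ, ϕ)                               # replicates aggregated elementwise (mean by default)

K = 100
Z = [rand32(d, rand(10:30)) for _ in 1:K]        # K data sets with varying numbers of replicates
ds(Z)                                            # p × K matrix, regardless of the replicate counts
```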
4 changes: 2 additions & 2 deletions src/Architectures.jl
@@ -124,10 +124,10 @@ X = [rand32(dₓ) for _ ∈ eachindex(Z)]
ds((Z, X))
```
"""
-struct DeepSet{T, G, K}
+struct DeepSet{T, G, K, A}
ψ::T
ϕ::G
-a::ElementwiseAggregator
+a::A
S::K
end
function DeepSet(ψ, ϕ, a::Function = mean; S = nothing)
1 change: 0 additions & 1 deletion src/NeuralEstimators.jl
@@ -87,7 +87,6 @@ end
# - Functionality: assess(estimator::PosteriorEstimator) and assess(estimator::RatioEstimator) and corresponding diagnostics.
# - Functionality: Incorporate the following package (possibly as an extension) to expand bootstrap functionality; https://github.com/juliangehring/Bootstrap.jl. Note also the "straps()" method that allows one to obtain the bootstrap distribution. I think what I can do is define a method of interval(bs::BootstrapSample). Maybe one difficulty will be how to re-sample... Not sure how the bootstrap method will know to sample from the independent replicates dimension (the last dimension) of each array.
# - Functionality: Training, option to check validation risk (and save the optimal estimator) more frequently than the end of each epoch, which would avoid wasted computation when we have very large training sets.
# - Functionality: Helper functions for censored data.
# - Functionality: Explicit learning of summary statistics.
# - Polishing: Might be better to use Plots rather than {AlgebraOfGraphics, CairoMakie}.
# - Add NeuralEstimators.jl to the list of packages that use Documenter: see https://documenter.juliadocs.org/stable/man/examples/