diff --git a/_literate/4_Turing.jl b/_literate/4_Turing.jl
index 3e88a87b..03cdd370 100644
--- a/_literate/4_Turing.jl
+++ b/_literate/4_Turing.jl
@@ -1,28 +1,28 @@
 # # How to use Turing
-# [**Turing**](http://turing.ml/) is a ecosystem of Julia packages for Bayesian Inference using
+# [**Turing**](http://turing.ml/) is an ecosystem of Julia packages for Bayesian Inference using
 # [probabilistic programming](https://en.wikipedia.org/wiki/Probabilistic_programming).
-# Turing provide an easy an intuitive way of specifying models.
+# Turing provides an easy and intuitive way of specifying models.
 # ## Probabilistic Programming
-# What is a **probabilistic programming** (PP)? It is a **programming paradigm** in which probabilistic models
+# What is **probabilistic programming** (PP)? It is a **programming paradigm** in which probabilistic models
 # are specified and inference for these models is performed **automatically** (Hardesty, 2015). In more clear terms,
 # PP and PP Languages (PPLs) allows us to specify **variables as random variables** (like Normal, Binominal etc.) with
 # **known or unknown parameters**. Then, we **construct a model** using these variables by specifying how the variables
 # related to each other, and finally **automatic inference of the variables' unknown parameters** is then performed.
-# In a Bayesian approach this means specifying **priors**, **likelihoods** and let the PPL compute the **posterior**.
+# In a Bayesian approach this means specifying **priors**, **likelihoods** and letting the PPL compute the **posterior**.
 # Since the denominator in the posterior is often intractable, we use Markov Chain Monte Carlo[^MCMC] and some fancy
 # algorithm that uses the posterior geometry to guide the MCMC proposal using Hamiltonian dynamics called
 # Hamiltonian Monte Carlo (HMC) to approximate the posterior. This involves, besides a suitable PPL, automatic differentiation,
 # MCMC chains interface, and also an efficient HMC algorithm implementation. In order to provide all of these features,
-# Turing has a whole ecosystem to address each and everyone of these components.
+# Turing has a whole ecosystem to address each and every one of these components.
 # ## Turing's Ecosystem
-# Before we dive into how to specify models in Turing. Let's discuss Turing's **ecosystem**.
-# We have several Julia packages under the Turing's GitHub organization [TuringLang](https://github.com/TuringLang),
+# Before we dive into how to specify models in Turing, let's discuss Turing's **ecosystem**.
+# We have several Julia packages under Turing's GitHub organization [TuringLang](https://github.com/TuringLang),
 # but I will focus on 6 of those:
 # * [`Turing.jl`](https://github.com/TuringLang/Turing.jl)
@@ -54,13 +54,13 @@
 # [`ForwardDiff.jl`](https://github.com/JuliaDiff/ForwardDiff.jl) and [`ReverseDiff.jl`](https://github.com/JuliaDiff/ReverseDiff.jl).
 # The main goal of `DistributionsAD.jl` is to make the output of `logpdf` differentiable with respect to all continuous parameters
 # of a distribution as well as the random variable in the case of continuous distributions. This is the package that guarantees the
-# "automatical inference" part of the definition of a PPL.
+# "automatic inference" part of the definition of a PPL.
 # Finally, [`Bijectors.jl`](https://github.com/TuringLang/Bijectors.jl) implements a set of functions for transforming constrained
 # random variables (e.g. simplexes, intervals) to Euclidean space. Note that `Bijectors.jl` is still a work-in-progress and
 # in the future we'll have better implementation for more constraints, *e.g.* positive ordered vectors of random variables.
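+# To make this concrete, here is a minimal, purely illustrative sketch using `Bijectors.jl`'s
+# `link`/`invlink` helpers (it assumes `Bijectors.jl` and `Distributions.jl` are available at this
+# point; the Beta distribution and the value below are not part of this tutorial's model):
+
+using Bijectors, Distributions
+
+θ = 0.3                            # a value constrained to the interval (0, 1)
+y = Bijectors.link(Beta(2, 2), θ)  # mapped to the unconstrained real line (log-odds scale)
+Bijectors.invlink(Beta(2, 2), y)   # mapped back to (0, 1); recovers θ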
-# Most of the time we will not be dealing with neither of these packages directly, since `Turing.jl` will take care of the interfacing
+# Most of the time we will not be dealing with these packages directly, since `Turing.jl` will take care of the interfacing
 # for us. So let's talk about `Turing.jl`.
 # ## `Turing.jl`
@@ -89,6 +89,7 @@ using Plots, StatsPlots, Distributions, LaTeXStrings
 dice = DiscreteUniform(1, 6)
 plot(dice, label="six-sided Dice",
+    markershape=:circle,
     ms=5,
     xlabel=L"\theta",
     ylabel="Mass",
@@ -117,7 +118,7 @@ end;
 # Here we are using the [Dirichlet distribution](https://en.wikipedia.org/wiki/Dirichlet_distribution) which
 # is the multivariate generalization of the [Beta distribution](https://en.wikipedia.org/wiki/Beta_distribution).
-# The Dirichlet distribution is often used as the conjugate prior for Categorical or Multinomial distributions. Since our dice
+# The Dirichlet distribution is often used as the conjugate prior for Categorical or Multinomial distributions. Our dice
 # is modelled as a [Categorical distribution](https://en.wikipedia.org/wiki/Categorical_distribution)
 # with six possible results $y \in \{ 1, 2, 3, 4, 5, 6 \}$ with some probability vector
 # $\mathbf{p} = (p_1, \dots, p_6)$. Since all mutually exclusive outcomes must sum up to 1 to be a valid probability, we impose the constraint that
@@ -133,7 +134,7 @@ sum(mean(Dirichlet(6, 1)))
 # Also, since the outcome of a [Categorical distribution](https://en.wikipedia.org/wiki/Categorical_distribution) is an integer
 # and `y` is a $N$-dimensional vector of integers we need to apply some sort of broadcasting here.
-# `filldist()` is a nice Turing's function which takes any univariate or multivariate distribution and returns another distribution that repeats the input distribution.
+# `filldist()` is a nice Turing function which takes any univariate or multivariate distribution and returns another distribution that repeats the input distribution.
 # We could also use the familiar dot `.` broadcasting operator in Julia:
 # `y .~ Categorical(p)` to signal that all elements of `y` are distributed as a Categorical distribution.
 # But doing that does not allow us to do predictive checks (more on this below). So, instead we use `filldist()`.
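+# Just to make `filldist()` concrete, here is a minimal, purely illustrative sketch (the fixed
+# probability vector below is a stand-in and is not part of the model we are building):
+
+p_fair = fill(1 / 6, 6)                # a fair six-sided dice
+d = filldist(Categorical(p_fair), 10)  # a single distribution over 10 independent throws
+rand(d)                                # draws a length-10 vector of simulated throws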
@@ -156,7 +157,7 @@ first(data, 5)
 model = dice_throw(data);
-# Next, we call the Turing's `sample()` function that takes a Turing model as a first argument, along with a
+# Next, we call Turing's `sample()` function that takes a Turing model as a first argument, along with a
 # sampler as the second argument, and the third argument is the number of iterations. Here, I will use the `NUTS()` sampler from
 # `AdvancedHMC.jl` and 2,000 iterations. Please note that, as default, Turing samplers will discard the first half of iterations as
 # warmup. So the sampler will output 1,000 samples (`floor(2_000 / 2)`):
@@ -180,7 +181,7 @@ summaries
 sum(summaries[:, :mean])
-# In the future if you have some crazy huge models and you just want a **subset** of parameters from my chains?
+# What if in the future you have some crazy huge models and you just want a **subset** of parameters from your chains?
 # Just do `group(chain, :parameter)` or index with `chain[:, 1:6, :]`:
 summarystats(chain[:, 1:3, :])
@@ -220,8 +221,8 @@ savefig(joinpath(@OUTPUT, "chain.svg")); # hide
 # Predictive checks are a great way to **validate a model**.
 # The idea is to **generate data from the model** using **parameters from draws from the prior or posterior**.
-# **Prior predictive check** is when we simulate data using model's parameters values drawn fom the **prior** distribution,
-# and **posterior predictive check** is is when we simulate data using model's parameters values drawn fom the **posterior**
+# **Prior predictive check** is when we simulate data using model parameter values drawn from the **prior** distribution,
+# and **posterior predictive check** is when we simulate data using model parameter values drawn from the **posterior**
 # distribution.
 # The workflow we do when specifying and sampling Bayesian models is not linear or acyclic (Gelman et al., 2020). This means