@measure combinator #7

cscherrer · 2020-09-23T03:02:07Z

We already have a start on this, but I think we can do better. Something like

@measure <name> <absolute continuity relation> [scale] <base measure>

Maybe most confusing here are that

absolute continuity relation can be ≪ or ≃
scale is optional

For example,

@measure Normal ≃ (twoπ/sqrt2) Lebesgue

should generate (something like)

struct Normal{P, X} <: AbstractMeasure{X}
    par::P
end

function Normal(nt::NamedTuple)
    P = typeof(nt)
    return Normal{P, eltype(Normal{P})}
end
    
Normal(; kwargs...) = Normal((; kwargs...))
    
baseMeasure(μ::Normal{P, X}) where {P, X} = (twoπ/sqrt2)*Lebesgue(X)
            
:≪(::Normal{P, X}, ::Lebesgue{X}) where {P, X}) = true

# This method is only added if user specifies ≃
:≪(::Lebesgue{P, X}, ::Normal{X}) where {P, X}) = true

We'd then specify the logdensity as

function logdensity(d::Normal{P} , x::X) where {P <: NamedTuple{(:μ, :σ)}, X}    
    return - log(d.par.σ)  - (x - d.par.μ)^2 / (2 * d.par.σ^2)
end

# Standard normal

function logdensity(d::Normal{EmptyNamedTuple,X} , x::X) where {X}    
    return - x^2 / 2 
end

This definition (with no base measure) would be wrt the default base measure, here (twoπ/sqrt2)*Lebesgue(X).

We'll also have a method logdensity(μ::Measure, ν::Measure) that returns the Radon-Nikodym derivative as a function, assuming μ≪ν, and fails otherwise.

The text was updated successfully, but these errors were encountered:

cscherrer · 2020-09-23T13:19:10Z

cc: @mschauer

cscherrer · 2020-09-24T16:49:59Z

Here's where we are currently:

julia> (@macroexpand @measure Normal(μ,σ) ≃ Lebesgue{X}) |> MacroTools.prettify
quote
    struct Normal{P, X} <: AbstractMeasure{X}
        par::P
    end
    function Normal(nt::NamedTuple)
        P = typeof(nt)
        return Normal{P, eltype(Normal{P})}(nt)
    end
    Normal(; kwargs...) = Normal((; kwargs...))
    (baseMeasure(μ::Normal{P, X}) where {P, X}) = Lebesgue{X}
    Normal(μ, σ) = Normal(; μ, σ)
    ((:≪)(::Normal{P, X}, ::Lebesgue{X}) where {P, X}) = true
    ((:≪)(::Lebesgue{X}, ::Normal{P, X}) where {P, X}) = true
end

Let's step through this:

struct Normal{P, X} <: AbstractMeasure{X}
    par::P
end

This allows Normal to have different parameterizations. X is the return value. The supertype AbstractMeasure{X} is for convenience of adding new methods, and should not be restrictive.

function Normal(nt::NamedTuple)
    P = typeof(nt)
    return Normal{P, eltype(Normal{P})}(nt)
end

When a named tuple is given as an argument, the resulting type should include the type of that named tuple. Note that there should be a functional type dependency P -> X, so the user will need to provide the desired eltype method.

Normal(; kwargs...) = Normal((; kwargs...))

Passing keyword arguments dispatches to the appropriate named tuple.

(baseMeasure(μ::Normal{P, X}) where {P, X}) = Lebesgue{X}

Each measure should have a baseMeasure. Whenever possible, this should depend only on the type.

Normal(μ, σ) = Normal(; μ, σ)

This method corresponds to the default parameters given to the macro. As it is it will only work (I think) in Julia 1.5+. We need to change the right side of this to Normal(; μ=μ, σ=σ).

((:≪)(::Normal{P, X}, ::Lebesgue{X}) where {P, X}) = true
((:≪)(::Lebesgue{X}, ::Normal{P, X}) where {P, X}) = true

The first of these methods will always be specified, and just says that the measure is absolutely continuous wrt its base measure.

The second is only added because in the macro we specified ≃. If the new measure has a smaller support than its base measure we should instead pass ≪.

cscherrer · 2020-09-24T18:37:01Z

Hmm, there are some complications....

In long-running inner loops, we shouldn't have to mess with the constant. So I
think Normal is better defined as

Normal(μ,σ) ≃ (1/sqrt2π) * Lebesgue(X)

This creates a ScaledMeasure that will "know" it's ≃ Lebesgue(X). When we
need the scale, we would ask for the density wrt Lebesgue(X). So base
measures really serve two purposes:

To specify the measure with respect to which the density is defined
To provide a reference point for ≪

We could consider making separating these, so a base measure is for specifying the density, and a "primitive" measure is given that's equivalent (≃) to the base measure, but may be simpler (for example, Lebesgue(X) vs k*Lebesgue(X))

In general, say the user asks for the density of μ wrt ν, for arbitrary
measures. We need to quickly determine if there's a chain of ≪ relations
between the two. If so, the densities should also be defined (we'll need to have
this invariant).

This may be undecidable, or at least very tricky. So I suggest something like

Let a = baseMeasure(μ) and b = baseMeasure(ν).
We already know that μ ≪ a.
We need to check that a ≪ b. The number of base measures should be "small".
If no method is found, we... throw an informative error? I guess? Important
thing here is to recognize that it may be very common for methods to be missing,
so the user will need to be informed of this.
Finally, we'll need to check b ≪ ν. If this is true, it should be specified
when ν is defined.

So I think only (3) is the challenge.

femtomc · 2020-09-25T09:19:26Z

I’m having a tough time seeing how (3) can be determined automatically. Is the expectation that the user will “assert” (3) for measures they define? This seems consistent with what we’ve discussed.

The picture I’m getting here: we already know how to define distributions with reference to base measures, and we can methodically go through and fill out the AC relation - then offer an interface which allows the user to assert that their measure is AC.

mschauer · 2020-09-25T09:25:28Z

Yes, almost just as setting a fallback

(:≪)(a, b) = isfinite(kullbackleibler(a, b))

and hoping that people fill the method tables for (:≪)(a, b) and kullbackleibler(a, b)

phipsgabler · 2020-09-25T09:45:22Z

I have the feeling that this has been discussed somewhere before, but isn't

Normal(; kwargs...) = Normal((; kwargs...))

a bit problematic, in the sense that the parametrization is different for different orderings of the P named tuple? That might be not so obvious for someone using only keyword arguments, where you usually don't expect them having to be ordered.

Maybe there's a nice way to ensure a canonical ordering on NamedTuples?

simeonschaub · 2020-09-25T10:53:23Z

Maybe there's a nice way to ensure a canonical ordering on NamedTuples?

Perhaps something like this?

function canonicalize(nt::NamedTuple{_k,T}) where {_k,T}
    if @generated
        k = collect(_k)
        sp = sortperm(k)
        return quote
            sp = ($(sp...),)
            keys = ($(Meta.quot.(k[sp])...),)
            types = Tuple{$(T.parameters[sp]...)}
            NamedTuple{keys,types}(map(Base.Fix1(getfield, values(nt)), sp))
        end
    else
        sp = sortperm(collect(keys(nt)))
        return NamedTuple{keys(nt)[sp]}(values(nt)[sp])
    end
end

mschauer · 2020-09-25T12:11:53Z

The internal representation of the parameters as named tuple isn’t something measures.jl enforces or even strongly suggests. It is just one Way of doing it. Do I get this
right?

phipsgabler · 2020-09-25T12:22:52Z

I think it is only implicit, in the sense that the kwargs constructor which @measure generates uses it. But that doesn't have to be so. Maybe someone has already implemented an "unordered record" type that can be reused? It sounds like a repeating problem in libraries.

Although imposing any special type of course goes against the spirit to impose as little as possible.

cscherrer · 2020-09-25T12:44:29Z

@femtomc :

The picture I’m getting here: we already know how to define distributions with reference to base measures, and we can methodically go through and fill out the AC relation - then offer an interface which allows the user to assert that their measure is AC.

I think we're on the same page. One fine point is, I wouldn't suggest filling out the AC relation before it's needed. I assume the overhead of that would bog things down (but I might be wrong about that).

I was thinking the base-to-base approach, together with a way to let the user know what method is missing if it fails. And user could also add methods returning false. We could make ≪ a generated function, but let's try the simpler approach first.

@mschauer :

(:≪)(a, b) = isfinite(kullbackleibler(a, b))

I don't know about the direction here. To me ≪ seems more "primitive". KL might be more expensive to compute, and we may not even know how.

Oh, but... if they do know how to compute it for some type-constrained a and b, they could define this method. Or maybe there's a trait for types with a defined KL, or something like that.

@phipsgabler

Maybe there's a nice way to ensure a canonical ordering on NamedTuples?

Great point!

@simeonschaub

Perhaps something like this?

This looks great! I've mostly swept this issue under the rug out of concern for the overhead of sorting each time. But making the sortperm a generated function is a clever way around that. I'd maybe call it sortkeys, that's clearer to me. This could be a great addition to NamedTupleTools.jl.

@mschauer

The internal representation of the parameters as named tuple isn’t something measures.jl enforces or even strongly suggests. It is just one Way of doing it. Do I get this right?

That's right, though I expect it would be a very common one (hence the convenience macro). But yes, users can parameterize by and P. You could even do @measure Normal(μ,σ) ≃ (1/sqrt2π) * Lebesgue(X) and then add a method for parameterization P <: AbstractArray, where [a,b] maps to [μ,exp(σ)], and use that directly for HMC.

cscherrer · 2020-09-25T13:29:56Z

@phipsgabler @simeonschaub I just remembered @simonbyrne had mentioned something similar, just found his KeywordDispatch.jl. We should see how this compares before committing to an approach.

phipsgabler · 2020-09-25T13:37:03Z

I think the ideas are rather similar :D

cscherrer · 2020-09-26T15:14:24Z

Closing in favor of #9

cscherrer mentioned this issue Sep 26, 2020

Canonical ordering for NamedTuple keys #12

Open

cscherrer closed this as completed Sep 26, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

@measure combinator #7

@measure combinator #7

cscherrer commented Sep 23, 2020 •

edited

Loading

cscherrer commented Sep 23, 2020

cscherrer commented Sep 24, 2020

cscherrer commented Sep 24, 2020

femtomc commented Sep 25, 2020 •

edited

Loading

mschauer commented Sep 25, 2020 •

edited

Loading

phipsgabler commented Sep 25, 2020

simeonschaub commented Sep 25, 2020 •

edited

Loading

mschauer commented Sep 25, 2020

phipsgabler commented Sep 25, 2020 •

edited

Loading

cscherrer commented Sep 25, 2020

cscherrer commented Sep 25, 2020

phipsgabler commented Sep 25, 2020

cscherrer commented Sep 26, 2020

@measure combinator #7

@measure combinator #7

Comments

cscherrer commented Sep 23, 2020 • edited Loading

cscherrer commented Sep 23, 2020

cscherrer commented Sep 24, 2020

cscherrer commented Sep 24, 2020

femtomc commented Sep 25, 2020 • edited Loading

mschauer commented Sep 25, 2020 • edited Loading

phipsgabler commented Sep 25, 2020

simeonschaub commented Sep 25, 2020 • edited Loading

mschauer commented Sep 25, 2020

phipsgabler commented Sep 25, 2020 • edited Loading

cscherrer commented Sep 25, 2020

cscherrer commented Sep 25, 2020

phipsgabler commented Sep 25, 2020

cscherrer commented Sep 26, 2020

cscherrer commented Sep 23, 2020 •

edited

Loading

femtomc commented Sep 25, 2020 •

edited

Loading

mschauer commented Sep 25, 2020 •

edited

Loading

simeonschaub commented Sep 25, 2020 •

edited

Loading

phipsgabler commented Sep 25, 2020 •

edited

Loading