Introduce ArrayLikeVariate as common supertype for Univariate, Multivariate and Matrixvariate #1313

oschulz · 2021-04-24T20:28:54Z

When writing generic code that deals with Univariate, Multivariate and Matrixvariate distributions, I often find myself having to code separate methods for each variate type. Having a common supertype NVariate{N} should make generic coding for distributions a bit easier.

I propose to introduce Union{Univariate,Multivariate,Matrixvariate} <: NVariate <: VariateForm here, instead of changing VariateForm to VariateForm{N}, to continue leaving room for custom non-scalar/matrix-like variate type (representated by other subtypes of VariateForm). ValueShapes.jl, for example, supports distributions with NamedTuple variates.

codecov-commenter · 2021-04-24T20:41:50Z

Codecov Report

Merging #1313 (891111f) into master (2af8703) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #1313   +/-   ##
=======================================
  Coverage   81.51%   81.51%           
=======================================
  Files         115      115           
  Lines        6626     6626           
=======================================
  Hits         5401     5401           
  Misses       1225     1225

Impacted Files	Coverage Δ
src/common.jl	`67.56% <ø> (ø)`

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2af8703...891111f. Read the comment docs.

azev77 · 2021-04-25T03:38:59Z

so in principle, can this accommodate random tensors?

oschulz · 2021-04-25T07:53:57Z

so in principle, can this accommodate random tensors?

Yes, that would be a logical next step. I'm prototyping something here, but not all tests pass yes. That would generalize ArrayDistribution and MatrixDistribution and MatrixReshaped to ArrayReshaped. I recently encountered some parameter estimation problems where distributions with higher-dimensional variates would come in very handy.

I thought it would be best to generalize Univariate, Multivariate and Matrixvariate first in a separate PR, though, since it touches far less code, the second step may require more thought/discussion.

devmotion · 2021-04-25T08:09:37Z

I think the proposal is reasonable, and it would even be non-breaking since it just introduces a new supertype in the type hierarchy 🙂 I wonder though if a more descriptive name such as e.g. ArrayVariate or ArrayLikeVariate should be used?

oschulz · 2021-04-25T08:35:17Z

it would even be non-breaking

Yes, also with the follow up step (generalizing ArrayDistribution to MatrixDistribution and MatrixReshaped to ArrayReshaped) I'd make sure it's non breaking.

I wonder though if a more descriptive name such as e.g. ArrayVariate or ArrayLikeVariate

Yes, I thought about ArrayVariate or something like that, too. ArrayVariate might not be a good choice, I think we definitely should cover Univariate under this umbrella, too. I thought about TensorVariate, but mathematicians might point out that the variates wouldn't necessarily be true tensors. I have no strong opinion on the name here, though, esp. if the Distributions core developers have a suggestion/preference.

oschulz · 2021-04-25T08:37:44Z

ArrayLikeVariate might be nice - MatrixDistribution could be generalized as ArrayLikeDistribution in the next step, then.

oschulz · 2021-04-25T08:41:48Z

Should we define something like const ArrayLikeValue{T<:Number} = Union{T,AbstractArray{T}} in the bargain? Non-exported, since it has high naming conflict potential?

devmotion · 2021-04-25T09:06:36Z

Should we define something like const ArrayLikeValue{T<:Number} = Union{T,AbstractArray{T}} in the bargain?

It is not clear to me when and how this would be useful. Do you have an example where it would be helpful?

oschulz · 2021-04-25T10:52:55Z

I just thought it would be elegant to have a common type for the variate values as well as the variates and distributions. But I guess that's something that would, if useful, emerge naturally during the second step (higher-dimensional variates).

How would you like me to proceed regarding NVariate vs. ArrayLikeVariate vs. possible other names?

devmotion · 2021-04-25T12:21:03Z

I'd prefer ArrayLikeVariate but maybe @andreasnoack or @matbesancon have different opinions or suggestions?

oschulz · 2021-04-30T10:00:33Z

Should I just go ahead with ArrayLikeVariate?

src/common.jl

mschauer · 2021-04-30T10:25:58Z

I'd suggest ShapedVariate{N}, as "shape" (queried by size) also works for numbers. Compare

julia> size(1)
()

julia> size([1,1])
(2,)

julia> Base.IteratorSize(1)
Base.HasShape{0}()

julia> Base.IteratorSize([1,1])
Base.HasShape{1}()

src/common.jl

mschauer · 2021-04-30T10:28:34Z

src/common.jl

-struct Matrixvariate <: VariateForm end
+
+"""
+`F <: NVariate{N}` specifies the form of the variate a sample for


Suggested change

`F <: NVariate{N}` specifies the form of the variate a sample for

`F <: NVariate{N}` specifies the `size` of the variates or samples for

How about "specifies the dimemsionality of the size ..."?

devmotion · 2021-04-30T10:47:09Z

On the other hand, Base.HasShape covers much more than just numbers and arrays. But maybe the intention was that NVariate covers these cases as well?

julia> Base.IteratorSize((rand() for _ in 1:3))
Base.HasShape{1}()

mschauer · 2021-04-30T10:58:41Z

That would be the right direction. Note that rand(MvNormal(I(3)), 4) is not a vector of vectors, so we already take the freedom of delivering the 3 elements of each sample in a convenient form and not as Vector

julia> rand(MvNormal(I(3)), 4)
3×4 Matrix{Float64}:
 -0.29361   -1.67339   0.0517635  -1.45977
  0.748131   0.982941  0.219542    0.651414
  0.53585    1.30659   0.321386   -2.03675

devmotion · 2021-04-30T11:11:00Z

Note that rand(MvNormal(I(3)), 4) is not a vector of vectors, so we already take the freedom of delivering the 3 elements of each sample in a convenient form and not as Vector

I don't think the behaviour of rand is relevant for this PR, is it? To me it seems this PR is only concerned with a new supertype for univariate, multi- and matrixvariate distributions and an appropriate name based on the structure of a single sample. With the iterator example, I just wanted to highlight that ShapedVariate{1} might indicate that it would also cover distributions where a single sample is an iterator - and I was not sure if this is the intention of the PR.

On a side note (I really think one should discuss this somewhere else): IMO it is a bit unfortunate that one uses this convention (not only in Distributions) instead of properly indicating the size and orientation of the samples with efficient AbstractVector wrappers such as ColVecs or RowVecs.

mschauer · 2021-04-30T11:16:00Z

What is a single sample if not what rand returns (in collected form) and what say logpdf accepts as input?

devmotion · 2021-04-30T11:26:49Z

Sure, a single sample is what rand(dist) returns. I'm confused now since I didn't intend to argue with this in my comment 😄 The main point was just: I think the shape of rand(MvNormal(I(3)), 4) does not matter for this PR and also not for the example

julia> Base.IteratorSize((rand() for _ in 1:3))
Base.HasShape{1}()

above. This example just tried to show that one might assume that ShapedVariate{1} does not only cover distributions for which rand(dist) isa Vector{Float64} but e.g. also distributions for which rand(dist) isa Base.Generator{<:Base.Iterators.ProductIterator{Tuple{UnitRange{Int64}, UnitRange{Int64}}}} (since the example is of such a type and has shape 1).

oschulz · 2021-04-30T11:50:25Z

I'd suggest ShapedVariate

I think ShapedVariate is too generic. Values can have very different shapes than just scalars and arrays (I have a whole ValueShapes.jl dedicated to this, with support for NamedTuple-shaped distributions, etc.).

oschulz · 2021-04-30T11:54:05Z

On the other hand, Base.HasShape covers much more than just numbers and arrays. But maybe the intention was that NVariate covers these cases as well?

No, this whole thing (call it NVariate or ArrayLikeVariate) was intended to cover "tensor-shaped" (scalar, matrix, n-dim array) variates. VariateForm would then be the supertype that can also cover variates that have different shapes than scalars and arrays - and where there will often not be an N.

oschulz · 2021-04-30T12:07:54Z

Note that rand(MvNormal(I(3)), 4) is not a vector of vectors, so we already take the freedom of delivering the 3 elements of each sample in a convenient form and not as Vector

In my opinion, the variates are still vectors, though:

It's indeed unfortunate that rand returns a flat matrix instead of a vector-of-vectors (or vector-of-vectors view of a matrix, like ArraysOfArrays.VectorOfSimilarVectors provides). I think this is just because it's an old convention, dating back to early Julia days, when views didn't use to be as cheap as they are now and there was no broadcast, etc.. After all, for MatrixVariate dists, rand does return a vectors of matrices. I would view this as a technical exception we do in the output rand(d::MultivariateDistribution, n), not as a semantic issue.

It's basically impossible to deprecate the behavior of rand(d::MultivariateDistribution, n), of course - but maybe at some point we can introduce a variant like rand(d, Repeat(n, ...)) that will always return an Array of the variate type of d (ideally backed by a flat array, ArraysOfArray.jl-style? Repeat is probably not the right choice, naming wise, of course.

oschulz · 2021-05-02T14:17:07Z

Ok, I've renamed it to ArrayLikeVariate, as suggested by @devmotion .

oschulz · 2021-05-12T12:48:46Z

Gentle bump, @devmotion do you think this can go ahead like it is now?

devmotion · 2021-05-12T12:51:30Z

I'm fine with it as it is, I would just prefer if some other maintainer would approve it as well.

devmotion

Maybe one could also mention ArrayVariate in the documentation.

Serves a common supertype for Univariate, Multivariate and Matrixvariate to ease writing generic code that deals with distributions with scalar/array-like variates.

Co-authored-by: Moritz Schauer <[email protected]>

oschulz · 2021-05-12T17:37:53Z

Maybe one could also mention ArrayVariate in the documentation.

Oh, good point, just added that.

oschulz · 2021-05-12T17:39:58Z

I would just prefer if some other maintainer would approve it as well.

Sure, I understand, esp. in this case.

Another maintainer, pretty please? :-)

devmotion · 2021-05-12T17:58:56Z

I requested some additional reviews, so hopefully someone will comment on the PR 🙂

matbesancon · 2021-05-12T19:09:40Z

there is one comment left by @mschauer, otherwise looks good

Co-authored-by: Moritz Schauer <[email protected]>

oschulz · 2021-05-13T06:31:01Z

there is one comment left by @mschauer, otherwise looks good

Sorry, overlooked that one, just committed that suggestion.

oschulz · 2021-05-13T10:03:02Z

Thanks!

I would go ahead then generalizing MatrixDistribution to ArrayDistribution, MatrixReshaped to ArrayReshaped, and so on (backward-compatible, keeping MatrixDistribution and MatrixReshaped as typedefs), if there are no objections?

I have an early draft here (doesn't pass tests yet): master...oschulz:array-distributions

mschauer · 2021-05-13T10:14:28Z

Why would we need ArrayDistribution if we have ArrayLikeDistribution? Brings back the same issue: why should the type of the distribution care about the container of the samples? All we have to state is the type of the elements and the dimension of the result and then we can fill it in any container we want and reversely we can compute logpdf(::ArrayLike, x) as long as x supports getindex. Or do we want logpdf(D::VectorDistribution, ::StaticVector) return -Inf because "a static vector is not a vector and the support of D is only vectors!!" . That doesn't make sense to me.

devmotion · 2021-05-13T10:26:50Z

This suggestion is about the element types and not the containers, as far as I understand. It only brings more structure to the current system of MultivariateDistributions and MatrixDistributions etc. I imagine that this makes it e.g. easier to generalize Product to arrays of higher dimensions Currently DistributionsAD contains some support for MatrixDistributions that are a product distribution of a matrix of univariate distributions but IMO this should just be supported by Product, and sometimes you might even want a product distribution whose samples are 3D arrays.

mschauer · 2021-05-13T10:58:30Z

But that is my point: If you don't want code duplication between a product of univariate and a multivariate distribution you need to avoid to bake difference between those things into the abstract super types. Now we are talking about baking difference between Array and ArrayLike into the Distribution hierarchy. My question is: what is this difference good for (except that people have to define const ArrayorArrayLike = Union{Array,ArrayLike})?

devmotion · 2021-05-13T11:04:25Z

Maybe I misunderstand the proposal but I don't think any of this is part of the suggestion above. To me it is mainly a more convenient alias that allows you to make use of these more general VariateForms (which e.g. can be useful for more general Products):

const ArrayDistribution{S<:ValueSupport,N}      = Distribution{ArrayLikeVariate{N},S}

There is nothing conceptually new here, it's just more convenient and more general than the aliases MatrixDistribution, MultivariateDistribution etc.

mschauer · 2021-05-13T11:32:40Z

That is be a misinterpretation on my side, I thought the plan was to introduce ArrayDistribution{S<:ValueSupport,N} = Distribution{ArrayVariate{N},S} with ArrayVariate instead of ArrayLikeVariate.

oschulz · 2021-05-13T11:33:00Z

@mschauer: Why would we need ArrayDistribution if we have ArrayLikeDistribution

Uh, we don't have an ArrayLikeDistribution currenty, right?

@mschauer: why should the type of the distribution care about the container of the samples?

I assume you mean the container for multiple samples/variates? The type of the distribution should definitely convey information about the type of a single sample/variate, though (i.e. the return type of rand(dist)), right? So rand(dist) should return, and pdf(dist, x) should accept a (subtype of) Real for Distribution{ArrayLikeVariate{0}}, and an AbstractArray{<:Real,1} for Distribution{ArrayLikeVariate{1}}, and an AbstractArray{<:Real,2} for Distribution{ArrayLikeVariate{2}}, and so on, right?

@devmotion: Maybe I misunderstand the proposal but I don't think any of this is part of the suggestion above. To me it is mainly a more convenient alias

Indeed - we currently have MatrixDistribution as a convencience alias for Distribution{MatrixVariate}, so I thought it would be natural to add a typedef ArrayDistribution as a convenience alias for Distribution{ArrayLikeVariate}. Maybe ArrayLikeDistribution would indeed be a better name, though (I drafted this before we renamed NVariate to ArrayLikeVariate).

@mschauer: If you don't want code duplication between a product of univariate and a multivariate distribution you need to avoid to bake difference between those things into the abstract super types. Now we are talking about baking difference between Array and ArrayLike into the Distribution hierarchy

Oh, no - sorry if I have been unclear, I don't want that at all! A product of univariate and a multivariate distribution should of course have the same abstract supertype. Which would be MultivariateDistribution, which would be a convenience alias for ArrayDistribution{1}(alternatively ArrayLikeDistribution{1}), which would be a convenience alias for Distribution{ArrayLikeVariate{1}}.

@mschauer: I thought the plan was to introduce ArrayDistribution{S<:ValueSupport,N} = Distribution{ArrayVariate{N},S} with ArrayVariate instead of ArrayLikeVariate

No, no, I definitely didn't want to introduce a separate ArrayVariate.

oschulz · 2021-05-13T11:38:30Z

I imagine that this makes it e.g. easier to generalize Product to arrays of higher dimensions

Oh, yes, it would be nice to be able to make a product of multivariate distributions, resulting in a matrixvariate distribution, and so on.

mschauer · 2021-05-13T11:48:21Z

Just a bit about my perspective, I think we agree all. I think we need to avoid repeating the same mistakes:

We forced Distributions to be Continuous or Discrete and suddenly couldn’t have a mixture between a Dirac spike and a continuous slab. Still some issues left, what should be the super type of Normal(0.0, 0.0) etc.?  We also cannot give an appropriate pdf for a spike and slab.
  We forced Distributions to be both Discrete and Integer valued at the same time and couldn’t have discrete distributions supported on sets of floats. 
We forced Distributions to be Univariate or Multivariate and can’t use MvNormal for uncertainty quantification of both numbers and vectors.  
We made all probabilities, densities Float64 and suffered from that because automatic differentiation came up.
 We can also not easily have random combinations, random trees, random trajectories, and plenty of other random things without breaking our own assumptions.

devmotion · 2021-05-13T12:11:23Z

I agree, there are many things that can be improved in Distributions and to a large extent problems are caused by design choices that turned out to be problematic later on (to be honest, some of these choices such as Float64 were probably also caused by limitations of Julia and the state of AD at the time when they were made). However, here it doesn't seem to me that there are any design choices to be made since no new concepts or structures are introduced. In general, I think the best way to improve Distributions is to make many small improvements, and this seems to be an improvement to me, even though it does not fix the more general problems.

  We forced Distributions to be both Discrete and Integer valued at the same time and couldn’t have discrete distributions supported on sets of floats

This is not the case anymore, e.g., Dirac and DiscreteNonParametric support sets of floats (but maybe that's why you wrote "couldn't have" instead of "can't have"?).

oschulz · 2021-05-13T22:49:17Z

Yes, that's what I think as well.

matbesancon requested a review from mschauer April 25, 2021 12:29

mschauer reviewed Apr 30, 2021

View reviewed changes

src/common.jl Outdated Show resolved Hide resolved

mschauer reviewed Apr 30, 2021

View reviewed changes

src/common.jl Outdated Show resolved Hide resolved

mschauer reviewed Apr 30, 2021

View reviewed changes

oschulz force-pushed the nvariate branch from 8b5aa1b to 3f03751 Compare May 2, 2021 13:49

oschulz mentioned this pull request May 2, 2021

Constant Distribution #1098

Closed

oschulz changed the title ~~Introduce NVariate as common supertype for Univariate, Multivariate and Matrixvariate~~ Introduce ArrayLikeVariate as common supertype for Univariate, Multivariate and Matrixvariate May 2, 2021

devmotion requested review from matbesancon and andreasnoack May 12, 2021 12:51

devmotion approved these changes May 12, 2021

View reviewed changes

oschulz and others added 4 commits May 12, 2021 19:31

Introduce NVariate

ed0f572

Serves a common supertype for Univariate, Multivariate and Matrixvariate to ease writing generic code that deals with distributions with scalar/array-like variates.

Fix whitespace in definition of NVariate

70e8605

Co-authored-by: Moritz Schauer <[email protected]>

Rename NVariate to ArrayLikeVariate

6c6785e

Refer to ArrayLikeVariate in docs

be6e2df

oschulz force-pushed the nvariate branch from 3f03751 to be6e2df Compare May 12, 2021 17:37

Update VariateForm docstring

891111f

Co-authored-by: Moritz Schauer <[email protected]>

matbesancon approved these changes May 13, 2021

View reviewed changes

devmotion merged commit b35f091 into JuliaStats:master May 13, 2021

oschulz deleted the nvariate branch May 13, 2021 09:47

	`F <: NVariate{N}` specifies the form of the variate a sample for
	`F <: NVariate{N}` specifies the `size` of the variates or samples for

Introduce ArrayLikeVariate as common supertype for Univariate, Multivariate and Matrixvariate #1313

Introduce ArrayLikeVariate as common supertype for Univariate, Multivariate and Matrixvariate #1313

Conversation

oschulz commented Apr 24, 2021

codecov-commenter commented Apr 24, 2021 • edited Loading

Codecov Report

azev77 commented Apr 25, 2021

oschulz commented Apr 25, 2021

devmotion commented Apr 25, 2021

oschulz commented Apr 25, 2021 • edited Loading

oschulz commented Apr 25, 2021

oschulz commented Apr 25, 2021

devmotion commented Apr 25, 2021 • edited Loading

oschulz commented Apr 25, 2021

devmotion commented Apr 25, 2021

oschulz commented Apr 30, 2021

mschauer commented Apr 30, 2021

mschauer Apr 30, 2021 • edited Loading

Choose a reason for hiding this comment

oschulz Apr 30, 2021

Choose a reason for hiding this comment

devmotion commented Apr 30, 2021

mschauer commented Apr 30, 2021

devmotion commented Apr 30, 2021

mschauer commented Apr 30, 2021

devmotion commented Apr 30, 2021

oschulz commented Apr 30, 2021

oschulz commented Apr 30, 2021

oschulz commented Apr 30, 2021

oschulz commented May 2, 2021

oschulz commented May 12, 2021

devmotion commented May 12, 2021

devmotion left a comment

Choose a reason for hiding this comment

oschulz commented May 12, 2021

oschulz commented May 12, 2021

devmotion commented May 12, 2021

matbesancon commented May 12, 2021

oschulz commented May 13, 2021

oschulz commented May 13, 2021

mschauer commented May 13, 2021 • edited Loading

devmotion commented May 13, 2021

mschauer commented May 13, 2021

devmotion commented May 13, 2021

mschauer commented May 13, 2021

oschulz commented May 13, 2021 • edited Loading

oschulz commented May 13, 2021

mschauer commented May 13, 2021

devmotion commented May 13, 2021

oschulz commented May 13, 2021

codecov-commenter commented Apr 24, 2021 •

edited

Loading

oschulz commented Apr 25, 2021 •

edited

Loading

devmotion commented Apr 25, 2021 •

edited

Loading

mschauer Apr 30, 2021 •

edited

Loading

mschauer commented May 13, 2021 •

edited

Loading

oschulz commented May 13, 2021 •

edited

Loading