Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce ArrayLikeVariate as common supertype for Univariate, Multivariate and Matrixvariate #1313

Merged
merged 5 commits into from
May 13, 2021

Conversation

oschulz
Copy link
Contributor

@oschulz oschulz commented Apr 24, 2021

When writing generic code that deals with Univariate, Multivariate and Matrixvariate distributions, I often find myself having to code separate methods for each variate type. Having a common supertype NVariate{N} should make generic coding for distributions a bit easier.

I propose to introduce Union{Univariate,Multivariate,Matrixvariate} <: NVariate <: VariateForm here, instead of changing VariateForm to VariateForm{N}, to continue leaving room for custom non-scalar/matrix-like variate type (representated by other subtypes of VariateForm). ValueShapes.jl, for example, supports distributions with NamedTuple variates.

@codecov-commenter
Copy link

codecov-commenter commented Apr 24, 2021

Codecov Report

Merging #1313 (891111f) into master (2af8703) will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #1313   +/-   ##
=======================================
  Coverage   81.51%   81.51%           
=======================================
  Files         115      115           
  Lines        6626     6626           
=======================================
  Hits         5401     5401           
  Misses       1225     1225           
Impacted Files Coverage Δ
src/common.jl 67.56% <ø> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2af8703...891111f. Read the comment docs.

@azev77
Copy link
Contributor

azev77 commented Apr 25, 2021

so in principle, can this accommodate random tensors?

@oschulz
Copy link
Contributor Author

oschulz commented Apr 25, 2021

so in principle, can this accommodate random tensors?

Yes, that would be a logical next step. I'm prototyping something here, but not all tests pass yes. That would generalize ArrayDistribution and MatrixDistribution and MatrixReshaped to ArrayReshaped. I recently encountered some parameter estimation problems where distributions with higher-dimensional variates would come in very handy.

I thought it would be best to generalize Univariate, Multivariate and Matrixvariate first in a separate PR, though, since it touches far less code, the second step may require more thought/discussion.

@devmotion
Copy link
Member

I think the proposal is reasonable, and it would even be non-breaking since it just introduces a new supertype in the type hierarchy 🙂 I wonder though if a more descriptive name such as e.g. ArrayVariate or ArrayLikeVariate should be used?

@oschulz
Copy link
Contributor Author

oschulz commented Apr 25, 2021

it would even be non-breaking

Yes, also with the follow up step (generalizing ArrayDistribution to MatrixDistribution and MatrixReshaped to ArrayReshaped) I'd make sure it's non breaking.

I wonder though if a more descriptive name such as e.g. ArrayVariate or ArrayLikeVariate

Yes, I thought about ArrayVariate or something like that, too. ArrayVariate might not be a good choice, I think we definitely should cover Univariate under this umbrella, too. I thought about TensorVariate, but mathematicians might point out that the variates wouldn't necessarily be true tensors. I have no strong opinion on the name here, though, esp. if the Distributions core developers have a suggestion/preference.

@oschulz
Copy link
Contributor Author

oschulz commented Apr 25, 2021

ArrayLikeVariate might be nice - MatrixDistribution could be generalized as ArrayLikeDistribution in the next step, then.

@oschulz
Copy link
Contributor Author

oschulz commented Apr 25, 2021

Should we define something like const ArrayLikeValue{T<:Number} = Union{T,AbstractArray{T}} in the bargain? Non-exported, since it has high naming conflict potential?

@devmotion
Copy link
Member

devmotion commented Apr 25, 2021

Should we define something like const ArrayLikeValue{T<:Number} = Union{T,AbstractArray{T}} in the bargain?

It is not clear to me when and how this would be useful. Do you have an example where it would be helpful?

@oschulz
Copy link
Contributor Author

oschulz commented Apr 25, 2021

I just thought it would be elegant to have a common type for the variate values as well as the variates and distributions. But I guess that's something that would, if useful, emerge naturally during the second step (higher-dimensional variates).

How would you like me to proceed regarding NVariate vs. ArrayLikeVariate vs. possible other names?

@devmotion
Copy link
Member

I'd prefer ArrayLikeVariate but maybe @andreasnoack or @matbesancon have different opinions or suggestions?

@matbesancon matbesancon requested a review from mschauer April 25, 2021 12:29
@oschulz
Copy link
Contributor Author

oschulz commented Apr 30, 2021

Should I just go ahead with ArrayLikeVariate?

@mschauer
Copy link
Member

I'd suggest ShapedVariate{N}, as "shape" (queried by size) also works for numbers. Compare

julia> size(1)
()

julia> size([1,1])
(2,)

julia> Base.IteratorSize(1)
Base.HasShape{0}()

julia> Base.IteratorSize([1,1])
Base.HasShape{1}()

src/common.jl Outdated
struct Matrixvariate <: VariateForm end

"""
`F <: NVariate{N}` specifies the form of the variate a sample for
Copy link
Member

@mschauer mschauer Apr 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
`F <: NVariate{N}` specifies the form of the variate a sample for
`F <: NVariate{N}` specifies the `size` of the variates or samples for

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about "specifies the dimemsionality of the size ..."?

@devmotion
Copy link
Member

On the other hand, Base.HasShape covers much more than just numbers and arrays. But maybe the intention was that NVariate covers these cases as well?

julia> Base.IteratorSize((rand() for _ in 1:3))
Base.HasShape{1}()

@mschauer
Copy link
Member

That would be the right direction. Note that rand(MvNormal(I(3)), 4) is not a vector of vectors, so we already take the freedom of delivering the 3 elements of each sample in a convenient form and not as Vector

julia> rand(MvNormal(I(3)), 4)
3×4 Matrix{Float64}:
 -0.29361   -1.67339   0.0517635  -1.45977
  0.748131   0.982941  0.219542    0.651414
  0.53585    1.30659   0.321386   -2.03675

@devmotion
Copy link
Member

Note that rand(MvNormal(I(3)), 4) is not a vector of vectors, so we already take the freedom of delivering the 3 elements of each sample in a convenient form and not as Vector

I don't think the behaviour of rand is relevant for this PR, is it? To me it seems this PR is only concerned with a new supertype for univariate, multi- and matrixvariate distributions and an appropriate name based on the structure of a single sample. With the iterator example, I just wanted to highlight that ShapedVariate{1} might indicate that it would also cover distributions where a single sample is an iterator - and I was not sure if this is the intention of the PR.

On a side note (I really think one should discuss this somewhere else): IMO it is a bit unfortunate that one uses this convention (not only in Distributions) instead of properly indicating the size and orientation of the samples with efficient AbstractVector wrappers such as ColVecs or RowVecs.

@mschauer
Copy link
Member

What is a single sample if not what rand returns (in collected form) and what say logpdf accepts as input?

@devmotion
Copy link
Member

Sure, a single sample is what rand(dist) returns. I'm confused now since I didn't intend to argue with this in my comment 😄 The main point was just: I think the shape of rand(MvNormal(I(3)), 4) does not matter for this PR and also not for the example

julia> Base.IteratorSize((rand() for _ in 1:3))
Base.HasShape{1}()

above. This example just tried to show that one might assume that ShapedVariate{1} does not only cover distributions for which rand(dist) isa Vector{Float64} but e.g. also distributions for which rand(dist) isa Base.Generator{<:Base.Iterators.ProductIterator{Tuple{UnitRange{Int64}, UnitRange{Int64}}}} (since the example is of such a type and has shape 1).

@oschulz
Copy link
Contributor Author

oschulz commented Apr 30, 2021

I'd suggest ShapedVariate

I think ShapedVariate is too generic. Values can have very different shapes than just scalars and arrays (I have a whole ValueShapes.jl dedicated to this, with support for NamedTuple-shaped distributions, etc.).

@oschulz
Copy link
Contributor Author

oschulz commented Apr 30, 2021

On the other hand, Base.HasShape covers much more than just numbers and arrays. But maybe the intention was that NVariate covers these cases as well?

No, this whole thing (call it NVariate or ArrayLikeVariate) was intended to cover "tensor-shaped" (scalar, matrix, n-dim array) variates. VariateForm would then be the supertype that can also cover variates that have different shapes than scalars and arrays - and where there will often not be an N.

@oschulz
Copy link
Contributor Author

oschulz commented Apr 30, 2021

Note that rand(MvNormal(I(3)), 4) is not a vector of vectors, so we already take the freedom of delivering the 3 elements of each sample in a convenient form and not as Vector

In my opinion, the variates are still vectors, though:

It's indeed unfortunate that rand returns a flat matrix instead of a vector-of-vectors (or vector-of-vectors view of a matrix, like ArraysOfArrays.VectorOfSimilarVectors provides). I think this is just because it's an old convention, dating back to early Julia days, when views didn't use to be as cheap as they are now and there was no broadcast, etc.. After all, for MatrixVariate dists, rand does return a vectors of matrices. I would view this as a technical exception we do in the output rand(d::MultivariateDistribution, n), not as a semantic issue.

It's basically impossible to deprecate the behavior of rand(d::MultivariateDistribution, n), of course - but maybe at some point we can introduce a variant like rand(d, Repeat(n, ...)) that will always return an Array of the variate type of d (ideally backed by a flat array, ArraysOfArray.jl-style? Repeat is probably not the right choice, naming wise, of course.

@oschulz oschulz mentioned this pull request May 2, 2021
@oschulz oschulz changed the title Introduce NVariate as common supertype for Univariate, Multivariate and Matrixvariate Introduce ArrayLikeVariate as common supertype for Univariate, Multivariate and Matrixvariate May 2, 2021
@oschulz
Copy link
Contributor Author

oschulz commented May 2, 2021

Ok, I've renamed it to ArrayLikeVariate, as suggested by @devmotion .

@oschulz
Copy link
Contributor Author

oschulz commented May 12, 2021

Gentle bump, @devmotion do you think this can go ahead like it is now?

@devmotion
Copy link
Member

I'm fine with it as it is, I would just prefer if some other maintainer would approve it as well.

Copy link
Member

@devmotion devmotion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe one could also mention ArrayVariate in the documentation.

oschulz and others added 4 commits May 12, 2021 19:31
Serves a common supertype for Univariate, Multivariate and
Matrixvariate to ease writing generic code that deals with
distributions with scalar/array-like variates.
@oschulz
Copy link
Contributor Author

oschulz commented May 12, 2021

Maybe one could also mention ArrayVariate in the documentation.

Oh, good point, just added that.

@oschulz
Copy link
Contributor Author

oschulz commented May 12, 2021

I would just prefer if some other maintainer would approve it as well.

Sure, I understand, esp. in this case.

Another maintainer, pretty please? :-)

@devmotion
Copy link
Member

I requested some additional reviews, so hopefully someone will comment on the PR 🙂

@matbesancon
Copy link
Member

there is one comment left by @mschauer, otherwise looks good

Co-authored-by: Moritz Schauer <[email protected]>
@oschulz
Copy link
Contributor Author

oschulz commented May 13, 2021

there is one comment left by @mschauer, otherwise looks good

Sorry, overlooked that one, just committed that suggestion.

@devmotion devmotion merged commit b35f091 into JuliaStats:master May 13, 2021
@oschulz oschulz deleted the nvariate branch May 13, 2021 09:47
@oschulz
Copy link
Contributor Author

oschulz commented May 13, 2021

Thanks!

I would go ahead then generalizing MatrixDistribution to ArrayDistribution, MatrixReshaped to ArrayReshaped, and so on (backward-compatible, keeping MatrixDistribution and MatrixReshaped as typedefs), if there are no objections?

I have an early draft here (doesn't pass tests yet): master...oschulz:array-distributions

@mschauer
Copy link
Member

mschauer commented May 13, 2021

Why would we need ArrayDistribution if we have ArrayLikeDistribution? Brings back the same issue: why should the type of the distribution care about the container of the samples? All we have to state is the type of the elements and the dimension of the result and then we can fill it in any container we want and reversely we can compute logpdf(::ArrayLike, x) as long as x supports getindex. Or do we want logpdf(D::VectorDistribution, ::StaticVector) return -Inf because "a static vector is not a vector and the support of D is only vectors!!" . That doesn't make sense to me.

@devmotion
Copy link
Member

This suggestion is about the element types and not the containers, as far as I understand. It only brings more structure to the current system of MultivariateDistributions and MatrixDistributions etc. I imagine that this makes it e.g. easier to generalize Product to arrays of higher dimensions Currently DistributionsAD contains some support for MatrixDistributions that are a product distribution of a matrix of univariate distributions but IMO this should just be supported by Product, and sometimes you might even want a product distribution whose samples are 3D arrays.

@mschauer
Copy link
Member

But that is my point: If you don't want code duplication between a product of univariate and a multivariate distribution you need to avoid to bake difference between those things into the abstract super types. Now we are talking about baking difference between Array and ArrayLike into the Distribution hierarchy. My question is: what is this difference good for (except that people have to define const ArrayorArrayLike = Union{Array,ArrayLike})?

@devmotion
Copy link
Member

Maybe I misunderstand the proposal but I don't think any of this is part of the suggestion above. To me it is mainly a more convenient alias that allows you to make use of these more general VariateForms (which e.g. can be useful for more general Products):

const ArrayDistribution{S<:ValueSupport,N}      = Distribution{ArrayLikeVariate{N},S}

There is nothing conceptually new here, it's just more convenient and more general than the aliases MatrixDistribution, MultivariateDistribution etc.

@mschauer
Copy link
Member

That is be a misinterpretation on my side, I thought the plan was to introduce ArrayDistribution{S<:ValueSupport,N} = Distribution{ArrayVariate{N},S} with ArrayVariate instead of ArrayLikeVariate.

@oschulz
Copy link
Contributor Author

oschulz commented May 13, 2021

@mschauer: Why would we need ArrayDistribution if we have ArrayLikeDistribution

Uh, we don't have an ArrayLikeDistribution currenty, right?

@mschauer: why should the type of the distribution care about the container of the samples?

I assume you mean the container for multiple samples/variates? The type of the distribution should definitely convey information about the type of a single sample/variate, though (i.e. the return type of rand(dist)), right? So rand(dist) should return, and pdf(dist, x) should accept a (subtype of) Real for Distribution{ArrayLikeVariate{0}}, and an AbstractArray{<:Real,1} for Distribution{ArrayLikeVariate{1}}, and an AbstractArray{<:Real,2} for Distribution{ArrayLikeVariate{2}}, and so on, right?

@devmotion: Maybe I misunderstand the proposal but I don't think any of this is part of the suggestion above. To me it is mainly a more convenient alias

Indeed - we currently have MatrixDistribution as a convencience alias for Distribution{MatrixVariate}, so I thought it would be natural to add a typedef ArrayDistribution as a convenience alias for Distribution{ArrayLikeVariate}. Maybe ArrayLikeDistribution would indeed be a better name, though (I drafted this before we renamed NVariate to ArrayLikeVariate).

@mschauer: If you don't want code duplication between a product of univariate and a multivariate distribution you need to avoid to bake difference between those things into the abstract super types. Now we are talking about baking difference between Array and ArrayLike into the Distribution hierarchy

Oh, no - sorry if I have been unclear, I don't want that at all! A product of univariate and a multivariate distribution should of course have the same abstract supertype. Which would be MultivariateDistribution, which would be a convenience alias for ArrayDistribution{1}(alternatively ArrayLikeDistribution{1}), which would be a convenience alias for Distribution{ArrayLikeVariate{1}}.

@mschauer: I thought the plan was to introduce ArrayDistribution{S<:ValueSupport,N} = Distribution{ArrayVariate{N},S} with ArrayVariate instead of ArrayLikeVariate

No, no, I definitely didn't want to introduce a separate ArrayVariate.

@oschulz
Copy link
Contributor Author

oschulz commented May 13, 2021

I imagine that this makes it e.g. easier to generalize Product to arrays of higher dimensions

Oh, yes, it would be nice to be able to make a product of multivariate distributions, resulting in a matrixvariate distribution, and so on.

@mschauer
Copy link
Member

Just a bit about my perspective, I think we agree all. I think we need to avoid repeating the same mistakes:

  • We forced Distributions to be Continuous or Discrete and suddenly couldn’t have a mixture between a Dirac spike and a continuous slab. Still some issues left, what should be the super type of Normal(0.0, 0.0) etc.?
 We also cannot give an appropriate pdf for a spike and slab.

  • 

We forced Distributions to be both Discrete and Integer valued at the same time and couldn’t have discrete distributions supported on sets of floats.


  • We forced Distributions to be Univariate or Multivariate and can’t use MvNormal for uncertainty quantification of both numbers and vectors.



  • We made all probabilities, densities Float64 and suffered from that because automatic differentiation came up.

  • 
We can also not easily have random combinations, random trees, random trajectories, and plenty of other random things without breaking our own assumptions.

@devmotion
Copy link
Member

I agree, there are many things that can be improved in Distributions and to a large extent problems are caused by design choices that turned out to be problematic later on (to be honest, some of these choices such as Float64 were probably also caused by limitations of Julia and the state of AD at the time when they were made). However, here it doesn't seem to me that there are any design choices to be made since no new concepts or structures are introduced. In general, I think the best way to improve Distributions is to make many small improvements, and this seems to be an improvement to me, even though it does not fix the more general problems.



We forced Distributions to be both Discrete and Integer valued at the same time and couldn’t have discrete distributions supported on sets of floats

This is not the case anymore, e.g., Dirac and DiscreteNonParametric support sets of floats (but maybe that's why you wrote "couldn't have" instead of "can't have"?).

@oschulz
Copy link
Contributor Author

oschulz commented May 13, 2021

Yes, that's what I think as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants