Improvements to plotting PMFs for discrete distributions #451

Merged
merged 19 commits into from
Jun 23, 2021

sethaxen
Member

@sethaxen sethaxen commented Jun 18, 2021

This PR fixes #450 and fixes #287.

@devmotion
Contributor

I think it would be easier (and probably faster) to not use default_range but

  • to look up min_dist, max_dist = extrema(dist) (that will always work)
  • to use support if both are bounded (this should also work, the bug in Distributions only affects unbounded support)
  • and to use the quantiles with a unit range in the other case (which is likely wrong in some cases but probably the best one can do right now)
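
A rough sketch of the three steps above, assuming Distributions.jl is loaded. The helper name `pmf_grid` and its `alpha` keyword are hypothetical and chosen for illustration; this is not the code in this PR.

```julia
using Distributions

# Hypothetical helper (not from the PR): pick the integer points at which
# to evaluate the PMF of a discrete univariate distribution.
function pmf_grid(dist::DiscreteUnivariateDistribution; alpha = 0.0001)
    lo, hi = extrema(dist)
    if isfinite(lo) && isfinite(hi)
        # Both ends are bounded, so the full support can be used directly.
        return support(dist)
    end
    # Unbounded support: fall back to a unit range between two quantiles.
    # This may miss mass in the tails; it is a pragmatic default only.
    return floor(Int, quantile(dist, alpha)):ceil(Int, quantile(dist, 1 - alpha))
end

bounded_grid = pmf_grid(Binomial(5, 0.1))   # support is bounded
unbounded_grid = pmf_grid(Poisson(10))      # support is unbounded above
```

The `isfinite` check on the result of `extrema` is one way to detect bounded support without relying on any other helper.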

@sethaxen sethaxen marked this pull request as ready for review June 18, 2021 12:29
Contributor

@devmotion devmotion left a comment

I guess it's the best that can be done currently. Maybe it would be a good opportunity to address #287 as well.

@sethaxen sethaxen changed the title Plot distributions with non-integer support Improvements to plotting PMFs for discrete distributions Jun 18, 2021
@sethaxen
Member Author

> Maybe it would be a good opportunity to address #287 as well.

Done!

julia> plot(Binomial(50, 0.5))

@sethaxen
Member Author

I made the same work for discrete mixtures, e.g. here is a zero-inflated Poisson:

julia> zip = MixtureModel([Dirac(0), Poisson(10)], [0.1, 0.9])
MixtureModel{Distribution{Univariate, Discrete}}(K = 2)
components[1] (prior = 0.1000): Dirac{Int64}(value=0)
components[2] (prior = 0.9000): Poisson{Float64}(λ=10.0)

julia> plot(zip)

julia> plot(zip; components=false)
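
For reference, the stick heights in such a plot are the mixture PMF evaluated at integer points. A small sketch, assuming Distributions.jl; the variable name `zip_dist` is used here only to avoid shadowing Base's `zip` and is not from the PR.

```julia
using Distributions

# Zero-inflated Poisson from the comment above, under a non-clashing name.
zip_dist = MixtureModel([Dirac(0), Poisson(10)], [0.1, 0.9])

# Evaluate the PMF at integer points only.
xs = 0:25
ps = [pdf(zip_dist, x) for x in xs]

# At zero, both components contribute: the point mass and the Poisson term.
p0 = pdf(zip_dist, 0)
```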

Member

@daschw daschw left a comment

Nice, thanks a lot!

Contributor

@devmotion devmotion left a comment

I wonder if one has to include markers for :sticks by default?

src/distributions.jl (outdated)

 end

 function default_range(m::Distributions.MixtureModel, alpha = 0.0001)
-    minval = maxval = 0.0
+    minval = maxval = 0
Contributor

I guess this is more likely to be type unstable than the current implementation?

Member Author

Why? It turns out this is necessary for the zero-inflated Poisson to not make everything a Float64. Since logpdf for a discrete MixtureModel only accepts points of type Int, turning everything into floats causes failures.

Contributor

Because in the loop the type of minval and maxval will be changed for most distributions: for most distributions minimum, maximum and quantile do not return values of type Int. It's also not guaranteed that they are of type Float64 but this is definitely more likely for continuous distributions (and it would still be type stable for distributions where these values are of type Int if one uses min and max to update minval and maxval in the loop).

logpdf and pdf accept Real regardless of whether the distribution is discrete or not. If there's a problem with MixtureModel, it has to be fixed in Distributions.
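
To illustrate the concern in plain Julia (no Distributions.jl needed): with an integer seed, the accumulator keeps its type only while every bound is itself an `Int`. The function name and the tuples standing in for per-component bounds are made up for this sketch.

```julia
# Hypothetical stand-in for the loop under discussion: `bounds` plays the
# role of the (lower, upper) values computed for each mixture component.
function range_with_int_seed(bounds)
    minval = maxval = 0              # Int seed, as in the proposed change
    for (lo, hi) in bounds
        minval = min(minval, lo)     # promotes to Float64 if `lo` is a float
        maxval = max(maxval, hi)
    end
    return minval, maxval
end

# All-integer bounds: the accumulators stay Int throughout.
int_result = range_with_int_seed([(0, 0), (1, 25)])

# Float bounds: the accumulators start as Int and become Float64 in the
# first iteration, so their type changes inside the loop.
float_result = range_with_int_seed([(-3.5, 3.5), (1.0, 9.0)])
```

Note that seeding with zero also forces the resulting range to include zero, independent of the type question.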

Member Author

> Because in the loop the type of minval and maxval will be changed for most distributions: for most distributions minimum, maximum and quantile do not return values of type Int. It's also not guaranteed that they are of type Float64 but this is definitely more likely for continuous distributions (and it would still be type stable for distributions where these values are of type Int if one uses min and max to update minval and maxval in the loop).

We should support both though, not just continuous distributions. I pushed an implementation using mapreduce that is simpler and does this.
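
The pushed code is not reproduced in this thread, so the following is only a guess at the general shape of a `mapreduce`-based version, assuming Distributions.jl. The names `component_range`, `merge_ranges`, and `mixture_range` are hypothetical. Without a seed value, the element type follows the components: integer quantiles give an integer range, float quantiles give a float range.

```julia
using Distributions

# Hypothetical helpers (not the PR's code).
# Per-component bounds taken from two quantiles.
component_range(d, alpha) = (quantile(d, alpha), quantile(d, 1 - alpha))

# Combine two (lower, upper) pairs into the pair that covers both.
merge_ranges(a, b) = (min(a[1], b[1]), max(a[2], b[2]))

function mixture_range(m::MixtureModel, alpha = 0.0001)
    return mapreduce(d -> component_range(d, alpha), merge_ranges, components(m))
end

# Discrete mixture: both components return integer quantiles.
discrete_range = mixture_range(MixtureModel([Dirac(0), Poisson(10)], [0.1, 0.9]))

# Continuous mixture: quantiles are Float64.
continuous_range = mixture_range(MixtureModel([Normal(-2, 1), Normal(3, 1)], [0.5, 0.5]))
```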

@sethaxen
Member Author

> I wonder if one has to include markers for :sticks by default?

Can you clarify what you mean?

@devmotion
Contributor

If it is necessary to set markershape or if one should just not plot markers with the default settings.

@sethaxen
Member Author

> If it is necessary to set markershape or if one should just not plot markers with the default settings.

Setting markershape is necessary to show the markers as suggested in the original issue. Personally, I prefer hair plots without markers and would prefer the simplicity of not explicitly showing them and letting the user do it if they want it. What do you think?

@devmotion
Contributor

> Personally, I prefer hair plots without markers and would prefer the simplicity of not explicitly showing them and letting the user do it if they want it.

Yes, my question was whether you think it would be preferable to not show them explicitly. I prefer the version without markers (IMO it is a bit cleaner, in particular if there are many values to plot) and it seems simple enough to specify markershape if one wants to plot points as well.

@sethaxen
Member Author

sethaxen commented Jun 20, 2021

Okay, I simplified the implementation to not show markers by default.

julia> plot(Binomial(50, 0.5))

For plots with few points, in my view this looks worse:

julia> plot(plot(Binomial(5, 0.1)), plot(Binomial(5, 0.1); markershape=:circle))

But it looks much better for plots with many points:

julia> plot(plot(Poisson(500)), plot(Poisson(500); markershape=:circle); legend=:left)

@BeastyBlacksmith BeastyBlacksmith merged commit fadcdf7 into JuliaPlots:master Jun 23, 2021
4 participants