Adding the Poisson distribution #15814
Believe these are the maven-checks errors you're seeing. You may also need to rebase for the other errors that look unrelated.
presto-main/src/main/java/com/facebook/presto/operator/scalar/MathFunctions.java
presto-main/src/main/java/com/facebook/presto/operator/scalar/MathFunctions.java
4a26de4 to 9213486
Looks good, just some nits. Please fix the checkstyle error.
presto-main/src/test/java/com/facebook/presto/operator/scalar/TestMathFunctions.java
presto-main/src/main/java/com/facebook/presto/operator/scalar/MathFunctions.java
0ab7b2a to 3311a86
One point to note: as a follow-up to this discussion, I should also change inverse_chisquare_cdf to disallow p=1 (right now it allows it and returns Inf as expected, but that causes inconsistent behavior between it and inverse_normal_cdf).
(I may want to move this from bigint to int; I need to verify further. https://commons.apache.org/proper/commons-math/javadocs/api-3.5/org/apache/commons/math3/distribution/IntegerDistribution.html )
Both the checkstyle error and the test failures are related. Please fix.
Will do, thanks.
…On Wed, Apr 28, 2021, 22:45 Rongrong Zhong ***@***.***> wrote:
Both checkstyle error and test failures are related. Please fix
9b5c9c9 to c1f94d9
@rongrong - the diff was fixed, and all tests have now come back green. It's back to you for review :)
929026e to 01c3bac
Thanks @rongrong, I've now updated the text in the diff. I've also expanded the description of inverse_poisson_cdf a bit so it's clearer which value it returns.
Hey @rongrong - this diff includes all the fixes we've discussed. Could you please review and let me know if it's good to merge, or if there are other steps to take?
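To make "which value it returns" concrete, here is a minimal sketch. It assumes the function follows the quantile convention documented for Commons Math integer distributions, namely the smallest integer k with P(X <= k) >= p. The class name `InversePoissonSketch`, the accepted range for p, and the naive summation are assumptions for illustration and not the code in this diff.

```java
// Illustrative sketch only; names and argument checks are assumptions.
class InversePoissonSketch {
    // Returns the smallest k >= 0 with P(X <= k) >= p by walking the CDF upward.
    static int inversePoissonCdf(double lambda, double p) {
        if (!(lambda > 0)) {
            throw new IllegalArgumentException("lambda must be greater than 0");
        }
        if (!(p >= 0 && p < 1)) {
            throw new IllegalArgumentException("p must be in the interval [0, 1)");
        }
        double term = Math.exp(-lambda); // P(X = 0)
        double cdf = term;
        int k = 0;
        while (cdf < p) {
            k++;
            term *= lambda / k; // P(X = k) from P(X = k - 1)
            cdf += term;
            // Guard against floating-point stalls (very large lambda, or p extremely close to 1).
            if (term == 0.0 && k > lambda) {
                throw new ArithmeticException("naive summation cannot reach p");
            }
        }
        return k;
    }
}
```

With lambda = 1 the CDF at 0, 1 and 2 is roughly 0.368, 0.736 and 0.920, so p = 0.5 maps to 1 and p = 0.8 maps to 2. The plain summation is only suitable for moderate lambda; a library implementation would be the better choice in practice.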
One last nit. Thanks!
Thanks @rongrong - made the fixes, diff is now ready for merging :)
Thank you, @talgalili, for the contribution.
Thanks @mbasmanova for the help in merging this. :)
@SqlType(StandardTypes.INTEGER) long value)
{
    checkCondition(value >= 0, INVALID_FUNCTION_ARGUMENT, "value must be a non-negative integer");
    checkCondition(lambda > 0, INVALID_FUNCTION_ARGUMENT, "lambda must be greater than 0");
Since the library API is already doing this check, I say just do a try/catch and throw user_error
Interesting. Since this is the method used by all the distribution functions (i.e.: normal, beta, chi-square, binomial), do you think it should be changed there as well?
If so - could you please help with this change? (I'm not experienced in Java, so want to make sure I understand what you're proposing and which error will be thrown)
Just do something like:
try { ... } catch (NotStrictlyPositiveException notStrictlyPositiveException) { throw new PrestoException(GENERIC_USER_ERROR, ...); }
Look at StandardErrorCodes.java and other files to see the pattern they follow.
And yes, it would be good to do this for all of them.
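The try/catch pattern suggested in this thread could look roughly like the sketch below. To stay self-contained it does not depend on Presto or Commons Math: `UserError` stands in for `PrestoException` with `GENERIC_USER_ERROR`, `NotStrictlyPositive` stands in for the library's `NotStrictlyPositiveException`, and `PoissonStandIn` is a hypothetical replacement for `PoissonDistribution`. All of these names are assumptions made for illustration.

```java
// Stand-in for the library exception thrown on a non-positive mean (assumption).
class NotStrictlyPositive extends IllegalArgumentException {
    NotStrictlyPositive(String message) {
        super(message);
    }
}

// Stand-in for a user-facing engine error such as PrestoException (assumption).
class UserError extends RuntimeException {
    UserError(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical minimal distribution that validates its own argument,
// the way the reviewer says the library API already does.
class PoissonStandIn {
    private final double lambda;

    PoissonStandIn(double lambda) {
        if (!(lambda > 0)) {
            throw new NotStrictlyPositive("mean (" + lambda + ") must be > 0");
        }
        this.lambda = lambda;
    }

    // P(X <= k) = sum over i = 0..k of e^(-lambda) * lambda^i / i!
    double cumulativeProbability(int k) {
        if (k < 0) {
            return 0.0;
        }
        double term = Math.exp(-lambda);
        double sum = term;
        for (int i = 1; i <= k; i++) {
            term *= lambda / i;
            sum += term;
        }
        return Math.min(sum, 1.0);
    }
}

class PoissonCdfSketch {
    static double poissonCdf(double lambda, int value) {
        if (value < 0) {
            throw new UserError("value must be a non-negative integer", null);
        }
        try {
            // No separate lambda check: rely on the constructor's validation.
            return new PoissonStandIn(lambda).cumulativeProbability(value);
        }
        catch (NotStrictlyPositive e) {
            // Translate the library error into a user-facing one, keeping the cause.
            throw new UserError("lambda must be greater than 0", e);
        }
    }
}
```

The trade-off is that the validation rule lives in one place (the library), while the user still sees an engine-level error rather than a raw library exception.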
{
    checkCondition(value >= 0, INVALID_FUNCTION_ARGUMENT, "value must be a non-negative integer");
    checkCondition(lambda > 0, INVALID_FUNCTION_ARGUMENT, "lambda must be greater than 0");
    PoissonDistribution distribution = new PoissonDistribution(lambda);
Is the lambda going to be generally fixed in a query? If so, you should find a way to avoid new object creation to improve memory perf.
Also interesting. As I wrote in reply to your other comment, this is the method used by all the distribution functions (i.e.: normal, beta, chi-square, binomial). Do you think it should be changed there as well?
If so - could you please help with this change / propose how to do it?
Thanks in advance.
I don't have a good suggestion here :( I looked in the code but can't find other examples. I will look further and comment here if I find something.
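The thread leaves this question open. One possible direction, shown only as a sketch, is to remember the most recently built object and reuse it while lambda stays the same. `LastValueCache` is a hypothetical helper, not something that exists in Presto, and whether it fits the scalar-function machinery (for example, per-thread or per-function-instance state) is not settled here.

```java
import java.util.function.DoubleFunction;

// Hypothetical one-entry cache: rebuilds the value only when the key changes.
// Not thread-safe; each thread or function instance would need its own copy.
final class LastValueCache<T> {
    private final DoubleFunction<T> factory;
    private boolean filled;
    private double cachedKey;
    private T cachedValue;
    private int builds; // number of factory calls, kept only for demonstration

    LastValueCache(DoubleFunction<T> factory) {
        this.factory = factory;
    }

    T get(double key) {
        // Double.compare treats NaN as equal to itself, so a NaN key is cached too.
        if (!filled || Double.compare(key, cachedKey) != 0) {
            cachedValue = factory.apply(key);
            cachedKey = key;
            filled = true;
            builds++;
        }
        return cachedValue;
    }

    int builds() {
        return builds;
    }
}
```

With a constant lambda across rows, `get(lambda)` would call the factory once and return the same object afterwards; if lambda varies per row, the cache adds a comparison per call and saves nothing.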
Adding the Poisson distribution, which is central to many statistical procedures (https://en.wikipedia.org/wiki/poisson_distribution) (#15798)
Test plan (adding unit-tests)
Following the diff template of: https://github.com/prestodb/presto/pull/11981/files
== RELEASE NOTES ==
General Changes
(like beta_cdf: https://prestodb.io/docs/current/release/release-0.215.html?highlight=beta_cds)