BUG: Handle floating point boundaries in qcut
#59409
Merged
doc/source/whatsnew/v3.0.0.rst file if fixing a bug or adding a new feature.

pandas' `quantile` uses NumPy's `percentile`, so it has to convert the quantile (between 0 and 1) to a percentile (between 0 and 100). However, `np.percentile` itself uses NumPy's quantile code, so `np.percentile` converts the percentile back into a quantile. Together, these two operations can introduce a slight floating point error, because the quantile inputs are multiplied and then divided by 100, which is not a power of 2. This PR changes pandas' quantile code to call NumPy's quantile code directly, which was not available when the original pandas quantile code was written.

This PR also changes how `qcut` picks quantiles when asked to split into a fixed number of quantiles. Some quantiles, such as 5/7, can't be represented exactly as floats and have to be rounded to a representable number. If that rounding goes down, then the level that defines the upper bound of the 5/7 quantile is incorrectly assigned to the 6/7 quantile. This PR changes the quantiles to round up rather than to the nearest when picking a floating point representation.
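To make the two effects concrete, here is a minimal sketch in plain Python. It is not the code from this PR: `percentile_round_trip` and `round_up_quantile` are hypothetical helpers written only for illustration. The first mimics the multiply-then-divide by 100 described above, and the second shows one possible way to pick the smallest float that is not below an exact fraction. It assumes Python 3.9 or later for `math.nextafter`.

```python
from fractions import Fraction
import math


def percentile_round_trip(q: float) -> float:
    """Scale a quantile to a percentile and back (illustrative only)."""
    return (q * 100) / 100


def round_up_quantile(k: int, n: int) -> float:
    """Hypothetical helper: smallest float that is >= the exact fraction k/n."""
    q = k / n  # correctly rounded, so it may land just below or just above k/n
    if Fraction(q) < Fraction(k, n):
        q = math.nextafter(q, math.inf)  # step to the next float up
    return q


# Count grid quantiles that the round trip does not reproduce exactly.
grid = [k / 1000 for k in range(1001)]
changed = [q for q in grid if percentile_round_trip(q) != q]
print(f"{len(changed)} of {len(grid)} grid values change in the round trip")

# Compare nearest-float and rounded-up bin edges for seven bins.
for k in range(8):
    nearest, exact = k / 7, Fraction(k, 7)
    side = "below exact" if Fraction(nearest) < exact else "at or above exact"
    print(k, repr(nearest), side, repr(round_up_quantile(k, 7)))
```

`Fraction` is used here because it converts a float to its exact rational value, so the comparison against `k/n` is itself free of rounding. How an edge ends up below the exact fraction depends on how it is computed, so the sketch reports what it finds rather than assuming a direction.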