Fix out-of-bounds when converting from FixedDecimal to PluralOperands #290

sffc · 2020-10-02T22:04:27Z

FixedDecimal supports numbers that are very large or very small. However, PluralOperands only supports numbers with integer and fraction parts that fit inside a u64. Discuss how to resolve cases when the FixedDecimal is larger than what the PluralOperands can support.

filmil · 2020-10-02T23:00:39Z

Perhaps `Try` -> `TryFrom` is all that is needed. As an optimization, FixedDecimal could provide a flag that is set if any of its components would overflow u64. Another possible option is that, in the context of plural operands, not all digits are actually significant. Perhaps we could make do by keeping in PluralOperands only those digits that actually matter for plural formation. For example, in Serbian, 12345689110 and 110 are equivalent as far as plurality is concerned.
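To illustrate the Serbian example, here is a rough sketch of an integer-only plural selector. It assumes the commonly cited CLDR rule for Serbian integers (one: `i % 10 = 1` and `i % 100 != 11`; few: `i % 10 = 2..4` and `i % 100 != 12..14`; otherwise other) and ignores fraction digits entirely. `Category` and `serbian_integer_category` are hypothetical names, not ICU4X API.

```rust
/// Hypothetical plural categories, limited to the ones Serbian integers use.
#[derive(Debug, PartialEq)]
enum Category {
    One,
    Few,
    Other,
}

/// Integer-only sketch of the Serbian rule (an assumption, not ICU4X code).
/// Only the last two digits of `i` influence the result.
fn serbian_integer_category(i: u64) -> Category {
    let (m10, m100) = (i % 10, i % 100);
    if m10 == 1 && m100 != 11 {
        Category::One
    } else if (2..=4).contains(&m10) && !(12..=14).contains(&m100) {
        Category::Few
    } else {
        Category::Other
    }
}
```

Under this rule, `serbian_integer_category(12345689110)` and `serbian_integer_category(110)` agree, because both numbers end in the same two digits.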


sffc · 2020-10-02T23:09:54Z

I lean toward saying that we just throw away digits that are out of bounds. We could cap the number of digits at 18 integer and 18 fraction or something like that.
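A minimal sketch of what capping at 18 digits could look like. `BigDecimalDigits` is a hypothetical stand-in, not the real FixedDecimal layout, and the choice to keep the fraction digits closest to the decimal separator is an assumption.

```rust
/// Hypothetical decimal with arbitrarily many digits, most significant first.
/// This is NOT how FixedDecimal stores its digits.
struct BigDecimalDigits {
    integer: Vec<u8>,  // e.g. [1, 2, 3] for 123
    fraction: Vec<u8>, // e.g. [4, 5] for .45
}

/// 18 decimal digits always fit in a u64 (10^18 - 1 < u64::MAX).
const MAX_DIGITS: usize = 18;

/// Fold at most MAX_DIGITS decimal digits into a u64.
fn fold_digits(digits: &[u8]) -> u64 {
    digits.iter().fold(0u64, |acc, &d| acc * 10 + u64::from(d))
}

/// Keep the 18 least significant integer digits and drop the rest.
fn capped_integer(d: &BigDecimalDigits) -> u64 {
    let start = d.integer.len().saturating_sub(MAX_DIGITS);
    fold_digits(&d.integer[start..])
}

/// Keep the 18 fraction digits closest to the decimal separator (assumption).
fn capped_fraction(d: &BigDecimalDigits) -> u64 {
    let end = d.fraction.len().min(MAX_DIGITS);
    fold_digits(&d.fraction[..end])
}
```

Because the digit count is bounded before folding, the arithmetic cannot overflow, so no runtime check is needed on the hot path.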

zbraniecki · 2020-10-03T04:40:23Z

In the stdlib, those usually have `lossy` in the name. I'd suggest TryFrom plus a lossy and/or panicking variant.

sffc · 2020-10-03T04:52:54Z

It's technically lossy, but for all real situations, keeping 18 digits on both sides of the decimal separator is far more than enough to run plural rules. Panicking is definitely not the right behavior. I'm not a fan of TryFrom because this is a hot code path. I suppose we could add TryFrom in addition to From (if that's allowed, given that TryFrom has a default implementation based on From).
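On the question of whether both can coexist: the standard library has a blanket `impl<T, U> TryFrom<U> for T where U: Into<T>` with `Error = Infallible`, so any type with a `From` impl already gets a `TryFrom` that never fails. A hand-written `impl TryFrom<Wide> for Narrow` next to the `From` impl below would be rejected as a conflicting implementation. The sketch uses hypothetical `Wide` and `Narrow` types rather than the real FixedDecimal and PluralOperands.

```rust
use std::convert::{Infallible, TryFrom};

/// Hypothetical source type that can exceed the target's range.
struct Wide(u128);
/// Hypothetical target type limited to a u64.
struct Narrow(u64);

impl From<Wide> for Narrow {
    fn from(w: Wide) -> Self {
        // Keep only the low 18 decimal digits (value modulo 10^18).
        Narrow((w.0 % 1_000_000_000_000_000_000) as u64)
    }
}

/// Uses the blanket TryFrom impl derived from From; the error type is
/// Infallible, so this can never report an overflow.
fn via_try_from(w: Wide) -> Result<Narrow, Infallible> {
    Narrow::try_from(w)
}
```

So a fallible conversion that actually reports overflow would have to be a differently named method or live on a different type pair.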

zbraniecki · 2020-10-03T07:20:57Z

> It's technically lossy, but for all real situations, keeping 18 digits on both sides of the decimal separator is far more than enough to run plural rules.

Sure, so then calling it from_fixed_decimal_lossy gives you:
a) correct description
b) communicates that it is lossy
c) appropriate overhead/perf

It's the same as From, but with an additional bit of information about lossiness.

sffc · 2020-10-03T09:14:53Z

Do you suggest we do not implement From<FixedDecimal> for PluralOperands and instead implement a from_fixed_decimal_lossy function?

zbraniecki · 2020-10-03T17:35:10Z

Hmm, I'm torn. I see that clamping to 18 digits may be seen not as lossy here, but as a loss of precision much like `u64 as u32` or `isize as usize` is, rather than what the `String::*_lossy` functions do by potentially altering the string to cut out malformed chars.

And I understand the convenience value of using From here rather than just a method name.

My only hiccup is that I'd like us to be able to have a path forward if later we want to add a fallible version because in some scenario we need to handle the overflow differently.

If having both TryFrom and From is not possible, then maybe we could add a fallible constructor like try_from_fixed_decimal?
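A sketch of how a truncating `From` and a separately named fallible constructor could sit side by side. `Source`, `Operands`, `OverflowError`, and `try_from_source` are hypothetical names for illustration; the real types and the final API may differ.

```rust
/// Hypothetical operand type holding only an integer part.
#[derive(Debug, PartialEq)]
struct Operands {
    i: u64,
}

/// Hypothetical error returned when the source does not fit in 18 digits.
#[derive(Debug, PartialEq)]
struct OverflowError;

/// Hypothetical source type with a wider range than Operands.
struct Source(u128);

/// 10^18: values below this have at most 18 decimal digits.
const LIMIT: u128 = 1_000_000_000_000_000_000;

impl From<&Source> for Operands {
    /// Infallible path: silently keeps the low 18 digits.
    fn from(s: &Source) -> Self {
        Operands { i: (s.0 % LIMIT) as u64 }
    }
}

impl Operands {
    /// Fallible path: reports overflow instead of truncating.
    fn try_from_source(s: &Source) -> Result<Self, OverflowError> {
        if s.0 < LIMIT {
            Ok(Operands { i: s.0 as u64 })
        } else {
            Err(OverflowError)
        }
    }
}
```

Callers on the hot path can use `Operands::from`, while callers that need to handle overflow differently can opt into `try_from_source`.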

filmil · 2020-10-03T18:25:53Z

> Hmm, I'm torn. I see that clamping to 18 digits may be seen not as lossy here, but as a loss of precision much like `u64 as u32` or `isize as usize` is, rather than what the `String::*_lossy` functions do by potentially altering the string to cut out malformed chars.

FWIW, I don't think this conversion is actually lossy: it does remove information of the original exact number, but that does not lose information about its plurality. It is a one-way transform: PluralOperands is as much lossy to FixedDecimal as PluralCategory is to PluralOperands. You can't recover 23 from PluralCategory::Few, but we don't call the transform select_lossy.

(By taking the least significant digits from both the integer and fractional part, we will probably remove any distinction between the "full" and "lossy" conversions.)

zbraniecki · 2020-10-05T07:48:33Z

I'm convinced. From should work then, and we shouldn't worry about lossiness.

sffc · 2020-10-06T14:02:38Z

One other thought I had. We don't want overflowed PluralOperands to select =0 and =1 rules. For example, if you have 1e20+1, you don't want to get the =1 rule.
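To make the 1e20+1 case concrete: keeping only the low 18 digits turns it into 1. One possible guard, sketched here purely as an assumption, is to carry an overflow flag alongside the truncated value and skip exact matches when it is set. `TruncatedInt`, `truncate`, and `matches_exact` are hypothetical names, not part of PluralOperands.

```rust
/// 10^18: values below this have at most 18 decimal digits.
const LIMIT: u128 = 1_000_000_000_000_000_000;

/// Hypothetical truncated integer that remembers whether digits were dropped.
struct TruncatedInt {
    low_digits: u64,
    overflowed: bool,
}

/// Keep the low 18 digits and record whether anything was cut off.
fn truncate(value: u128) -> TruncatedInt {
    TruncatedInt {
        low_digits: (value % LIMIT) as u64,
        overflowed: value >= LIMIT,
    }
}

/// Exact-value match (as in an `=1` rule) that refuses overflowed inputs.
fn matches_exact(t: &TruncatedInt, literal: u64) -> bool {
    !t.overflowed && t.low_digits == literal
}
```

With this guard, `truncate(100_000_000_000_000_000_001)` still has `low_digits == 1`, but `matches_exact` returns false for it, so it would not pick up the =1 rule.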

At what point in the process do you think clients are intended to support =0 and =1?

zbraniecki · 2020-10-06T14:05:45Z

> At what point in the process do you think clients are intended to support =0 and =1?

As in MessageFormat variant matching?

sffc added the T-bug (Type: Bad behavior, security, privacy), discuss (Discuss at a future ICU4X-SC meeting), and C-pluralrules (Component: Plural rules) labels Oct 2, 2020

sffc mentioned this issue Oct 3, 2020

Limit magnitudes of From<FixedDecimal> for PluralOperands to prevent overflow #293

Merged

sffc closed this as completed in #293 Oct 6, 2020

zbraniecki removed the discuss (Discuss at a future ICU4X-SC meeting) label Oct 9, 2020
