[Run] Fix Scientific Notation Errors in Calculator Plugin #28884

viggyd · 2023-09-28T21:24:13Z

Summary of the Pull Request

This PR allows the Calculator plugin in PowerToys Run to use scientific notation (e.g. 5e3 = 5000). The PR fixes a regression in PowerToys that was introduced in v0.69.1.

PR Checklist

Detailed Description of the Pull Request / Additional comments

Nothing else fixed.

Validation Steps Performed

I updated the Calculator unit tests to include various expressions using this notation and confirmed that they all pass.

After adding logic to replace `e` as a mathematical constant, there are bugs when trying to use expressions like `5e3`. This change parses the `<number>e<number>` format into expanded form to prevent replacement with the constant. Regex explanation: `(\d+\.*\d*)[eE](-*\d+) `(\d+\.*\d*)`: Match any number of digits, followed by an optional `.` and more optional digits. The expression is used to capture things like: `5.0`, `1.`, `1` before the `e` `[eE]`: Match either upper or lowercase `e` `(-*\d+)`: Capture an optional `-` sign as well as a number of digits

The new regex captures a wider variety of numbers. See this post for details on the regex used: https://stackoverflow.com/a/23872060

viggyd · 2023-09-28T21:39:12Z

@microsoft-github-policy-service agree

Using `[` didn't capture the expression properly. Had to use `(` instead.

ekcom

Suggestions to improve/clarify regex included, but otherwise looks good.

src/modules/launcher/Plugins/Microsoft.PowerToys.Run.Plugin.Calculator/CalculateHelper.cs

Aradig26

Please add these fast

jaimecbernardo

Hi, Sorry for the time I took in reviewing this PR. I thought changes were still being discussed.
Thanks for opening the PR.

It looks good to me, but I think scientific notation should only be with the upper case "E".
The main reason being that currently there can be some ambiguous expressions with the constant "e",

What do you think?

jaimecbernardo · 2023-10-24T10:51:44Z

@viggyd , Please let me know what you think of the change I proposed (uppercase E only).
Trying to get this in before .75 and I'm closing out the merge window pretty soon.
If you're OK with the change, I can implement it as well.

viggyd · 2023-10-24T13:16:32Z

@viggyd , Please let me know what you think of the change I proposed (uppercase E only). Trying to get this in before .75 and I'm closing out the merge window pretty soon. If you're OK with the change, I can implement it as well.

Sorry, for some reason I wasn't getting notifications for this. There is one change in the regex to make because it could accept multiple decimal points.

I don't have strong opinions on the capital E, using lowercase is what I'm used to, but I agree it is ambiguous with other usage. I'll make it stricter to reduce that ambiguity. I can push the changes up this evening. If you need them sooner, you're welcome to make them also!

jaimecbernardo · 2023-10-24T14:29:32Z

There is one change in the regex to make because it could accept multiple decimal points.

Thanks @viggy . I'm not sure what you mean with the multiple decimal points. Anyway, I don't think this is ready to go in yet after a re-review, because it doesn't support different decimal or number grouping characters.
I've added a commit to support only uppercase E and to add the expression to the the RegValidExpressChar so that upper case E is actually accepted.

viggyd · 2023-10-24T20:45:46Z

There is one change in the regex to make because it could accept multiple decimal points.

Thanks @viggy . I'm not sure what you mean with the multiple decimal points. Anyway, I don't think this is ready to go in yet after a re-review, because it doesn't support different decimal or number grouping characters. I've added a commit to support only uppercase E and to add the expression to the the RegValidExpressChar so that upper case E is actually accepted.

Could you elaborate what you mean by different decimal or number grouping characters? I don't follow. I can add whatever additional changes are necessary

jaimecbernardo · 2023-10-24T21:25:51Z

Could you elaborate what you mean by different decimal or number grouping characters? I don't follow. I can add whatever additional changes are necessary

Hi @viggyd ,
There are regional formats where . is not the decimal point, but , is used instead. For example, "1.05" could be written like "1,05" on those countries.

With these languages the calculator plugin supports the decimal point of the locale. Here's one example:

There's even languages that use . as a group separator. Here's Germany(German), for example:

But with this PR, it assumes . will always be the decimal separator:

viggyd · 2023-10-29T16:14:46Z

Hi @viggyd , There are regional formats where . is not the decimal point, but , is used instead. For example, "1.05" could be written like "1,05" on those countries.

Are there any functions to get the decimal separator?

jaimecbernardo · 2023-10-30T09:41:43Z

Hi @viggyd , There are regional formats where . is not the decimal point, but , is used instead. For example, "1.05" could be written like "1,05" on those countries.

Are there any functions to get the decimal separator?

In NumberTranslator.cs there's references to culture.NumberFormat.NumberGroupSeparator and culture.NumberFormat.NumberGroupSeparator

htcfreek · 2023-11-14T18:58:50Z

@viggyd
Please check out the plugin settings for calculator as there is a setting to force English input format.

stefansjfw · 2023-12-11T12:49:21Z

@viggyd Any progress here? Do you need some more help?

viggyd · 2023-12-11T23:29:45Z

I'm sorry, I completely forgot about this issue. I'll be able to work on it this weekend, should have something that works by then

Previous regular expression did not allow decimal separators that were not ".". The new expression uses the culture info decimal separator instead, which should work better.

viggyd · 2023-12-17T18:48:32Z

Okay, I pushed my update that accounts for various decimal separators, sorry for the delay!

Jay-o-Way

Two details

Jay-o-Way · 2023-12-25T13:16:01Z

src/modules/launcher/Plugins/Microsoft.PowerToys.Run.Plugin.Calculator/CalculateHelper.cs

+            if (CultureInfo.CurrentCulture.NumberFormat.NumberDecimalSeparator == ".")
+            {
+                p = string.Format(CultureInfo.CurrentCulture, p, @"\.");
+            }
+            else
+            {
+                p = string.Format(
+                    CultureInfo.CurrentCulture,
+                    p,
+                    CultureInfo.CurrentCulture.NumberFormat.NumberDecimalSeparator);
+            }


Technically, this can also be done with a ternary. But both would work.

I thought a ternary would take too much horizontal space because the CultureInfo.CurrentCulture... is a lot of characters so I left it at the expanded if/else instead

Jay-o-Way · 2023-12-25T13:16:28Z

src/modules/launcher/Plugins/Microsoft.PowerToys.Run.Plugin.Calculator/CalculateHelper.cs

+                    CultureInfo.CurrentCulture.NumberFormat.NumberDecimalSeparator);
+            }
+
+            var r = "($1 * 10^($5))";


Why create a variable for a single use?

No particular reason, I was just following a similar format to other functions that are already in the file. I updated the code to remove the single use variable in my latest push

src/modules/launcher/Plugins/Microsoft.PowerToys.Run.Plugin.Calculator/CalculateHelper.cs

stefansjfw · 2023-12-26T21:46:36Z

src/modules/launcher/Plugins/Microsoft.PowerToys.Run.Plugin.Calculator/CalculateHelper.cs

@@ -18,6 +20,7 @@ public static class CalculateHelper
            @"sinh\s*\(|cosh\s*\(|tanh\s*\(|arsinh\s*\(|arcosh\s*\(|artanh\s*\(|" +
            @"pi|" +
            @"==|~=|&&|\|\||" +
+            @"((-?(\d+(\.\d*)?)|-?(\.\d+))[E](-?\d+\.*\d*))|" + /* expression from CheckScientificNotation between parenthesis */


should it be updated here as well? only integer after E

Whoops. Fixed!

thanks :) Change looks good to me, I'll just give it a test and then it'll be good to go! :)

Missed this expression in my last update. Whoops

stefansjfw · 2023-12-26T21:54:26Z

src/modules/launcher/Plugins/Microsoft.PowerToys.Run.Plugin.Calculator/CalculateHelper.cs

+             * -?(\d+({0}\d*)?): Captures a decimal number starting with a number (e.g. "-1.23")
+             * -?({0}\d+): Captures a decimal number without leading number (e.g. ".23")
+             * E: Captures capital 'E'
+             * (-?\d+{0}?\d*): Captures an integer number (e.g. "-1" or "23")


we need a comment update here as well :D

src/modules/launcher/Plugins/Microsoft.PowerToys.Run.Plugin.Calculator/CalculateHelper.cs

…lculator/CalculateHelper.cs

stefansjfw · 2023-12-26T22:49:19Z

Hm, it still accepts and processes decimal numbers after E. And it gives weird results:

not sure why

stefansjfw · 2023-12-26T22:53:31Z

Hm, it still accepts and processes decimal numbers after E. And it gives weird results:

not sure why

Actually it assumes multiplication between 1E1 and .3 -> 1E1 * 0.3 . Weird syntax and can confuse users.

viggyd · 2023-12-26T22:54:01Z

Looks like that's just something you can do? Definitely weird behavior:

viggyd · 2023-12-26T22:55:23Z

Hm, it still accepts and processes decimal numbers after E. And it gives weird results:
not sure why

Actually it assumes multiplication between 1E1 and .3 -> 1E1 * 0.3 . Weird syntax and can confuse users.

Any ideas on how best to deal with that? Is there a way to reject a pattern?

viggyd · 2023-12-26T23:03:30Z

Part of the problem is that there's a lot of other expressions that already allow expressions in parentheses followed by any number of spaces and it treats that as valid. So I can do something like (10) 3 and I'll get 30 as a result even though it looks funny.

Edit: Just checked and it doesn't even have to be in parentheses. 10 3 still results in 30

stefansjfw · 2023-12-26T23:27:57Z

Hm, it still accepts and processes decimal numbers after E. And it gives weird results:
not sure why

Actually it assumes multiplication between 1E1 and .3 -> 1E1 * 0.3 . Weird syntax and can confuse users.

Any ideas on how best to deal with that? Is there a way to reject a pattern?

looks like we accept this in general. I'm still worried that e.g. 1E1.5 might confuse users, because 10^1.5 is valid math expression, but E shouldn't accept decimal after (Windows calculator also prevent this when using exp).. Let's keep it as is as we respect the rules here.

there is another problem now. If I change the numbers format, as @jaimecbernardo mentioned above, to e.g. Portuguese:

Regex is correctly matched, but the result is wrong 🤔

viggyd · 2023-12-27T13:29:19Z

I see the problem. Ordinarily, the NumberTranslator converts the number/decimal separators so that by the time input reaches the calculator engine, it's already in English format (i.e. . for decimal separation). For some reason, the translator is having trouble parsing an expression like 2,3E2 and so it doesn't convert the expression to English format. The Mages engine PowerToys is using for calculating actual results requires English format and so it's not giving proper results.

That being said, I'm not sure why the translator isn't properly parsing the input. Seems like it's not recognizing the E as a token to split the expression on?

I can try to either:

Fix the translator so that the input we get when the user inputs scientific notation is in English and just remove the decimal separator logic in the calculator engine
Just make the calculator engine replace any decimal separators with the regular English separator.

I think I'd prefer option 1 since that seems cleaner. I'll need to look into the translator and see how to make it recognize scientific notation input properly.

stefansjfw · 2023-12-27T14:02:06Z

I agree. Let's try the option number 1

Jay-o-Way · 2023-12-27T14:12:48Z

Yeah, those different possibilities for decimals are always a trouble.

Important

Both "...just remove the decimal separator logic in the calculator engine" and "Just make the calculator engine replace any decimal separators" actually sound dangerous and could possibly lead to a regression resulting in bad calculations. Note that a number like "one-thousand two-hundred thirty four" in some languages is written as 1,234 and in some others that would be 1.234. In both cases, the other character denotes a decimal.

Whatever is going to be done, it needs to be tested with all permutations.

viggyd · 2023-12-27T21:54:26Z

Okay, I think the issue in the NumberTranslator's splitRegex pattern. It's considering something like 2,3E2 to be a single token and it isn't splitting on the E:

Because the regular expression is allowing characters [a-fA-F], it's not creating a token split on the E. So the token as a whole (2,3E2) is sent to the decimal.TryParse(...), which doesn't get translated to 2.3E2 presumably because the function doesn't understand the E in there. Interestingly, the same thing should happen for English decimal separators with the tokenization not working, but it works out there because there is no translation needed.

I'm assuming the allowance of a-fA-F is to allow hex notation, but I didn't think PowerToys supported that. Anyone have insight into why these characters are allowed to be part of a singular token? If PowerToys does support hexadecimal notation now, I'm not sure how to fix the translator. Are we forcing all hex numbers to start with 0x? If so, maybe I can look for that and split the token on E if there is no leading 0x. Looks like @jaimecbernardo added the code hexadecimal code. Thoughts?

Jay-o-Way · 2023-12-27T22:32:41Z

Darn it, you're absolutely right. In short, there is no way to tell if 5E3 is hexadecimal or scientific notation. Don't know how it would be possible to tell. Which jeopardizes the entire issue. This is why Calculator app has modes for input.

Having that said... Isn't scientific notation better to use for results, only? I mean, there's an inherent "insecurity" (or what's the word?) regarding significance. Like, when you say there are (about) 8 billion people on the planet, that's not an exact number. You could write it as 8B or 8E9. Note that both these are also valid as hexadecimal. What I mean is, you normally do not calculate anything with these kinds of estimates.

The hex regex stuff is only used once, there's no need to keep track of it in the object's state. Just create it during the translation process.

viggyd · 2023-12-28T13:44:23Z

Darn it, you're absolutely right. In short, there is no way to tell if 5E3 is hexadecimal or scientific notation. Don't know how it would be possible to tell. Which jeopardizes the entire issue. This is why Calculator app has modes for input.

Having that said... Isn't scientific notation better to use for results, only? I mean, there's an inherent "insecurity" (or what's the word?) regarding significance. Like, when you say there are (about) 8 billion people on the planet, that's not an exact number. You could write it as 8B or 8E9. Note that both these are also valid as hexadecimal. What I mean is, you normally do not calculate anything with these kinds of estimates.

I think maybe uncertainty is the word you're looking for?

While that certainly is a use case, I have used scientific notation in PowerToys in the past for the purposes of calculation. It's a convenience when working with smaller numbers, so that rather than needing to put in 0.001 I can just do 1e-3. It's especially useful when converting between various SI units, dimensional analysis, etc.

It looks like PowerToys does require the 0x prefix to denote a hexadecimal number. If I can assume that's always the case, I can do a 2-stage tokenization approach. The first can split out any hexadecimal expressions in the input. After that, all other tokens are free to be split without the a-fA-F in the splitting regex. The hexadecimals that I tokenized in the first stage can remain as one big group.

Something like (pseudocode):

string[] hex_norm_tokens = hexRegex.Split(input);

foreach (string token in hex_norm_tokens)
{
    if (token.startswith == "0x")
    {
         // Add to output builder - no further processing needed
         continue;
    }
    
    string[] reg_tokens = splitRegex.Split(token);
    // All current tokenization happens here
    ...
}

Should be able to split out hexadecimals on something like this: (?:(0x[\da-fA-F]+)+). Regex101 link with test expression: link

I tested this locally and its seems to work well. I'm wondering if I should update the unit tests as well. There's a few edge cases that we talked about during this conversation that'd be good to codify in unit tests somewhere to catch any regressions if they happen in the future

Jay-o-Way · 2023-12-28T16:06:04Z

Alrighty. If the 0x prefix is/becomes mandatory, we must make sure this clear in the documentations. But if it works, it works.

viggyd · 2024-01-02T12:38:35Z

Okay, pushed a change that now includes a unit test for scientific notation in other cultures. It's very small though, let me know if there are any other cases I should test. I couldn't think of anything else that wasn't covered by a different test somewhere else. I can also move the tests I added in QueryTests into this new one I added under ExtendedCalculatorParserTests

stefansjfw

This looks good now! Nice work! Thanks for the contribution! :)

Comments addressed

viggyd added 2 commits September 28, 2023 06:52

Update regex to be more tolerant of weird entries

6a11996

The new regex captures a wider variety of numbers. See this post for details on the regex used: https://stackoverflow.com/a/23872060

viggyd changed the title ~~Module/run/fix scientific notation~~ [Run] Fix Scientific Notation Errors in Calculator Plugin Sep 28, 2023

Fix regular expression failing unit tests

f6a4ef9

Using `[` didn't capture the expression properly. Had to use `(` instead.

ekcom approved these changes Sep 30, 2023

View reviewed changes

src/modules/launcher/Plugins/Microsoft.PowerToys.Run.Plugin.Calculator/CalculateHelper.cs Outdated Show resolved Hide resolved

src/modules/launcher/Plugins/Microsoft.PowerToys.Run.Plugin.Calculator/CalculateHelper.cs Outdated Show resolved Hide resolved

Aradig26 approved these changes Oct 20, 2023

View reviewed changes

jaimecbernardo previously requested changes Oct 24, 2023

View reviewed changes

Allow only for uppercase E

75d2903

Only allow one decimal in second grouping

d2e052e

Support various decimal separator types

0b9762a

Previous regular expression did not allow decimal separators that were not ".". The new expression uses the culture info decimal separator instead, which should work better.

This comment has been minimized.

Sign in to view

Merge branch 'main' into module/run/fix-scientific-notation

9335468

viggyd requested a review from jaimecbernardo December 21, 2023 12:31

Jay-o-Way reviewed Dec 25, 2023

View reviewed changes

stefansjfw reviewed Dec 25, 2023

View reviewed changes

src/modules/launcher/Plugins/Microsoft.PowerToys.Run.Plugin.Calculator/CalculateHelper.cs Outdated Show resolved Hide resolved

viggyd added 3 commits December 26, 2023 16:36

Merge branch 'main' into module/run/fix-scientific-notation

e6a58ef

Only allow integers after E

dbb1398

Remove single use variable

816d653

stefansjfw reviewed Dec 26, 2023

View reviewed changes

Update regex to only accept integers after E

ae1f55d

Missed this expression in my last update. Whoops

stefansjfw reviewed Dec 26, 2023

View reviewed changes

src/modules/launcher/Plugins/Microsoft.PowerToys.Run.Plugin.Calculator/CalculateHelper.cs Outdated Show resolved Hide resolved

Update src/modules/launcher/Plugins/Microsoft.PowerToys.Run.Plugin.Ca…

c8f5210

…lculator/CalculateHelper.cs

viggyd added 2 commits December 28, 2023 08:38

Update NumberTranslator to parse hex as a whole

3893683

Remove hexRegex as object member

99dca23

The hex regex stuff is only used once, there's no need to keep track of it in the object's state. Just create it during the translation process.

viggyd added 2 commits January 2, 2024 07:36

Add unit tests for sci notation in other cultures

cd15af0

Merge branch 'main' into module/run/fix-scientific-notation

30597be

stefansjfw approved these changes Jan 3, 2024

View reviewed changes

stefansjfw merged commit 114f1b4 into microsoft:main Jan 3, 2024

jaimecbernardo mentioned this pull request Dec 4, 2024

[Launcher][Calculator]Allow scientific notation with lowercase 'e' #36187

Merged

7 tasks

[Run] Fix Scientific Notation Errors in Calculator Plugin #28884

[Run] Fix Scientific Notation Errors in Calculator Plugin #28884

Conversation

viggyd commented Sep 28, 2023

Summary of the Pull Request

PR Checklist

Detailed Description of the Pull Request / Additional comments

Validation Steps Performed

viggyd commented Sep 28, 2023

ekcom left a comment

Choose a reason for hiding this comment

Aradig26 left a comment • edited Loading

Choose a reason for hiding this comment

jaimecbernardo left a comment

Choose a reason for hiding this comment

jaimecbernardo commented Oct 24, 2023

viggyd commented Oct 24, 2023

jaimecbernardo commented Oct 24, 2023 • edited Loading

viggyd commented Oct 24, 2023

jaimecbernardo commented Oct 24, 2023

viggyd commented Oct 29, 2023

jaimecbernardo commented Oct 30, 2023

htcfreek commented Nov 14, 2023

stefansjfw commented Dec 11, 2023

viggyd commented Dec 11, 2023

viggyd commented Dec 17, 2023

This comment has been minimized.

Jay-o-Way left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

stefansjfw commented Dec 26, 2023

stefansjfw commented Dec 26, 2023

viggyd commented Dec 26, 2023

viggyd commented Dec 26, 2023

viggyd commented Dec 26, 2023 • edited Loading

stefansjfw commented Dec 26, 2023

viggyd commented Dec 27, 2023

stefansjfw commented Dec 27, 2023

Jay-o-Way commented Dec 27, 2023 • edited Loading

viggyd commented Dec 27, 2023

Jay-o-Way commented Dec 27, 2023 • edited Loading

viggyd commented Dec 28, 2023

Jay-o-Way commented Dec 28, 2023

viggyd commented Jan 2, 2024

stefansjfw left a comment

Choose a reason for hiding this comment

Aradig26 left a comment •

edited

Loading

jaimecbernardo commented Oct 24, 2023 •

edited

Loading

viggyd commented Dec 26, 2023 •

edited

Loading

Jay-o-Way commented Dec 27, 2023 •

edited

Loading

Jay-o-Way commented Dec 27, 2023 •

edited

Loading