Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handling of #REF! Errors in Subtotal, and More #2902

Merged
merged 4 commits into from
Jun 26, 2022

Conversation

oleibman
Copy link
Collaborator

This PR derives from, and supersedes, PR #2870, submitted by @ndench. The problem reported in the original is that SUBTOTAL does not handle #REF! errors in its arguments properly; however, my investigation has enlarged the scope.

The main problem is in Calculation, and it has a simple fix. When the calculation engine finds a reference to an uninitialized cell, it uses null as the value. This is appropriate when the cell belongs to a defined sheet; however, for an undefined sheet, #REF! is more appropriate.

With that fix in place, SUBTOTAL still needs a small fix of its own. It tries to parse its cell reference arguments into an array, but, if the reference does not match the expected format (as #REF! will not), this results in referencing undefined array indexes, with attendant messages. That assignment is changed to be more flexible, eliminating the problem and the messages.

Those 2 fixes are sufficient to ensure that the original problem is resolved. It also resolves a similar problem with some other functions (e.g. SUM). However, it does not resolve it for all functions. Or, to be more particular, many functions will return #VALUE! rather than #REF! if this arises, and the same is true for other errors in the function arguments, e.g. #DIV/0!. This PR does not attempt to address all functions; I need to think of a systematic way to pursue that. However, at least for most MathTrig functions, which validate their arguments using a common method, it is relatively easy to get the function to propagate the proper error result.

This is:

- [x] a bugfix
- [ ] a new feature
- [ ] refactoring
- [x] additional unit tests

Checklist:

Why this change is needed?

Provide an explanation of why this change is needed, with links to any Issues (if appropriate).
If this is a bugfix or a new feature, and there are no existing Issues, then please also create an issue that will make it easier to track progress with this PR.

This PR derives from, and supersedes, PR PHPOffice#2870, submitted by @ndench. The problem reported in the original is that SUBTOTAL does not handle #REF! errors in its arguments properly; however, my investigation has enlarged the scope.

The main problem is in Calculation, and it has a simple fix. When the calculation engine finds a reference to an uninitialized cell, it uses `null` as the value. This is appropriate when the cell belongs to a defined sheet; however, for an undefined sheet, #REF! is more appropriate.

With that fix in place, SUBTOTAL still needs a small fix of its own. It tries to parse its cell reference arguments into an array, but, if the reference does not match the expected format (as #REF! will not), this results in referencing undefined array indexes, with attendant messages. That assignment is changed to be more flexible, eliminating the problem and the messages.

Those 2 fixes are sufficient to ensure that the original problem is resolved. It also resolves a similar problem with some other functions (e.g. SUM). However, it does not resolve it for all functions. Or, to be more particular, many functions will return #VALUE! rather than #REF! if this arises, and the same is true for other errors in the function arguments, e.g. #DIV/0!. This PR does not attempt to address all functions; I need to think of a systematic way to pursue that. However, at least for most MathTrig functions, which validate their arguments using a common method, it is relatively easy to get the function to propagate the proper error result.
@oleibman
Copy link
Collaborator Author

Yipe! Php80 is reporting error "Cannot use positional argument after named argument". This seems to be because SubTotal is using FlattenArrayIndexed, which creates an array of the form stringkey1=>value1 intkey2=>value2, and the string key is treated as a named argument, while the intkey is treated as positional. Changing SubTotal to use FlattenArray seems to work, but there are other functions which use FlattenArrayIndexed. This may well be a situation which has not existed in the test suite before this. Research is needed.

@oleibman
Copy link
Collaborator Author

Unsurprisingly, using FlattenArray rather than FlattenArrayFixed fixes the problems reported in this test, but causes regression problems with other Subtotal tests.

Problem with Php8.0+ - array passed to call_user_func_array must have int keys before string keys, otherwise Php thinks we are passing positional parameters after keyword parameters.

7 other functions use flattenArrayIndexed, but Subtotal is the only one which uses that result to subsequently pass arguments to call_user_func_array. So the others should not require a change. A specific test is added for SUM to validate that conclusion.
@oleibman oleibman merged commit a895721 into PHPOffice:master Jun 26, 2022
MarkBaker added a commit that referenced this pull request Jul 9, 2022
Note that this will be the last 1.x branch release before the 2.x release. We will maintain both branches in parallel for a time; but users are requested to update to version 2.0 once that is fully available.

### Added

- Added `removeComment()` method for Worksheet [PR #2875](https://github.com/PHPOffice/PhpSpreadsheet/pull/2875/files)
- Add point size option for scatter charts [Issue #2298](#2298) [PR #2801](#2801)
- Basic support for Xlsx reading/writing Chart Sheets [PR #2830](#2830)

  Note that a ChartSheet is still only written as a normal Worksheet containing a single chart, not as an actual ChartSheet.

- Added Worksheet visibility in Ods Reader [PR #2851](#2851) and Gnumeric Reader [PR #2853](#2853)
- Added Worksheet visibility in Ods Writer [PR #2850](#2850)
- Allow Csv Reader to treat string as contents of file [Issue #1285](#1285) [PR #2792](#2792)
- Allow Csv Reader to store null string rather than leave cell empty [Issue #2840](#2840) [PR #2842](#2842)
- Provide new Worksheet methods to identify if a row or column is "empty", making allowance for different definitions of "empty":
  - Treat rows/columns containing no cell records as empty (default)
  - Treat cells containing a null value as empty
  - Treat cells containing an empty string as empty

### Changed

- Modify `rangeBoundaries()`, `rangeDimension()` and `getRangeBoundaries()` Coordinate methods to work with row/column ranges as well as with cell ranges and cells [PR #2926](#2926)
- Better enforcement of value modification to match specified datatype when using `setValueExplicit()`
- Relax validation of merge cells to allow merge for a single cell reference [Issue #2776](#2776)
- Memory and speed improvements, particularly for the Cell Collection, and the Writers.

  See [the Discussion section on github](#2821) for details of performance across versions
- Improved performance for removing rows/columns from a worksheet

### Deprecated

- Nothing

### Removed

- Nothing

### Fixed

- Xls Reader resolving absolute named ranges to relative ranges [Issue #2826](#2826) [PR #2827](#2827)
- Null value handling in the Excel Math/Trig PRODUCT() function [Issue #2833](#2833) [PR #2834](#2834)
- Invalid Print Area defined in Xlsx corrupts internal storage of print area [Issue #2848](#2848) [PR #2849](#2849)
- Time interval formatting [Issue #2768](#2768) [PR #2772](#2772)
- Copy from Xls(x) to Html/Pdf loses drawings [PR #2788](#2788)
- Html Reader converting cell containing 0 to null string [Issue #2810](#2810) [PR #2813](#2813)
- Many fixes for Charts, especially, but not limited to, Scatter, Bubble, and Surface charts. [Issue #2762](#2762) [Issue #2299](#2299) [Issue #2700](#2700) [Issue #2817](#2817) [Issue #2763](#2763) [Issue #2219](#2219) [Issue #2863](#2863) [PR #2828](#2828) [PR #2841](#2841) [PR #2846](#2846) [PR #2852](#2852) [PR #2856](#2856) [PR #2865](#2865) [PR #2872](#2872) [PR #2879](#2879) [PR #2898](#2898) [PR #2906](#2906) [PR #2922](#2922) [PR #2923](#2923)
- Adjust both coordinates for two-cell anchors when rows/columns are added/deleted. [Issue #2908](#2908) [PR #2909](#2909)
- Keep calculated string results below 32K. [PR #2921](#2921)
- Filter out illegal Unicode char values FFFE/FFFF. [Issue #2897](#2897) [PR #2910](#2910)
- Better handling of REF errors and propagation of all errors in Calculation engine. [PR #2902](#2902)
- Calculating Engine regexp for Column/Row references when there are multiple quoted worksheet references in the formula [Issue #2874](#2874) [PR #2899](#2899)
@oleibman oleibman deleted the issue2870 branch July 23, 2022 23:18
oleibman added a commit to oleibman/PhpSpreadsheet that referenced this pull request Jun 30, 2024
Fix PHPOffice#2581 (not obvious - see next paragraph for explanation). This continues the work of PR PHPOffice#2902 (and also PR PHPOffice#3467) to have errors propagated through function calculations rather than treating them as strings. All text functions, and the concatenation operator, are addressed in this PR.

In the original issue, the spreadsheet being loaded uses the result of an unimplemented function as an argument to another function. When `getCalculatedValue` is used on the cell in question, the result is returned as `#VALUE!`. If the cell had just contained a function call to the unimplemented function, getCalculatedValue would have recognized the situation and returned oldCalculatedValue as the result. Not perfect, but good enough most of the time. User would like oldCalculatedValue returned here as well, which seems like a reasonable request.

PhpSpreadsheet always returns `#Not Yet Implemented` as the result for a function which it knows about but which is not yet implemented. That is the key to the `Cell` class being able to substitute oldCalculatedValue in the first place. However, in order to do that for the issue in question, that result has to be propagated to any functions for which the result is an argument. I don't want to add unimplemented to the list of known error codes, but I am willing to add a parameter to `ErrorValue::isError` to indicate whether that value should be considered an error (default is "no").

The first use of that new parameter would be by the text functions. They go through a common Helper routine, so it is pretty easily implemented. And, as it turns out, most of the text functions do not currently propagate errors, e.g. if A1 results in a value error, `=LEFT(A1,2)` will result in `#V` rather than `#VALUE!`. With this PR, they will now be handled correctly.
@oleibman oleibman mentioned this pull request Jun 30, 2024
11 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

1 participant