Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Calculations containing #REF! no longer throw exceptions from 1.26.0 #3453

Closed
1 of 8 tasks
ndench opened this issue Mar 13, 2023 · 8 comments · Fixed by #3467
Closed
1 of 8 tasks

Calculations containing #REF! no longer throw exceptions from 1.26.0 #3453

ndench opened this issue Mar 13, 2023 · 8 comments · Fixed by #3467

Comments

@ndench
Copy link
Contributor

ndench commented Mar 13, 2023

This is:

- [x] a bug report
- [ ] a feature request
- [x] **not** a usage question (ask them on https://stackoverflow.com/questions/tagged/phpspreadsheet or https://gitter.im/PHPOffice/PhpSpreadsheet)

What is the expected behavior?

Prior to version 1.26.0 calculating the value of a formula that contains a reference error (eg. =SUM(B1:#REF!)) would result in a

[PhpOffice\PhpSpreadsheet\Calculation\Exception]               
  Worksheet!B4 -> Cell coordinate can not be zero-length string

What is the current behavior?

From 1.26.0 until now (1.28.0) the result is 0 instead of an exception.

What are the steps to reproduce?

<?php

require __DIR__ . '/vendor/autoload.php';

$spreadsheet = new Spreadsheet();
$worksheet = $spreadsheet->getActiveSheet();

$worksheet->setCellValue('B4', '=SUM(B1:#REF!)');

// Throws Calculation\Exception when < 1.26.0
$value = $worksheet->getCell('B4')->getCalculatedValue();

// Returns 0 in 1.26.0+
\var_dump($value);

What features do you think are causing the issue

  • Reader
  • Writer
  • Styles
  • Data Validations
  • Formula Calculations
  • Charts
  • AutoFilter
  • Form Elements

Does an issue affect all spreadsheet file formats? If not, which formats are affected?

I have tested with .xlsx, don't see why it wouldn't affect all other formats.

Which versions of PhpSpreadsheet and PHP are affected?

PHP 8.1.16
PHPSpreadsheet 1.26.0+ (until at least 1.28.0)

@oleibman
Copy link
Collaborator

Throwing an Exception certainly doesn't seem like the right action. Propagating #REF! seems like the right action; it is what Excel does. PhpSpreadsheet is not quite emulating that. SUM is one of those functions where Excel behaves differently depending on whether its string argument is a literal or a cell reference. So, =SUM(1,"x") will yield #VALUE! for both PhpSpreadsheet and Excel; but, if H1 contains X, =SUM(1,H1) will yield 1 for both. For completeness, =SUM(1,sheet2!A1), where sheet2 does not exist, will yield #REF! for both.

The case where PhpSpreadsheet and Excel are not the same is your case, when you provide a literal REF# as an argument. That seems like a very artificial case to me. That doesn't mean we shouldn't try to emulate Excel even in this case, and I will look into it.

@ndench
Copy link
Contributor Author

ndench commented Mar 13, 2023

Thanks for the response @oleibman, I had not realised that this was connected to #2870/#2902. It also, looks like #2902 was released in 1.24.0 but this issue is not present until 1.26.0, so maybe it's not connected?

I agree that propagating #REF! is the right action here because an exception is probably not ideal and neither is returning 0 for an invalid formula. Additionally, propagating #REF! is what excel does in this situation:

Screenshot 2023-03-14 085740

I had also not realised that providing #REF! as a string was a contrived example, as that appears to be what excel does when you delete the cells it's referencing, for example:

  • SUM(A2:H1) is replaced with SUM(#REF!) when deleting columns F-H
  • SUM(Sheet2!A1:Sheet2!H1) is replaced with SUM(#REF!B1:#REF!B4) when deleting Sheet2

However, having just tested both of the above cases using the script I posted above, I can confirm that PHPSpreadsheet returns 0 for both of them in 1.24.0 and 1.28.0.

I still believe this is incorrect since Excel will return #REF! for both these cases so maybe this is not a regression?

PS.

I can confirm that the literal string #REF! is written to the Excel file on save by extracting the .xslsx, so it's probably a case that needs to be handled?

xl/worksheets/sheet1.xml

In cell A1 I entered SUM(A2:H1) and then deleted the columns.
I then created Sheet2 and in cell A2 entered SUM(Sheet2!A1:Sheet2!H1) and then deleted Sheet2.

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main"
           xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" mc:Ignorable="x14ac xr xr2 xr3"
           xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac"
           xmlns:xr="http://schemas.microsoft.com/office/spreadsheetml/2014/revision"
           xr:uid="{9C764240-3645-4F55-8702-5DB1513E66B0}">
    <dimension ref="A1:A2"/>
    <sheetViews>
        <sheetView tabSelected="1" workbookViewId="0">
            <selection activeCell="A3" sqref="A3"/>
        </sheetView>
    </sheetViews>
    <sheetFormatPr defaultRowHeight="15" x14ac:dyDescent="0.25"/>
    <sheetData>
        <row r="1" spans="1:1" x14ac:dyDescent="0.25">
            <c r="A1" t="e">
                <f>SUM(#REF!)</f>
                <v>#REF!</v>
            </c>
        </row>
        <row r="2" spans="1:1" x14ac:dyDescent="0.25">
            <c r="A2" t="e">
                <f>SUM(#REF!)</f>
                <v>#REF!</v>
            </c>
        </row>
    </sheetData>
    <pageMargins left="0.7" right="0.7" top="0.75" bottom="0.75" header="0.3" footer="0.3"/>
</worksheet>

@MarkBaker
Copy link
Member

MarkBaker commented Mar 13, 2023

I'll concur with propagating the Excel '#REF!' error.
I think it also needs some further tests to see what Excel returns when passed a '#REF!' error value as part of a SUM() argument set, not just as the range.

@oleibman
Copy link
Collaborator

@ndench I am not duplicating one of your results in Excel at all. If I take a blank sheet in Excel, and enter =SUM(A2:H1) in cell K1, it shows a value of 0, as expected. (Somewhat less expected, it changes the formula to =SUM(A1:H2), which I can sort of understand even if I didn't expect it; I'm not yet sure how PhpSpreadsheet handles that.) If I now delete columns D:F, cell K1 is now H1 and contains the formula =SUM(A1:E2), you said above it would contain SUM(#REF!). Perhaps I misunderstood your description, or perhaps you are on a different release of Excel (Microsoft 365 Apps for Enterprise for my observation).

I do mostly agree with your 'sheet delete' example. Excel shows the formula as you describe it; however, when you save the workbook, it is saved as SUM(#REF!:#REF!). I had not realized that it overwrote the formula in this case (and I'm not really sure why it finds it necessary to do so); I agree that this makes your report less contrived than I originally thought.

@MarkBaker
Copy link
Member

Somewhat less expected, it changes the formula to =SUM(A1:H2), which I can sort of understand even if I didn't expect it; I'm not yet sure how PhpSpreadsheet handles that.)

PhpSpreadsheet doesn't handle ranges unless they're top-left/bottom-right.
See Issue #3366
I've yet to explore exactly how Excel handles ranges that aren't top-left/bottom-right so that I can decide what needs doing in this case.

If I now delete columns D:F, cell K1 is now H1 and contains the formula =SUM(A1:E2), you said above it would contain SUM(#REF!). Perhaps I misunderstood your description, or perhaps you are on a different release of Excel (Microsoft 365 Apps for Enterprise for my observation).

I do mostly agree with your 'sheet delete' example. Excel shows the formula as you describe it; however, when you save the workbook, it is saved as SUM(#REF!:#REF!). I had not realized that it overwrote the formula in this case (and I'm not really sure why it finds it necessary to do so); I agree that this makes your report less contrived than I originally thought.

I did a PR related to setting '#REF!' on deleting rows/columns; but I can't find which PR it was... need to search when I'm more awake

@ndench
Copy link
Contributor Author

ndench commented Mar 14, 2023

My apologies, it seems my previous comment was riddled with typos. I was working through multiple cases in Excel and seem to have messed up my explanation. I certainly did not mean to use ranges that cross rows/columns.

Here are my examples again, explained correctly with my Excel files attached 😅

  1. In cell A1 write formula =SUM(B1:E1) -> this returns 0 as expected
  2. Create Sheet2
  3. In cell A2 of Sheet1 write formula =SUM(Sheet2!A1:D1) -> this returns 0 as expected
  4. Delete columns B:E in Sheet1 -> A1 now displays #REF!
  5. Delete Sheet2 -> A2 now displays #REF!

Excel file showing steps 1-3: ref-error-step1-3.xlsx
Excel file showing steps 4-5: ref-error-step4-5.xlsx

For reference, I'm using Office Professional Plus 2016 on Windows 10.


Regardless, I think we're largely agreed that PHPSpreadsheet should return #REF! when calculating these formulas. This is also just discussing reading Excel files, not writing them after deleting columns/sheets. I'm unsure if PHPSpreadsheet copies Excel and changes the formula to #REF! as well.


I think it also needs some further tests to see what Excel returns when passed a '#REF!' error value as part of a SUM() argument set, not just as the range.

I've tested with Excel a few different things.

  1. Delete parts of a range included in a SUM() -> =SUM(B1:E1) and delete columns C and D. The formula is updated to SUM(B1:C1)
  2. Manually enter =SUM(B1:#REF!) (I can't figure out a way for excel to make this formula on it's own) -> This returns #REF!
  3. Delete cells called out specifically in a SUM() -> SUM(B1,C1,D1) and delete column C. The formula is updated to SUM(B1,#REF!,C1)

@MarkBaker
Copy link
Member

MarkBaker commented Mar 14, 2023

enter =SUM(B1:#REF!) (I can't figure out a way for excel to make this formula on it's own)

=SUM(B1:INDIRECT("XYZ"))

If you store a cellRef (e.g. C1) as a string in cell A2, then set A1 to =SUM(B1:INDIRECT(A2))

Then delete row 2; this gives a nested '#REF!' for the INDIRECT() call that should trickle up: =SUM(B1:INDIRECT(#REF!))

oleibman added a commit to oleibman/PhpSpreadsheet that referenced this issue Mar 18, 2023
Fix PHPOffice#3453. User sets a valid formula (e.g. `=SUM(Sheet2!B1:Sheet2!B3)`), and then does something to invalidate the formula (e.g. delete Sheet2). Excel changes the formula to `SUM(#REF!:#REF!)` when the spreadsheet is saved; apparently someone thought this was a good idea. But PhpSpreadsheet (a) used to throw an Exception when it evaluated the formula, and (b) now gives a result of `0` when evaluating the formula. Neither is ideal. It would be better to propagate the `#REF!` error.

It is likely that more tests are needed, which is why I will keep this in draft status for a bit.
@oleibman
Copy link
Collaborator

@ndench @MarkBaker Please test and suggest additional tests for PR #3467. I think it handles everything discussed above, but there are probably other situations that need to be addressed.

oleibman added a commit that referenced this issue Mar 25, 2023
Fix #3453. User sets a valid formula (e.g. `=SUM(Sheet2!B1:Sheet2!B3)`), and then does something to invalidate the formula (e.g. delete Sheet2). Excel changes the formula to `SUM(#REF!:#REF!)` when the spreadsheet is saved; apparently someone thought this was a good idea. But PhpSpreadsheet (a) used to throw an Exception when it evaluated the formula, and (b) now gives a result of `0` when evaluating the formula. Neither is ideal. It would be better to propagate the `#REF!` error.

It is likely that more tests are needed, which is why I will keep this in draft status for a bit.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging a pull request may close this issue.

3 participants