-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Keep Calculated String Results Below 32K #2921
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This is the result of an investigation into issue PHPOffice#2884 (see also PR PHPOffice#2913). It is, unfortunately, not a fix for the original problem; see the discussion in that PR for why I don't think there is a practical fix for that specific problem at this time. Excel limits strings to 32,767 characters. We already truncate strings to that length when added to the spreadsheet. However, we have been able to exceed that length as a result of the concatenation operator (Excel truncates); as a result of the CONCATENATE or TEXTJOIN functions (Excel returns #CALC!); or as a result of the REPLACE, REPT, SUBSTITUTE functions (Excel returns #VALUE!). This PR changes PhpSpreadsheet to return the same value as Excel in these cases. Note that Excel2003 truncates in all those cases; I don't think there is a way to differentiate that behavior in PhpSpreadsheet. However, LibreOffice and Gnumeric do not have that limit; if they have a limit at all, it is much higher. It would be fairly easy to use existing settings to differentiate between Excel and LibreOffice/Gnumeric in this respect. I have not done so in this PR because I am not sure how useful that is, and I can easily see it leading to problems (read in a LibreOffice spreadsheet with a 33K cell and then output to an Excel spreadsheet). Perhaps it should be handled with an additional opt-in setting. I changed the maximum size from a literal to a constant in the one place where it was already being enforced (Cell/DataType). I am not sure that is the best place for it to be defined; I am open to suggestions.
MarkBaker
reviewed
Jul 2, 2022
MarkBaker
reviewed
Jul 2, 2022
MarkBaker
reviewed
Jul 2, 2022
MarkBaker
added a commit
that referenced
this pull request
Jul 9, 2022
Note that this will be the last 1.x branch release before the 2.x release. We will maintain both branches in parallel for a time; but users are requested to update to version 2.0 once that is fully available. ### Added - Added `removeComment()` method for Worksheet [PR #2875](https://github.com/PHPOffice/PhpSpreadsheet/pull/2875/files) - Add point size option for scatter charts [Issue #2298](#2298) [PR #2801](#2801) - Basic support for Xlsx reading/writing Chart Sheets [PR #2830](#2830) Note that a ChartSheet is still only written as a normal Worksheet containing a single chart, not as an actual ChartSheet. - Added Worksheet visibility in Ods Reader [PR #2851](#2851) and Gnumeric Reader [PR #2853](#2853) - Added Worksheet visibility in Ods Writer [PR #2850](#2850) - Allow Csv Reader to treat string as contents of file [Issue #1285](#1285) [PR #2792](#2792) - Allow Csv Reader to store null string rather than leave cell empty [Issue #2840](#2840) [PR #2842](#2842) - Provide new Worksheet methods to identify if a row or column is "empty", making allowance for different definitions of "empty": - Treat rows/columns containing no cell records as empty (default) - Treat cells containing a null value as empty - Treat cells containing an empty string as empty ### Changed - Modify `rangeBoundaries()`, `rangeDimension()` and `getRangeBoundaries()` Coordinate methods to work with row/column ranges as well as with cell ranges and cells [PR #2926](#2926) - Better enforcement of value modification to match specified datatype when using `setValueExplicit()` - Relax validation of merge cells to allow merge for a single cell reference [Issue #2776](#2776) - Memory and speed improvements, particularly for the Cell Collection, and the Writers. See [the Discussion section on github](#2821) for details of performance across versions - Improved performance for removing rows/columns from a worksheet ### Deprecated - Nothing ### Removed - Nothing ### Fixed - Xls Reader resolving absolute named ranges to relative ranges [Issue #2826](#2826) [PR #2827](#2827) - Null value handling in the Excel Math/Trig PRODUCT() function [Issue #2833](#2833) [PR #2834](#2834) - Invalid Print Area defined in Xlsx corrupts internal storage of print area [Issue #2848](#2848) [PR #2849](#2849) - Time interval formatting [Issue #2768](#2768) [PR #2772](#2772) - Copy from Xls(x) to Html/Pdf loses drawings [PR #2788](#2788) - Html Reader converting cell containing 0 to null string [Issue #2810](#2810) [PR #2813](#2813) - Many fixes for Charts, especially, but not limited to, Scatter, Bubble, and Surface charts. [Issue #2762](#2762) [Issue #2299](#2299) [Issue #2700](#2700) [Issue #2817](#2817) [Issue #2763](#2763) [Issue #2219](#2219) [Issue #2863](#2863) [PR #2828](#2828) [PR #2841](#2841) [PR #2846](#2846) [PR #2852](#2852) [PR #2856](#2856) [PR #2865](#2865) [PR #2872](#2872) [PR #2879](#2879) [PR #2898](#2898) [PR #2906](#2906) [PR #2922](#2922) [PR #2923](#2923) - Adjust both coordinates for two-cell anchors when rows/columns are added/deleted. [Issue #2908](#2908) [PR #2909](#2909) - Keep calculated string results below 32K. [PR #2921](#2921) - Filter out illegal Unicode char values FFFE/FFFF. [Issue #2897](#2897) [PR #2910](#2910) - Better handling of REF errors and propagation of all errors in Calculation engine. [PR #2902](#2902) - Calculating Engine regexp for Column/Row references when there are multiple quoted worksheet references in the formula [Issue #2874](#2874) [PR #2899](#2899)
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This is the result of an investigation into issue #2884 (see also PR #2913). It is, unfortunately, not a fix for the original problem; see the discussion in that PR for why I don't think there is a practical fix for that specific problem at this time.
Excel limits strings to 32,767 characters. We already truncate strings to that length when added to the spreadsheet. However, we have been able to exceed that length as a result of the concatenation operator (Excel truncates); as a result of the CONCATENATE or TEXTJOIN functions (Excel returns #CALC!); or as a result of the REPLACE, REPT, SUBSTITUTE functions (Excel returns #VALUE!). This PR changes PhpSpreadsheet to return the same value as Excel in these cases. Note that Excel2003 truncates in all those cases; I don't think there is a way to differentiate that behavior in PhpSpreadsheet.
However, LibreOffice and Gnumeric do not have that limit; if they have a limit at all, it is much higher. It would be fairly easy to use existing settings to differentiate between Excel and LibreOffice/Gnumeric in this respect. I have not done so in this PR because I am not sure how useful that is, and I can easily see it leading to problems (read in a LibreOffice spreadsheet with a 33K cell and then output to an Excel spreadsheet). Perhaps it should be handled with an additional opt-in setting.
I changed the maximum size from a literal to a constant in the one place where it was already being enforced (Cell/DataType). I am not sure that is the best place for it to be defined; I am open to suggestions.
This is:
Checklist:
Why this change is needed?
Provide an explanation of why this change is needed, with links to any Issues (if appropriate).
If this is a bugfix or a new feature, and there are no existing Issues, then please also create an issue that will make it easier to track progress with this PR.