Initial work enabling Excel function implementations for handling arrays as arguments when used in "array formulae". #2562
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This is:
Checklist:
Why this change is needed?
See Issue #2551 for details
Currently, the PhpSpreadsheet function implementations ignore array arguments, simply extracting the first value from any array and using that value as the argument. This can give erroneous results, as in the example of the array formula
=MAX(ABS({-3, 2.5, -1; 0, -1, -12}))
where theABS()
function takes only the-3
value from the array, giving a final result from this formula as3
, when the correct result for the formula should actually be12
.This change modifies all of the relevant Excel function implementations so that they can work correctly with the MS Excel handling of array arguments passed to functions, and actually process the arrays rather than reducing them to a single value before processing that value.
How Excel handles array arguments passed to functions
Single array argument
When a single array argument is passed to a function, whether it is a row vector, a column vector, or a matrix, the function will return an array with the same dimensions.
So passing a 2x2 matrix argument for year to the
DATE()
function (=DATE({2020,2021;2022,2023}, 1, 1)
) will result in a 2x2 matrix being returned ({43831, 44197; 44562, 44927}
): A row vector with three elements will return a row vector with three results; a column vector with three elements will return a column vector with three results.Two array arguments
It gets a bit more complicated when more than one array argument is passed to a function, and the size of the resulting array depends on the sizes and types of arguments passed in.
Row vector and Column vector:
Will return a matrix, with as many rows as the row vector, and as many columns as the column vector; so passing a 3 element row vector and a 4 element column vector will result in a 3 row x 4 column matrix.
Row vector and Row vector
Will return a row vector containing as many columns as the smaller of the two vectors; so passing a 3 column row vector and a 5 column row vector will result in a 3 column row vector.
Column vector and Column vector
Will return a column vector containing as many rows as the smaller of the two vectors; so passing a 2 row column vector and a 5 row column vector will result in a 2 row column vector.
Row Vector and Matrix
Will return a result with as many rows as the matrix; but only as many columns as the smaller of the columns in the matrix or the columns in the row vector. So passing a 4 row x 3 column matrix and a 5 column row vector will result in a 4 row by 3 column matrix, while passing a 5 row by 5 column matrix and a 2 column row vector will result in a 5 row by 2 column matrix.
Column vector and Matrix
Will return a result with as many columns as the matrix; but only as many rows as the smaller of the rows in the matrix or the rows in the column vector. So passing a 4 row x 3 column matrix and a 5 row column vector will result in a 4 row by 3 column matrix, while passing a 5 row by 5 column matrix and a 3 row column vector will result in a 5 row by 3 column matrix.
Matrix and Matrix
Will return a result with the small of each dimension from the two matrices; so a 5x2 matrix and a 3x8 matrix will result in a 3x2 matrix
Three or more array arguments
MS Excel gets even more complicated and starts dipping into the realms of non-euclidian geometry (and there lies madness akin to trying to parse HTML markup using regular expressions) when returning n-dimensional arrays of results: in part, MS Excel supports 3-dimensional matrices across multiple worksheets; but that adds a lot of complexity in the Calculation Engine, and introduces a whole new series of inconsistencies.
One approach would have been to flatten the n-dimensional result down to a 2-dimensional result, which would be correct in some cases, but not in others.
So (for the moment) I've decided to throw an Exception if a function is called with more than two array arguments.
This has only one caveat: an array argument that only contains a single value (a 1x1 matrix) is not treated as an array, but as a simple scalar value; so it is not included in this count.