-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Csv Handling of Booleans (and an 8.1 Deprecation) #2232
Conversation
PhpSpreadsheet writes boolean values to a Csv as null-string/1, and treats input values of 'true' and 'false' as if they were strings. On the other hand, Excel writes boolean values to a Csv as TRUE/FALSE, and case-insensitively treats a matching string as boolean on read. This PR changes PhpSpreadsheet to match Excel. A side-effect of this change is that it fixes behavior incorrectly reported as a bug in PR PHPOffice#2048. That issue was closed, correctly, as user error. The user had altered Csv Writer, including adding ```declare(strict_types=1);```; that declaration was the cause of the error. The "offending" statements, calls to strpbrk and str_replace, will now work correctly whether or not strict_types is in use. And, just as I was getting ready to push this, the dailies for PHP 8.1 introduced a change deprecating auto_detect_line_endings. Csv Reader uses that setting; it allows it to process a Csv with Mac line endings, which happens to be something that Excel can do. As they say in https://wiki.php.net/rfc/deprecations_php_8_1, where the proposal passed without a single dissenting vote, "These newlines were used by “Classic” Mac OS, a system which has been discontinued in 2001, nearly two decades ago. Interoperability with such systems is no longer relevant." I tend to agree, but I don't know that we're ready to pull the plug yet. I don't see an easy way to emulate that functionality. For now, I have silenced the deprecation notices with at signs. I have also added a test case which will fail when support for that setting is pulled; this will give time to consider alternatives.
This could be interesting. It doesn't like not handling an error condition for ini_set. Let's see if this satisfies it.
Hi @oleibman, I just stumbled upon this after seeing unit tests in my project fail after updating PHPSpreadsheet to 1.19.0. Thanks for looking into this again! |
My apologies for the inconvenience. PhpSpreadsheet CSV Reader had not been handling booleans correctly before. I think it is possible that you needed StringValueBinder for that reason, and that you can drop it now (using the default value binder, you get TRUE/FALSE as expected). It is also possible that we need to cover this situation as an extra option in the new StringValueBinder; I need to think about that. In the meantime, if default value binder won't do, the following workaround should work for both 18 and 19: class StringValueBinder2 extends StringValueBinder {
public function bindValue(Cell $cell, $value)
{
if (!method_exists($this, 'setBooleanConversion')) {
return parent::bindValue($cell, $value);
}
if (is_object($value)) {
return $this->bindObjectValue($cell, $value);
}
// sanitize UTF-8 strings
if (is_string($value)) {
$value = StringHelper::sanitizeUTF8($value);
}
if ($value === null && $this->convertNull === false) {
$cell->setValueExplicit($value, DataType::TYPE_NULL);
} elseif (is_bool($value) && $this->convertBoolean === false) {
$cell->setValueExplicit($value, DataType::TYPE_BOOL);
} elseif ((is_int($value) || is_float($value)) && $this->convertNumeric === false) {
$cell->setValueExplicit($value, DataType::TYPE_NUMERIC);
} elseif (is_string($value) && strlen($value) > 1 && $value[0] === '=' && $this->convertFormula === false) {
$cell->setValueExplicit($value, DataType::TYPE_FORMULA);
} else {
if (is_bool($value)) {
$value = $value ? 'TRUE' : 'FALSE';
}
if (is_string($value) && strlen($value) > 1 && $value[0] === '=') {
$cell->getStyle()->setQuotePrefix(true);
}
$cell->setValueExplicit((string) $value, DataType::TYPE_STRING);
}
return true;
}
} Then use your own StringValueBinder2 rather than StringValueBinder. This overrides only method bindValue, and only by using the |
Hi @oleibman, thanks for the quick reply! Using a custom value binder is what I've been thinking about now, too, but I fear that this approach (e.g. with your suggested Thank a lot! |
Hi @oleibman, I just realized that I posted my comment above to a closed/merged PR. Should I instead maybe post this as a new issue to get more eyes/potential ideas on this? Thanks for your guidance! Best wishes, |
This is good enough for now. |
See the discussion in PR PHPOffice#2232 which came about 3 months after it was merged. It caused a problem in an unusual situation which did not come to light until the change was part of the new release version. The original PR changed PhpSpreadsheet's behavior to match Excel's for (not case sensitive) strings `TRUE` and `FALSE`. Excel treats the values as boolean, and now so does PhpSpreadsheet. When StringValueBinder is used, this becomes a tricky situation. The user wants the original strings preserved, including the case of all the letters. This PR changes the behavior of CSV reader as follows: - If StringValueBinder is not in effect, convert to boolean. - If StringValueBinder (actually any binder with method getBooleanConversion) is in effect, and the result of getBooleanConversion is true (which is the default in StringValueBinder), leave the value coming out of Csv Reader as the unchanged string. - Otherwise, convert to boolean. This should mean that there are no regression problems with StringValueBinder, while allowing PhpSpreadsheet to continue to match Excel in the default situation. No new settings are required.
…text that looks like boolean in to Boolean (see PHPOffice#2232 (comment) ).
See the discussion in PR #2232 which came about 3 months after it was merged. It caused a problem in an unusual situation which did not come to light until the change was part of the new release version. The original PR changed PhpSpreadsheet's behavior to match Excel's for (not case sensitive) strings `TRUE` and `FALSE`. Excel treats the values as boolean, and now so does PhpSpreadsheet. When StringValueBinder is used, this becomes a tricky situation. The user wants the original strings preserved, including the case of all the letters. This PR changes the behavior of CSV reader as follows: - If StringValueBinder is not in effect, convert to boolean. - If StringValueBinder (actually any binder with method getBooleanConversion) is in effect, and the result of getBooleanConversion is true (which is the default in StringValueBinder), leave the value coming out of Csv Reader as the unchanged string. - Otherwise, convert to boolean. This should mean that there are no regression problems with StringValueBinder, while allowing PhpSpreadsheet to continue to match Excel in the default situation. No new settings are required.
PhpSpreadsheet writes boolean values to a Csv as null-string/1, and treats input values of 'true' and 'false' as if they were strings. On the other hand, Excel writes boolean values to a Csv as TRUE/FALSE, and case-insensitively treats a matching string as boolean on read. This PR changes PhpSpreadsheet to match Excel.
A side-effect of this change is that it fixes behavior incorrectly reported as a bug in PR #2048. That issue was closed, correctly, as user error. The user had altered Csv Writer, including adding
declare(strict_types=1);
; that declaration was the cause of the error. The "offending" statements, calls to strpbrk and str_replace, will now work correctly whether or not strict_types is in use.And, just as I was getting ready to push this, the dailies for PHP 8.1 introduced a change deprecating auto_detect_line_endings. Csv Reader uses that setting; it allows it to process a Csv with Mac line endings, which happens to be something that Excel can do. As they say in https://wiki.php.net/rfc/deprecations_php_8_1, where the proposal passed without a single dissenting vote, "These newlines were used by “Classic” Mac OS, a system which has been discontinued in 2001, nearly two decades ago. Interoperability with such systems is no longer relevant." I tend to agree, but I don't know that we're ready to pull the plug yet. I don't see an easy way to emulate that functionality. For now, I have silenced the deprecation notices with at signs. I have also added a test case which will fail when support for that setting is pulled; this will give time to consider alternatives.
This is:
Checklist:
Why this change is needed?