Replaced self-implemented CSV file reader with Apache CSVParser for CLI #474
fun readCsvWithDoubleQuotesEscape() {
    writeFile("data_with_double_quotes_escape.csv", "\"1,2\",2")

    val args = listOf("\"${dirPath("data_with_double_quotes_escape.csv")}\"", "{type:\"csv\"}").map { it.exprValue() }
Can the user specify Ion symbols instead of strings? (I think they should.) Need at least one test for that too.
To be clear, I mean Ion symbols for the value of type in that struct, i.e. { type: csv }.
Codecov Report
@@             Coverage Diff              @@
##               main     #474      +/-   ##
============================================
- Coverage     82.39%   82.38%   -0.01%
  Complexity     1406     1406
============================================
  Files           171      171
  Lines         10819    10816       -3
  Branches       1782     1781       -1
============================================
- Hits           8914     8911       -3
  Misses         1361     1361
  Partials        544      544
This PR aims to resolve the problems with the CLI reading CSV files reported in Issue #366.
Problem Description
Originally, we implemented the CSV file reader for the CLI ourselves. That implementation does not follow the standard CSV format defined in RFC 4180, which causes the following problems:
Solution Description
This PR migrates from the self-implemented CSV reader to the standard Apache CSVParser library. As a result, all of the above problems except the last one are solved. Adopting this library also makes it easier to add more functionality to the CLI file reader, such as reading files in other formats.
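For reviewers unfamiliar with RFC 4180 quoting, the sketch below shows why the row used in readCsvWithDoubleQuotesEscape needs a quote-aware reader. It assumes, purely for illustration, a reader that splits on the delimiter without honoring quotes. The splitQuoted helper is hypothetical: it is neither the code in this PR nor the Apache library's implementation, and it handles only a single line with quoted fields and doubled quotes.

```kotlin
// Hypothetical helper for illustration only; not this PR's code and not the Apache library.
// Splits one line into fields following the RFC 4180 quoting rules:
// a quoted field may contain the delimiter, and "" inside quotes is a literal quote.
fun splitQuoted(line: String, delimiter: Char = ','): List<String> {
    val fields = mutableListOf<String>()
    val current = StringBuilder()
    var inQuotes = false
    var i = 0
    while (i < line.length) {
        val c = line[i]
        when {
            inQuotes && c == '"' && i + 1 < line.length && line[i + 1] == '"' -> {
                current.append('"') // doubled quote inside a quoted field
                i++                 // skip the second quote of the pair
            }
            c == '"' -> inQuotes = !inQuotes // opening or closing quote
            c == delimiter && !inQuotes -> {
                fields.add(current.toString()) // field boundary outside quotes
                current.setLength(0)
            }
            else -> current.append(c)
        }
        i++
    }
    fields.add(current.toString()) // last field has no trailing delimiter
    return fields
}

fun main() {
    val line = "\"1,2\",2"            // the row written by the test above
    println(line.split(",").size)     // naive split sees 3 fields
    println(splitQuoted(line).size)   // quote-aware split sees 2 fields
}
```

The delimiter parameter is a Char here, mirroring the type the PR description says CSVParser accepts. Multi-line quoted fields and malformed input are deliberately out of scope for this sketch.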
Changes Details
- Use the CSVParser library to parse the CSV file.
- Use the CSVPrinter library to help print the CSV file.
- Change delimiter from String to Char, since the CSVParser library only accepts the delimiter as a Char.
For reviewers:
First review 'DelimitedValues.kt', then 'DelimitedValuesTest.kt'. Other changes are minor.
It might take up to 1.5 hours to review all the changes.