xlrd: in2csv fails to load codepage 21010 xls files #859
Comments
Sample file from linked issues: https://www.dropbox.com/s/fubuqla710n64iz/passbookInq.xls |
(That file 404s.) @acook Can you provide a file that produces the error? Otherwise impossible to test. My macOS version of Excel (v15.36) doesn't create files like this. |
Ah I didn't realize the other file was missing, my mistake. Hmm, looks like GitHub refuses to allow me to upload an Further research on my end indicates the codepage 21010 XLS files may be generated by some other tool, not Mac Excel as I first assumed (since the test file I received came from a Mac user and I've had other similar issues with Mac Excel recently). (incidentally, the info in the spreadsheet is all fake generated data) |
So it's an upstream bug in xlrd (which we use to read XLS files), and there are existing issues: https://github.com/python-excel/xlrd/issues/218 |
It did seem to originate in Why does |
The |
Ah bummer. I think XLRD does allow the encoding to be specified, which may mitigate this issue. If the XLRD maintainers get back to me, I can see what there is to be done to fix the problem at that level. Otherwise, looking into a different backend for |
I added an |
This is a common issue in CSV parsers/Excel exporters apparently:
It seems to be generated by macOS versions of Excel.
Basically, when encountering codepage 21010, the reader should interpret it as codepage 1200 (AKA UTF-16LE).
Ideally this would be handled programmatically. However, even passing in --encoding utf-16le (or other variations) seems to have no effect on in2csv, so it might be ignoring the encoding argument?
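For illustration, here is a rough sketch of the "treat 21010 as 1200" idea in plain Python. It is not taken from xlrd or csvkit: the `CODEPAGE_ALIASES` table and the `codec_for_codepage` helper are hypothetical names made up for this example, and the only assumption baked in is the one stated above, that these files are really UTF-16LE.

```python
import codecs

# Hypothetical alias table: nonstandard codepage -> codepage to treat it as.
# The single entry reflects the assumption that 21010 really means 1200.
CODEPAGE_ALIASES = {21010: 1200}


def codec_for_codepage(codepage):
    """Return a Python codec name for a workbook codepage number."""
    codepage = CODEPAGE_ALIASES.get(codepage, codepage)
    if codepage == 1200:
        # Codepage 1200 is UTF-16LE.
        return "utf_16_le"
    name = "cp%d" % codepage
    # Raises LookupError if Python has no codec with this name.
    codecs.lookup(name)
    return name


if __name__ == "__main__":
    raw = "passbook".encode("utf-16-le")
    print(raw.decode(codec_for_codepage(21010)))
```

A name like this could presumably be passed as the `encoding_override` argument of `xlrd.open_workbook`; whether that is enough to read these particular files is untested here.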