Encoders error when dealing with Chinese text #15
Comments
Thanks for the report, @BluSteve. I'll take a look by the end of the week, likely over the weekend. Cheers!
Hey @BluSteve, unfortunately, I cannot reproduce the issue. I tried to fix the issue blindly by replacing "chardet" with the hardcoded "utf8" encoding. Would you mind trying the branch remove-chardet and telling me if it works for you? Before testing, please don't forget to remove all files from the |
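For readers following along, here is a minimal sketch of what "replacing chardet with a hardcoded utf8 encoding" could look like. The helper name `decode_page` is hypothetical and is not taken from the project's code; the only assumption is that the page arrives as raw bytes that are in fact UTF-8.

```python
def decode_page(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode raw response bytes with a fixed encoding instead of guessing it.

    Hypothetical helper for illustration; not the project's actual function.
    """
    # errors="replace" maps undecodable bytes to U+FFFD instead of raising,
    # so a single bad byte does not abort the whole request.
    return raw.decode(encoding, errors="replace")


# Example: bytes as they might arrive for an English-to-Chinese lookup.
raw = "envisage: 设想".encode("utf-8")
text = decode_page(raw)
```

Skipping detection removes one source of platform-dependent behavior, at the cost of mis-decoding any page that is not actually UTF-8.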
I tried doing the same
It looks like it's a Poetry issue this time. It's been reported here, and there is a workaround, provided both in the comment and in this StackOverflow answer. As suggested in the workaround, please try exporting dependencies and installing them with pip.
That fix for Poetry didn't work for me for whatever reason, but a simple cache clear did the trick. 🤷 Good news! The remove-chardet branch works fine with special characters. I've also checked master again in the exact same environment, and it returned the same error as before. The issue seems to be fixed with remove-chardet. Thank you!
Yay 🎉 Thanks for bringing good news. The PR is merged, and a new version, v2.2.0, has been released.
Hi all,
http://127.0.0.1:8000/api/v2/external_sources?query=envisage&src=en&dst=zh
https://www.linguee.com/english-chinese/translation/envisage.html
Try translating "envisage" to Chinese. On Heroku it works perfectly fine, but when I install it with Poetry using the instructions in /docs, I get an error. The local install works fine for most other languages, as far as I can tell, but fails on Chinese due to some encoding problem. English to Swedish (sv) fails as well, with the error in a different place.
en to zh:
en to sv:
I've tried hardcoding 'utf-8' in line 20 of file_cache.py and line 10 of utils.py to no avail.
page_html in parsers.py (line 55 gives the error) is an empty string. Two Windows 10 computers show the same behavior. I have not tried it on Linux yet.
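As an illustration of the kind of change described above (hardcoding 'utf-8' where files are read and written), here is a small sketch. The function names `write_cache` and `read_cache` are hypothetical and do not come from file_cache.py or utils.py. The relevant background is that Python's `open()` in text mode falls back to the locale's preferred encoding when `encoding` is omitted, which on Windows is often a legacy code page rather than UTF-8; whether that is the cause here is not established.

```python
from pathlib import Path


def write_cache(path: Path, text: str) -> None:
    """Write text to a cache file, always as UTF-8 (hypothetical helper)."""
    # Passing encoding explicitly avoids depending on the OS locale default.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


def read_cache(path: Path) -> str:
    """Read a cache file back, assuming it was written as UTF-8."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()
```

Cache files written earlier with a different encoding would still be read back incorrectly, so a change like this generally needs to be paired with clearing the old cache.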