Problem deserializing Tokenizer on Windows (spaCy 2.0.3) #1634
Comments
Thanks for the report! It looks like something is going wrong when deserializing the tokenizer:
In any case, it looks like there might be a problem with the serialization of the tokenizer on Windows. Will look into this! To help us debug: Are you using any custom tokenization rules?
Thanks for your quick answer,
Thanks – definitely looks like a serialization bug then. The tests for this are currently incomplete, because the output of
I built a model a week ago and successfully loaded it from my Not sure what updated, I didn't run any pip installs in quite a while, but suddenly I get the same error when using
Just to add another data point, we're seeing the same issue with
Strangely enough, a third computer was able to use the same model... I'm trying to figure out what machines 1 and 3 have in common that machine 2 doesn't; I'll update the thread if I come up with something.
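One way to narrow down how the machines differ is to check whether the saved model files are byte-for-byte identical on each of them. A minimal sketch using only the standard library; `hash_model_dir` and `diff_manifests` are hypothetical helper names (not part of spaCy), and the model directory is whatever path the model was saved to:

```python
import hashlib
from pathlib import Path


def hash_model_dir(model_dir):
    """Map each file under model_dir (relative POSIX path) to its SHA-256 hex digest."""
    root = Path(model_dir)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # read_bytes() opens in binary mode, so no newline translation happens here.
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def diff_manifests(a, b):
    """Return the sorted names of files that are missing from, or differ between, two manifests."""
    return sorted(name for name in set(a) | set(b) if a.get(name) != b.get(name))
```

Running `hash_model_dir` on each machine, saving the result (for example as JSON), and comparing the manifests with `diff_manifests` would show which files, if any, were altered between saving and loading.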
Did anyone find a solution that doesn't require adding a new data point or retraining the model on the computer?
Not me, but coming back to this thread I just thought of something... in my case I'm putting the models in source control (git), so maybe the auto-handling of LF/CRLF characters is messing up the files? The machines where the models failed for us aren't mine so I can't check what their settings look like, but I'll ask the people who own them to check and try with different settings (basically, check out as-is, commit as-is).
Yep, in my case that was the problem! I fixed it by adding a
That "unsets" the I guess one last thing to consider is that the files might have been changed by git at commit time, in which case the model might need to be retrained and committed again after adding the
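To illustrate why line-ending conversion is harmful for serialized model files: if a binary file is treated as text, each LF byte (0x0A) in it can be rewritten as CRLF (0x0D 0x0A), which changes both its length and its contents. A minimal sketch that mimics such a conversion in plain Python; `simulate_crlf_checkout` is a hypothetical helper, the payload bytes are made up, and this is not git's actual implementation:

```python
def simulate_crlf_checkout(data: bytes) -> bytes:
    """Mimic an LF -> CRLF conversion applied to a file that is treated as text."""
    # Collapse existing CRLF pairs first so they are not turned into CR CR LF.
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


# Made-up binary payload that happens to contain the byte 0x0A.
payload = bytes([0x92, 0x0A, 0xA3]) + b"abc"
converted = simulate_crlf_checkout(payload)

print(len(payload), len(converted))  # lengths before and after conversion
print(converted == payload)          # whether the bytes survived unchanged
```

The converted payload is one byte longer than the original, so a binary deserializer reading it would no longer see the data that was written. Git only converts files it decides are text, so whether a particular model file is affected depends on its contents and on the repository's settings.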
Adds guidance on what to do if users encounter the error described in [1634](explosion#1634), which probably only happens in Windows environments.
Hey, I faced the same issue too, and this is what fixed it for me. Follow the steps below to resolve the issue on Windows:
tl;dr: run The issue here is that the To stop this happening we're now switching our dependencies to our own fork of
Hi,
When I train a model with spaCy 2.0.3 in my environment 1, everything works well: I can save it, load it, and use it.
However, when I try loading it in environment 2, I get the following error:
Environment 1: it works
Environment 2: it doesn't work
The 'EN' models are installed on both and the spaCy versions are the same. Could it be because of Windows? Or do you have any idea why I get this error?
Thanks a lot!