Lda models load/save backward compatibility across Python versions #1039
Merged
16 commits
a4d214f Modified load/save methods to maitain compatibility in loading and sa… (anmolgulati)
04a4634 Added saved LDA models in Python 2.7 and 3.5 environments for testing… (anmolgulati)
aaae5ff Added test for LDA Model compatibility between Python versions (anmolgulati)
8b2cc42 Modified unpickle method to allow unpickling python 2 objects in pyth… (anmolgulati)
c4c1289 Created and saved LDAModels with same random_seed in both Python 2.7 … (anmolgulati)
8fb383c * Fixed PEP8 fixes. (anmolgulati)
99cd080 Removed old LDA model files (anmolgulati)
66d5f5e Merge remote-tracking branch 'rare/develop' into lda-pickle-worker (anmolgulati)
2ecde2c Fixed numpy as np in test_ldamodel.py (anmolgulati)
237eff4 Recreated lda model files in python 3.5 (anmolgulati)
35f2dcc Added id2word in 'Separately' and created lda models again (anmolgulati)
5b606e1 Pickling id2word Dictionary separately. Also added test to check equa… (anmolgulati)
3937e62 Removed commented code (anmolgulati)
dac55bc Minor change. (anmolgulati)
615b91e Changes made (anmolgulati)
b24620f Refactored testfile() function (anmolgulati)
Binary file not shown.
Absolutely not! What is this latin1? The content is (and should be read as) binary.
This is a fix for loading objects in Python 3 that were pickled in Python 2, which otherwise raises an exception.
Basically, Python 3 attempts to convert the pickled py2 object into a str object when we need it to be bytes, and that raises an exception. I used the latin1 encoding as a workaround for that. (Asked on Stack Overflow)
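For readers following the thread, here is a minimal sketch of the behaviour described above. It assumes Python 3 and uses a hand-written protocol-2 pickle that mimics what Python 2 would emit for `{'a': '\xe9'}`; the byte string and all variable names are illustrative and are not taken from the PR.

```python
import pickle

# Hand-written protocol-2 pickle (an assumption for illustration): a dict whose
# key and value are stored with the SHORT_BINSTRING opcode, which is how
# Python 2 stores its 8-bit str objects.
PY2_PICKLE = b"\x80\x02}q\x00U\x01aq\x01U\x01\xe9q\x02s."


def default_load_error(data):
    """Return the exception type raised by a default load, or None."""
    try:
        pickle.loads(data)  # the default encoding for py2 str data is ASCII
    except UnicodeDecodeError as err:
        return type(err)
    return None


# 1. Default settings: the byte 0xe9 is not valid ASCII, so the load fails.
default_error = default_load_error(PY2_PICKLE)

# 2. latin1 maps every byte 0x00-0xff to one code point, so decoding cannot fail.
as_latin1 = pickle.loads(PY2_PICKLE, encoding="latin1")

# 3. encoding="bytes" keeps the raw bytes, but dict keys become bytes as well.
as_bytes = pickle.loads(PY2_PICKLE, encoding="bytes")

# The latin1 decode is reversible: encoding again recovers the original bytes.
recovered = as_latin1["a"].encode("latin1")
```

The third case is why `encoding="bytes"` is not a drop-in replacement: keys that code expects to be `str` come back as `bytes`.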
Can you add a code comment to explain this?
Yes, this hack needs to be marked and explained thoroughly in a comment.
I'm not familiar with such py2/py3 pickling workarounds, but isn't there a cleaner way to achieve the same effect? This sticks out like a sore thumb. @tmylk @anmol01gulati
@piskvorky Umm, I had actually searched quite a lot and tried various things on my system. This is the only way (a hack, actually) I found that works. By the way, I felt we would not want to keep this functionality in the future and could do away with the backward compatibility if the majority of users shift to Python 3 later (that's not the case right now, though).
I'll open a new PR to add a comment in the code, though.
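As a rough illustration of what such a commented loader could look like, the sketch below wraps the workaround in a small function. The name `load_pickle_compat`, the sample bytes, and the temporary file are all hypothetical; this is not the code merged in the PR.

```python
import os
import pickle
import tempfile


def load_pickle_compat(path):
    """Load a pickle that may have been written by Python 2 (hypothetical helper)."""
    with open(path, "rb") as fin:
        # HACK: Python 2 `str` objects are raw 8-bit strings. Python 3 decodes
        # them as ASCII by default, which fails on any byte >= 0x80.
        # latin1 maps each of the 256 byte values to exactly one code point,
        # so the decode always succeeds and can be reversed with
        # .encode("latin1") if the original bytes are needed.
        return pickle.load(fin, encoding="latin1")


# Usage: write a hand-made Python 2 style pickle of the str '\xe9' to a
# temporary file, then load it back under Python 3.
sample = b"\x80\x02U\x01\xe9q\x00."
with tempfile.NamedTemporaryFile(delete=False, suffix=".pkl") as tmp:
    tmp.write(sample)
    tmp_path = tmp.name

loaded = load_pickle_compat(tmp_path)
os.remove(tmp_path)
```

Pickles written by Python 3 store text and bytes with separate opcodes, so the `encoding` argument only affects the 8-bit string data that Python 2 produced.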
I am coding something entirely different, and this solution is the only thing that worked for loading Python 2 pickles in Python 3... The creators claim that pickle is backwards compatible, but apparently only if I pass latin1... Any other way just breaks and burns.