I also wonder if it would be possible to load the default Word2Vec embeddings lazily, since that could cut the time taken when first executing `using Embeddings`. That would simplify testing and use in "downstream" packages which might only optionally use the embeddings.
In theory, rather than storing the array itself,
we could store a lazy array that is only instantiated when it is first accessed.
We'd still need to process the whole file to get the vocabulary.
For Word2Vec I don't think it would gain much, as we have those in a binary format.
But some of the others, like FastText, we have in a text format,
and so parsing takes some time.
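To make the lazy-array idea concrete, here is a minimal sketch. It does not use Embeddings.jl at all: `LazyMatrix` and `materialize!` are hypothetical names invented for illustration, and the sketch assumes the matrix dimensions are known before the data is parsed (for example from a file header), which may not hold for every format.

```julia
# Hypothetical sketch: none of these names exist in Embeddings.jl.

# A matrix wrapper that defers building its data until the first element access.
mutable struct LazyMatrix{T} <: AbstractMatrix{T}
    loader::Function                  # zero-argument function that returns the full matrix
    dims::Tuple{Int,Int}              # assumed to be known up front
    data::Union{Nothing,Matrix{T}}    # stays `nothing` until first access
end

LazyMatrix{T}(loader, dims) where {T} = LazyMatrix{T}(loader, dims, nothing)

# Asking for the size does not trigger loading.
Base.size(A::LazyMatrix) = A.dims

# Run the loader once and cache the result.
function materialize!(A::LazyMatrix{T}) where {T}
    data = A.data
    if data === nothing
        data = convert(Matrix{T}, A.loader())
        A.data = data
    end
    return data
end

# Any element access goes through the cached matrix.
Base.getindex(A::LazyMatrix, i::Int, j::Int) = materialize!(A)[i, j]

# Usage: a counter stands in for the expensive parsing step.
load_count = Ref(0)
loader = () -> (load_count[] += 1; Float32[1 2; 3 4; 5 6])

emb = LazyMatrix{Float32}(loader, (3, 2))
size(emb)     # the loader has not run yet
emb[2, 1]     # first access runs the loader
emb[3, 2]     # later accesses reuse the cached matrix
```

The trade-off is that the parsing cost moves to the first access rather than disappearing, and anything that displays or iterates the array (including printing it at the REPL) will trigger the load.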
Note: this would not change how long `using Embeddings` takes,
as no embeddings are actually loaded when you do that. You need to call `load_embeddings` before anything is loaded.
I won't have time to work on this any time soon, but I would review PRs.
The opening paragraph quotes @robertfeldt from #24.