Fast inference on CPU #29
Hi Ankit, yes we are working on speed improvements, but most will not be included in the upcoming release 0.2 (in a few days), which prioritizes GPU inference. One thing we can already include is smaller models that trade off a small amount of accuracy for greater CPU inference speed. For instance, while the default NER model currently takes 11 seconds for 500 words on my CPU-only laptop, the small model takes only 3 seconds. We measure an F1 score of 92.61 for the small model, which is still state-of-the-art, but a bit below the full model at 93.18. Would such models be helpful to you? What kind of CPU inference speeds do you require?
Hi Alan, thank you for your reply. 3 seconds for 500 words is about right. Most of the text we will be processing will be under 100 words, so sub-500 ms speed would be ideal.
That's great - we'll add the first batch of CPU models to the upcoming release!
Release 0.2 adds pre-trained models that are more CPU friendly. Add '-fast' to the model name to get them (listed here; only English models at present).
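For illustration, a minimal sketch of what switching to a '-fast' model looks like (the example sentence is made up; the non-fast equivalent would be loaded with 'ner'):

```python
from flair.data import Sentence
from flair.models import SequenceTagger

# Append '-fast' to the usual model name to get the CPU-friendly variant.
tagger = SequenceTagger.load('ner-fast')  # instead of 'ner'

sentence = Sentence('George Washington went to Washington.')
tagger.predict(sentence)
print(sentence.to_tagged_string())
```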
Hi @alanakbik, is there a way to use multiple CPU cores to speed up inference? Thank you very much.
@mstaschik good question - there is currently no in-built way, but we'd be very interested in any ideas for making it faster on CPU!
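As noted above, there is no Flair-level switch for this, but since the models run on PyTorch, one knob people sometimes experiment with is the PyTorch intra-op thread count. A hedged sketch (the thread count of 4 is just an example; whether it helps depends on the model and machine):

```python
import torch
from flair.data import Sentence
from flair.models import SequenceTagger

# Allow PyTorch to use several CPU threads for intra-op parallelism.
torch.set_num_threads(4)

tagger = SequenceTagger.load('ner-fast')
sentence = Sentence('Berlin is the capital of Germany.')
tagger.predict(sentence)
```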
Is there any way to speed this up? I am on the latest version and running NER on a document with the command from the README, but it is slow.
It depends on the representation you use. Representations based on a BiLSTM are slow on CPU. Maybe others like BytePairEmbeddings would work better? (They will certainly be faster.)
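As a rough illustration of the lighter representation suggested above, this is approximately how BytePairEmbeddings can be instantiated and used in Flair (the example sentence is made up, and a tagger would still need to be trained on top of such embeddings):

```python
from flair.data import Sentence
from flair.embeddings import BytePairEmbeddings

# Byte-pair (subword) embeddings are small lookup-based vectors, so
# embedding a sentence is much cheaper on CPU than running large
# contextual BiLSTM-based embeddings.
embedding = BytePairEmbeddings('en')

sentence = Sentence('Berlin is a nice city .')
embedding.embed(sentence)

for token in sentence:
    print(token.text, token.embedding.shape)
```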
I'm following the instructions to extract entities; how can I speed it up? @pommedeterresautee
@PlayDeep you can use the -fast variants of the models, i.e.

    # load the NER tagger
    tagger = SequenceTagger.load('ner-fast')

Also, you should not predict on sentences one by one, but always pass lists of sentences and set the mini_batch_size to a value that works on your machine:

    sentences = tagger.predict(list_of_sentences, mini_batch_size=16)
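Put together, a self-contained sketch of batched prediction might look like this (the example sentences and the batch size of 16 are placeholders; depending on the Flair version, predict() may annotate the sentences in place rather than returning a new list):

```python
from flair.data import Sentence
from flair.models import SequenceTagger

# Load the lighter CPU-friendly model variant.
tagger = SequenceTagger.load('ner-fast')

# Batch many short texts together instead of predicting one by one.
list_of_sentences = [Sentence('George Washington went to Washington.'),
                     Sentence('Berlin is the capital of Germany.')]

# mini_batch_size is a tuning knob; pick a value that fits your machine.
tagger.predict(list_of_sentences, mini_batch_size=16)

for sentence in list_of_sentences:
    print(sentence.to_tagged_string())
```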
Does ner-fast affect the F1 score?
It probably will; the representations used are lighter.
Yes, slightly; the evaluation numbers are listed here.
Hi Alan,
Feel free to deprioritize this, but currently inference is slow on CPUs. In a separate ticket you did implement batching to improve inference for long texts, but it still cannot be used in production settings.