DensePassageRetriever: Add Training, Refactor Inference to FARM modules #527
Just a suggestion: please add the corresponding updates to the DPR tutorials and documentation to the task list as well.
Awesome. Good job! Let's merge it
@tholor, because the Retriever is initialized with a query_embedding_model and a document_embedding_model, may I assume that retriever.train() will incrementally train on the set of documents given as input?
Is there a sample data file for train_filename (the input to train()) that I can use as a template? Is it the same format as described in https://github.com/facebookresearch/DPR/ under "Retriever input data format"?
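For anyone looking for a template in the meantime: the facebookresearch/DPR README describes the retriever input as a JSON list of records with the keys question, answers, positive_ctxs, negative_ctxs and hard_negative_ctxs, where each context has a title and a text. Below is a minimal sketch that writes and sanity-checks such a file using only the standard library. It assumes train() accepts this same layout, which this thread does not confirm; the helper names (make_sample, validate, write_train_file) and the example question are made up for illustration.

```python
import json

# Keys listed in the DPR README's description of the retriever input format.
REQUIRED_KEYS = {"question", "answers", "positive_ctxs",
                 "negative_ctxs", "hard_negative_ctxs"}
CTX_KEYS = ("positive_ctxs", "negative_ctxs", "hard_negative_ctxs")


def make_sample(question, answers, positive, hard_negative):
    """Build one record; each context is given as a (title, text) pair."""
    def ctx(pair):
        title, text = pair
        return {"title": title, "text": text}

    return {
        "question": question,
        "answers": list(answers),
        "positive_ctxs": [ctx(positive)],
        "negative_ctxs": [],  # left empty in this toy example
        "hard_negative_ctxs": [ctx(hard_negative)],
    }


def validate(records):
    """Return a list of problems; an empty list means the layout looks right."""
    problems = []
    for i, rec in enumerate(records):
        missing = REQUIRED_KEYS - set(rec)
        if missing:
            problems.append(f"record {i}: missing keys {sorted(missing)}")
            continue
        if not rec["positive_ctxs"]:
            problems.append(f"record {i}: no positive context")
        for key in CTX_KEYS:
            for c in rec[key]:
                if not {"title", "text"} <= set(c):
                    problems.append(f"record {i}: {key} entry lacks title/text")
    return problems


def write_train_file(records, path):
    """Write the records as a single JSON list."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f, indent=2)


# Hypothetical example data, not taken from any real dataset.
records = [
    make_sample(
        question="What is the capital of France?",
        answers=["Paris"],
        positive=("France", "Paris is the capital and largest city of France."),
        hard_negative=("Lyon", "Lyon is the third-largest city of France."),
    )
]
```

Calling write_train_file(records, "train.json") would produce a file whose name could then be passed as train_filename, if the assumption about the format holds.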
test_dpr_retirever with correct test cases
../saved_models/dpr-tutorial/lm1
../saved_models/dpr-tutorial/lm2
../saved_models/
Initializing DPR:
Note: the remove_sep_tok_from_untitled_passages argument has been removed, as TextSimilarityProcessor in FARM encodes untitled passages as [CLS] [SEP] [SEP] <passage_tok_1> <passage_tok_2> ... <passage_tok_n> [SEP] without significant changes to accuracy.
Sample script to train DPR: