Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add TREC Covid Reranking Task #95

Merged
merged 14 commits into from
Sep 29, 2020

Conversation

justinborromeo
Copy link
Contributor

Command:

python -um pygaggle.run.evaluate_trec_covid_ranker                  
            --method t5                                                
            --model castorini/monot5-base-med-msmarco
            --dataset data/trec_covid 
            --model-type t5-base
            --task trec-covid 
            --index-dir indexes/lucene-index-cord19-abstract-2020-07-16
            --batch-size 16
            --output-file runs/run.monot5.trec_covid.tsv

Qrels file: qrels-covid_d5_j4.5-5.txt
Input Runfile: expanded.anserini.final-r5fusion1.txt
Queries: topics-covid-round5.xml

trec_eval output: https://gist.github.com/justinborromeo/cb6cff577c926d47a78ba9170add5dd9

The scores shown in the trec_eval output seem a little bit low. I'll add documentation after this has been reviewed.

@@ -5,7 +5,7 @@
PreTrainedTokenizer,
T5ForConditionalGeneration)
import torch

import logging
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unused?

if i + seg_size >= len(sentences):
end_idx += i/stride + 1
doc_end_indexes.append(int(end_idx))
break
return SegmentGroup(segmented_doc, doc_end_indexes)

def aggregate(self, documents: List[Text], segments_group: SegmentGroup, method: str = "max") -> List[Text]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a short docstring to each method and one for the file describing what it does in general?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added one for the classes. Is that good enough?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is good to have a docstring for each method, even if it is only one or two sentences.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added for the methods and classes. I guess my question should have been whether I should add a docstring for the file?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the late reply. Ideally, there should be a docstring for the file, but I think it is fine docstrings for methods and classes only.


def main():
apb = ArgumentParserBuilder()
apb.add_opts(opt('--task',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add descriptions for the parameters?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be clear, do you mean setting the help string for each or just describing the parameters as comments in the code?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for each parameter

@rodrigonogueira4
Copy link
Member

Hey @justinborromeo, let us know when this is ready for another review?

@justinborromeo
Copy link
Contributor Author

Ready for review!

Copy link
Member

@rodrigonogueira4 rodrigonogueira4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks for the hard work!

@rodrigonogueira4
Copy link
Member

@justinborromeo do you have merge permissions?

@justinborromeo
Copy link
Contributor Author

@justinborromeo do you have merge permissions?

No :(

@rodrigonogueira4 rodrigonogueira4 merged commit 6638f6e into castorini:master Sep 29, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants