Adds the pretrained module from allennlp-hub. #42

Conversation
Our organization of pre-trained models is still a total mess. We now have three places where we keep the URLs: here, and twice in allennlp-models. They don't all agree, and they don't all have the same models. It's confusing, and we'll have to fix this soon.
Looks like the tests you wrote aren't found by pytest. I think by default pytest assumes test classes are named Test*.
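For reference, a minimal sketch of pytest's default collection rules; the class and test names here are just illustrative, not what the PR uses:

```python
# pytest's defaults: files matching test_*.py or *_test.py, classes whose
# names start with "Test" (and have no __init__), and functions or methods
# whose names start with "test".
class TestPretrained:          # collected: class name starts with "Test"
    def test_bidaf(self):      # collected: method name starts with "test"
        assert True


class PretrainedTest:          # NOT collected under the default class pattern
    def test_bidaf(self):
        assert True
```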
Another thing to consider is that we'll be re-downloading these models on each CI run, which will run up our cloud bill. If we used a self-hosted runner for the pretrained tests we wouldn't have this problem, since the models would be cached locally.
    return predictor


def named_entity_recognition_with_elmo_peters_2018() -> SentenceTaggerPredictor:
I think we'll need to import the crf tagger module for this to work.
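A rough sketch of the kind of fix being suggested; the crf_tagger module path and the archive URL below are assumptions for illustration, not the actual values in this PR:

```python
from allennlp.predictors.predictor import Predictor

# Importing the module that defines CrfTagger registers the model class with
# AllenNLP's registry as a side effect; without it, loading the NER archive
# fails because the archive's model type can't be resolved.
# The module path is an assumption; adjust to wherever crf_tagger lives here.
from allennlp_models.tagging.models import crf_tagger  # noqa: F401

NER_ARCHIVE = "https://example.com/ner-elmo.tar.gz"  # placeholder URL


def named_entity_recognition_with_elmo_peters_2018() -> Predictor:
    return Predictor.from_path(NER_ARCHIVE)
```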
tests/pretrained_test.py
# By default we don't run these tests
@pytest.mark.skipif(
    not os.environ.get("ALLENNLP_MODELS_RUN_PRETRAINED_TEST"), reason="requires massive downloads"
)
The GPU tests are selected in a different way, aren't they? Why isn't this done the same way?
Oh yea, we should probably run those on the self-hosted runner as well
Sorry - I see what you're asking now. I felt like there should be an explicit skip condition here, because if you were running tests locally you might not want to download these massive files, and I didn't know what that condition could be other than an environment variable flag.
But we could also add a custom mark like we do for GPU tests and then exclude those by default. That might be cleaner.
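A hedged sketch of what that could look like; the mark name, option name, and conftest contents are assumptions, not what the repo actually does:

```python
# conftest.py sketch: a custom "pretrained_model_test" mark (hypothetical name)
# that is skipped unless the user opts in with --run-pretrained.
import pytest


def pytest_addoption(parser):
    parser.addoption(
        "--run-pretrained",
        action="store_true",
        default=False,
        help="run tests that download large pretrained models",
    )


def pytest_configure(config):
    config.addinivalue_line(
        "markers", "pretrained_model_test: test that downloads a large pretrained model"
    )


def pytest_collection_modifyitems(config, items):
    # Unless the user opted in, attach a skip marker to every pretrained test.
    if config.getoption("--run-pretrained"):
        return
    skip_pretrained = pytest.mark.skip(reason="needs --run-pretrained (massive downloads)")
    for item in items:
        if "pretrained_model_test" in item.keywords:
            item.add_marker(skip_pretrained)
```

Tests would then just carry @pytest.mark.pretrained_model_test, and CI could opt in with --run-pretrained or select them with -m pretrained_model_test.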
Looks like we're inconsistent about how we handle GPU tests. Some have @pytest.mark.gpu, others use skipif and detect how many GPUs there are. Others yet have both.
The skipif ensures they're never accidentally run if you don't have a GPU available. The mark.gpu is just so that we can select only GPU tests to run.
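For illustration only, combining the two might look like this; the test name is hypothetical, and it assumes the gpu mark is registered somewhere like conftest.py or setup.cfg:

```python
import pytest
import torch


# The custom mark lets CI select only the GPU tests (pytest -m gpu), while the
# skipif guarantees the test never runs on a machine without a GPU.
@pytest.mark.gpu
@pytest.mark.skipif(torch.cuda.device_count() < 1, reason="no GPU available")
def test_model_forward_pass_on_gpu():
    assert torch.cuda.is_available()
```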
I think this is good to go then?
I think this is good to go. I'll follow up with a PR to add a GPU testing workflow
These tests might not succeed because of the disk space issue.
Finishes allenai/allennlp#3302.
Closes allenai/allennlp#3965.