Major refactoring of Model classes (Step 2) #2351

Merged
merged 10 commits into from
Jul 26, 2021

alanakbik
Collaborator

As we are now adding many more types of models to Flair (see #2333), we are refactoring the logic of the Flair model classes to remove redundancies and make the code easier to read.

This PR is the second step in this refactoring process and mainly adds the new DefaultClassifier base class to Flair. It contains all logic that is shared by models that do classification, so that these models use the same (1) evaluation code, (2) prediction code and (3) loss computation code. Any Flair model that inherits from DefaultClassifier now only has to implement the forward_pass() method - all the rest is then supplied by the parent class.
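The division of labor described above can be illustrated with a minimal, self-contained sketch. It does not use Flair and is not the actual `DefaultClassifier` code: the class names `SketchDefaultClassifier` and `KeywordClassifier`, the `(text, gold_label)` data format, and the return shape of `forward_pass()` are all assumptions made for illustration. The point is only that the parent supplies prediction, loss and evaluation, while the child implements a single method.

```python
import math
from abc import ABC, abstractmethod
from typing import List, Tuple

DataPoint = Tuple[str, str]  # (text, gold label); hypothetical toy format


class SketchDefaultClassifier(ABC):
    """Toy stand-in for a shared classifier base class (not Flair's code)."""

    labels: List[str]

    @abstractmethod
    def forward_pass(self, data: List[DataPoint]) -> Tuple[List[List[float]], List[str]]:
        """Return one score row per data point plus the gold labels."""

    def predict(self, data: List[DataPoint]) -> List[str]:
        # Shared prediction code: argmax over the scores from forward_pass().
        scores, _ = self.forward_pass(data)
        return [self.labels[max(range(len(row)), key=row.__getitem__)] for row in scores]

    def forward_loss(self, data: List[DataPoint]) -> float:
        # Shared loss code: mean softmax cross-entropy over the score rows.
        scores, gold = self.forward_pass(data)
        total = 0.0
        for row, label in zip(scores, gold):
            log_sum_exp = math.log(sum(math.exp(s) for s in row))
            total += log_sum_exp - row[self.labels.index(label)]
        return total / len(data)

    def evaluate(self, data: List[DataPoint]) -> float:
        # Shared evaluation code: plain accuracy against the gold labels.
        predictions = self.predict(data)
        correct = sum(p == gold for p, (_, gold) in zip(predictions, data))
        return correct / len(data)


class KeywordClassifier(SketchDefaultClassifier):
    """A child class only has to implement forward_pass()."""

    labels = ["negative", "positive"]

    def forward_pass(self, data):
        scores = []
        for text, _ in data:
            tokens = text.split()
            scores.append([float(tokens.count("bad")), float(tokens.count("good"))])
        return scores, [gold for _, gold in data]


toy_data = [("good movie", "positive"), ("bad plot", "negative"), ("good good bad", "positive")]
model = KeywordClassifier()
predictions = model.predict(toy_data)
accuracy = model.evaluate(toy_data)
loss = model.forward_loss(toy_data)
```

In this sketch, adding a new model type means writing one more subclass with its own `forward_pass()`; the prediction, loss and evaluation paths stay identical across models.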

With this refactoring, the TextClassifier, the RelationExtractor, the TextPairClassifier and the SimpleSequenceTagger now inherit from DefaultClassifier. Accordingly, much redundant code has disappeared. The SequenceTagger still inherits from Classifier, but this will change, since a major refactoring of that class is planned as well.

In addition, the TARS classes (TARSClassifier and TARSTagger) were broadly refactored so that they inherit from the new base class FewshotClassifier and share most of the TARS logic through the superclass. This removed many redundancies. The implementation of TARSClassifier is now slightly different from before, as single-label predictions are no longer enforced.

In addition, many small changes were made:

  • The batch_size parameter in TransformerDocumentEmbeddings has been removed since it is very counterintuitive and caused speed losses if not used correctly. The default behavior is now to batch over all sentences in a given mini-batch.
  • You can now remove entries from a Dictionary with the method remove_item()
  • The sanity checks for setting scores on Labels have been removed
  • Potentially breaking: the make_label_dictionary() method of the Corpus now requires you to specify a label_type. To make this change easier, it now prints a list of all available label types when a dictionary is computed
  • The label_type in the GO_EMOTIONS dataset is renamed to 'emotion'
  • The TextPairClassifier now resides in its own module
  • The flair.nn module was split into a folder structure
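
The potentially breaking change to make_label_dictionary() can be pictured with a small sketch. It is a toy re-implementation under stated assumptions, not Flair's code: here a "corpus" is just a list of dicts mapping a label type to its label values, and the function name is reused only for readability. What it shows is the behavior described above: label_type is a required argument, and the available label types are printed whenever a dictionary is computed.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical toy corpus: each sentence maps label types to label values.
ToySentence = Dict[str, List[str]]


def make_label_dictionary(sentences: List[ToySentence], label_type: str) -> List[str]:
    """Collect all values of one label type; label_type has no default on purpose."""
    # Count how often each label type occurs so the caller can see the options.
    available = Counter(t for sentence in sentences for t in sentence)
    print(f"Available label types: {dict(available)}")
    if label_type not in available:
        raise ValueError(f"Unknown label type: {label_type!r}")
    values = {v for sentence in sentences for v in sentence.get(label_type, [])}
    return sorted(values)


toy_corpus = [
    {"emotion": ["joy"], "topic": ["sports"]},
    {"emotion": ["anger"]},
    {"emotion": ["joy"], "topic": ["politics"]},
]
emotion_dictionary = make_label_dictionary(toy_corpus, label_type="emotion")
```

With Flair itself, the corresponding call after this PR would presumably look like `corpus.make_label_dictionary(label_type='emotion')` for the GO_EMOTIONS dataset; code that calls the method without a label_type needs to be updated.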

@alanakbik alanakbik merged commit 86963a7 into master Jul 26, 2021
@alanakbik alanakbik deleted the unified_predict branch July 29, 2021 07:00