Major refactoring of Model classes (Step 2) #2351

alanakbik · 2021-07-25T21:03:22Z

As we are now adding many more types of models to Flair (see #2333), we are refactoring the logic of the Flair model classes to remove redundancies and make the code easier to read.

This PR is the second step in this refactoring process and mainly adds the new DefaultClassifier base class to Flair. It contains all logic that is shared by models that do classification, so that these models use the same (1) evaluation code, (2) prediction code and (3) loss computation code. Any Flair model that inherits from DefaultClassifier now only has to implement the forward_pass() method - all the rest is then supplied by the parent class.
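To illustrate the division of labor described above, here is a minimal, self-contained sketch of the pattern. It is not Flair's actual implementation: the class names `SketchDefaultClassifier` and `KeywordClassifier`, the method signatures, and the plain-Python scoring are all assumptions made for illustration. Only the idea that a subclass supplies `forward_pass()` while the parent supplies prediction, loss, and evaluation comes from the description.

```python
import math
from abc import ABC, abstractmethod
from typing import List, Sequence


class SketchDefaultClassifier(ABC):
    """Hypothetical stand-in for a shared classifier base class (not Flair code)."""

    def __init__(self, labels: Sequence[str]):
        self.labels = list(labels)

    @abstractmethod
    def forward_pass(self, texts: List[str]) -> List[List[float]]:
        """Return one raw score per label for each input text."""

    def predict(self, texts: List[str]) -> List[str]:
        # Shared prediction code: pick the highest-scoring label per text.
        predictions = []
        for row in self.forward_pass(texts):
            best = max(range(len(row)), key=row.__getitem__)
            predictions.append(self.labels[best])
        return predictions

    def loss(self, texts: List[str], gold: List[str]) -> float:
        # Shared loss code: mean negative log-likelihood under a softmax.
        total = 0.0
        for row, label in zip(self.forward_pass(texts), gold):
            log_norm = math.log(sum(math.exp(score) for score in row))
            total += log_norm - row[self.labels.index(label)]
        return total / len(gold)

    def evaluate(self, texts: List[str], gold: List[str]) -> float:
        # Shared evaluation code: plain accuracy.
        predicted = self.predict(texts)
        correct = sum(p == g for p, g in zip(predicted, gold))
        return correct / len(gold)


class KeywordClassifier(SketchDefaultClassifier):
    """Toy subclass: the only method it has to write is forward_pass()."""

    def forward_pass(self, texts: List[str]) -> List[List[float]]:
        # Score order follows self.labels, assumed to be ["positive", "negative"].
        return [[1.0, 0.0] if "good" in text else [0.0, 1.0] for text in texts]


clf = KeywordClassifier(["positive", "negative"])
print(clf.predict(["good movie", "bad movie"]))
```

The toy subclass contains no prediction, loss, or evaluation logic of its own, which is the kind of redundancy the refactoring aims to remove.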

With this refactoring, the TextClassifier, the RelationExtractor, the TextPairClassifier and the SimpleSequenceTagger now inherit from DefaultClassifier. Accordingly, much redundant code has disappeared. The SequenceTagger still inherits from Classifier, but this will change, as a major refactoring is planned here as well.

In addition, the TARS classes (TARSClassifier and TARSTagger) were broadly refactored so that they inherit from the new base class FewshotClassifier and share most of the TARS logic through the superclass. This removed many redundancies. The implementation of TARSClassifier is now slightly different from before, as single-label predictions are no longer enforced.

In addition, many small changes were made:

The batch_size parameter in TransformerDocumentEmbeddings has been removed since it was very counterintuitive and caused speed losses if not used correctly. The default behavior is now to batch over all sentences in a given mini-batch.
You can now remove entries from a Dictionary with the method remove_item()
The sanity checks for setting scores on Labels have been removed
Potentially breaking: the make_label_dictionary() method of the Corpus now requires you to specify a label_type. To make this change easier, it now prints a list of all available label types when a dictionary is computed
The label_type in the GO_EMOTIONS dataset is renamed to 'emotion'
The TextPairClassifier now resides in its own module
The flair.nn module was split into a folder structure
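For the potentially breaking change to make_label_dictionary(), the toy function below sketches the described behavior: the caller must name a label_type, and the available label types are printed while the dictionary is computed. Everything in it is an assumption for illustration, including the name `toy_make_label_dictionary`, the sample format, and the returned mapping; it does not reproduce Flair's Corpus or Dictionary classes. Going by the description, the Flair call itself would take the form `corpus.make_label_dictionary(label_type='emotion')`.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical sample format: each sample is a list of (label_type, value) pairs.
Sample = List[Tuple[str, str]]


def toy_make_label_dictionary(samples: List[Sample], label_type: str) -> Dict[str, int]:
    """Toy stand-in: build a value-to-index mapping for one required label type."""
    # Report every label type found, so callers can see what they may pass.
    type_counts = Counter(ltype for sample in samples for ltype, _ in sample)
    print(f"Available label types: {dict(type_counts)}")

    # Keep only the values that belong to the requested label type.
    values = sorted(
        {value for sample in samples for ltype, value in sample if ltype == label_type}
    )
    return {value: index for index, value in enumerate(values)}


samples = [
    [("emotion", "joy"), ("topic", "sports")],
    [("emotion", "anger")],
]
label_dictionary = toy_make_label_dictionary(samples, label_type="emotion")
print(label_dictionary)
```

Because label_type has no default in this sketch, omitting it raises a TypeError, which mirrors why existing calls without a label type would need updating.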

alanakbik added 10 commits July 25, 2021 14:06

Big reorganization of code and TARS classes

9211ea3

Big reorganization of code and TARS classes

4e18c8a

Remove batch variable from TransformerDocumentEmbeddings

cb56402

Fix prediction in multi-label classification

7127fd0

Add some comments

a1eb444

Rename TARS labels

6f5a828

Add better comments

b76437c

Fix unit test

66595ec

Fix unit test

161140d

Fix multi-class probability

cd27889

alanakbik merged commit 86963a7 into master Jul 26, 2021

alanakbik deleted the unified_predict branch July 29, 2021 07:00
