Unable to Train Multilabel TextClassifer #2869

Closed
ianmcampbell opened this issue Jul 20, 2022 · 2 comments
Labels
question Further information is requested wontfix This will not be worked on

Comments

@ianmcampbell

I have a complex multilabel classification task. The label space is approximately 13000 labels (specifically a subset of the Human Phenotype Ontology).

The task is to classify sentences with zero or more labels. About 5% of the corpus has zero labels, 75% has one label, and 20% has more than one label.
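For context, flair's ClassificationCorpus reads FastText-style files, where each line carries zero or more `__label__` prefixes followed by the sentence text. A minimal sketch of writing such a file (file name and HPO label ids here are invented for illustration, not taken from the real corpus):

```python
# Minimal sketch: write a tiny train file in the FastText-style format
# that flair's ClassificationCorpus reads. Label ids are illustrative.
lines = [
    "__label__HP_0001250 The patient had two generalized seizures.",                # one label
    "__label__HP_0001250 __label__HP_0002069 Tonic-clonic seizures recurred.",      # two labels
    "The family history was unremarkable.",                                         # zero labels
]

with open("train.tsv", "w", encoding="utf-8") as f:
    for line in lines:
        f.write(line + "\n")

# Zero-label sentences are only kept when the corpus is created with
# allow_examples_without_labels=True, as in the code below.
```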

I installed flair from master today (20 July 2022).

If I train a single-label model using TextClassifier, it works pretty well, with an ultimate F1 of about 0.8.

from flair.data import Corpus
from flair.datasets import ClassificationCorpus
from flair.embeddings import TransformerDocumentEmbeddings
from flair.models import TextClassifier
from flair.trainers import ModelTrainer


class FlairNormalizer(object):

    def __create_model(self):
        label_type = "label"
        self.corpus: Corpus = ClassificationCorpus("resources/flair/",
                                                   train_file="train.tsv", dev_file="dev.tsv", test_file="test.tsv",
                                                   label_type=label_type,
                                                   allow_examples_without_labels=True)
        label_dict = self.corpus.make_label_dictionary(label_type=label_type)
        document_embeddings = TransformerDocumentEmbeddings("dmis-lab/biobert-base-cased-v1.2",
                                                            fine_tune=True)
        self.model = TextClassifier(document_embeddings,
                                    label_type=label_type,
                                    label_dictionary=label_dict,
                                    multi_label=False)

    def __train_model(self):
        trainer = ModelTrainer(self.model, self.corpus)
        trainer.train("model/flair/",
                      learning_rate=1e-2,
                      mini_batch_size=8,
                      max_epochs=50,
                      mini_batch_chunk_size=8,
                      num_workers=0,
                      train_with_dev=False)

The training shows consistent improvement in the early epochs.

2022-07-20 16:14:11,123 ----------------------------------------------------------------------------------------------------
2022-07-20 16:14:51,044 epoch 4 - iter 135/1352 - loss 0.36623756 - samples/sec: 109.55 - lr: 0.010000
...
2022-07-20 16:20:47,012 epoch 4 - iter 1350/1352 - loss 0.36107231 - samples/sec: 110.55 - lr: 0.010000
2022-07-20 16:20:47,442 ----------------------------------------------------------------------------------------------------
2022-07-20 16:20:47,443 EPOCH 4 done: loss 0.3612 - lr 0.010000
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 11/11 [00:00<00:00, 12.04it/s]
2022-07-20 16:20:48,365 Evaluating as a multi-label problem: True
2022-07-20 16:20:48,457 DEV : loss 0.2030181735754013 - f1-score (micro avg)  0.2652

However, if I change to a TextClassifier with multi_label=True, I can't get the model to train at all.

        self.model = TextClassifier(document_embeddings,
                                    label_type=label_type,
                                    label_dictionary=label_dict,
                                    multi_label=True,
                                    multi_label_threshold=0.1)

The model starts off predicting many thousands of labels per sentence, and then settles on predicting no labels per sentence.

2022-07-20 11:47:06,830 ----------------------------------------------------------------------------------------------------
2022-07-20 11:47:54,801 epoch 13 - iter 135/1357 - loss 0.00018447 - samples/sec: 90.88 - lr: 0.002500
...
2022-07-20 11:55:11,588 epoch 13 - iter 1350/1357 - loss 0.00018185 - samples/sec: 90.80 - lr: 0.002500
2022-07-20 11:55:13,888 ----------------------------------------------------------------------------------------------------
2022-07-20 11:55:13,889 EPOCH 13 done: loss 0.0002 - lr 0.002500
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 11/11 [01:00<00:00,  5.51s/it]
2022-07-20 11:56:14,550 Evaluating as a multi-label problem: True
2022-07-20 11:56:14,614 DEV : loss 9.439588757231832e-05 - f1-score (micro avg)  0.0
2022-07-20 11:56:14,657 Epoch    13: reducing learning rate of group 0 to 1.2500e-03.
2022-07-20 11:56:14,657 BAD EPOCHS (no improvement): 4

I've tried many possible values of multi_label_threshold, but the result is ultimately the same.
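For reference, multi-label prediction scores each label independently (a sigmoid over that label's logit) and keeps labels whose score clears multi_label_threshold. A plain-Python sketch of that decision rule, with made-up logits, shows why no threshold helps once the model has collapsed:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict_labels(logits, threshold):
    """Keep every label whose per-label sigmoid score clears the threshold."""
    return [label for label, logit in logits.items() if sigmoid(logit) > threshold]

# Invented logits: with ~13000 labels, a collapsed model drives nearly every
# logit strongly negative, so even a low threshold like 0.1 keeps nothing.
logits = {"HP_0001250": -6.0, "HP_0002069": -5.5, "HP_0000118": -7.2}
print(predict_labels(logits, threshold=0.1))  # all sigmoids < 0.01 -> []
```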

The problem sounds very similar to that described in Issue #678.

I see that there was a commit that implemented BCEWithLogitsLoss(), but the issue persists.

I also read that user collinpu reported that they "used a large positive pos_weight vector to bias the model away from predicting all nulls". However, there doesn't seem to be any supported way to specify pos_weight.
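For a sense of the scale involved: pos_weight in BCEWithLogitsLoss multiplies each label's positive-example term, and one common heuristic is negatives-over-positives per label, which becomes very large for rare labels in a 13000-label space. A pure-Python sketch with invented counts:

```python
def pos_weight_vector(label_counts, num_examples):
    """Heuristic per-label pos_weight: (#negatives / #positives).

    Rare labels get a large weight, countering the flood of negative
    terms that otherwise pushes the model toward predicting all zeros.
    The counts passed in below are invented for illustration.
    """
    return {
        label: (num_examples - count) / count
        for label, count in label_counts.items()
    }

counts = {"HP_0001250": 500, "HP_0002069": 20, "HP_0000118": 2}
weights = pos_weight_vector(counts, num_examples=10000)
# A label seen twice in 10000 sentences gets weight (10000 - 2) / 2 = 4999.0
```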

Does anyone have any ideas on how to get multilabel training to work for TextClassifier?

@ianmcampbell ianmcampbell added the question Further information is requested label Jul 20, 2022
@helpmefindaname
Member

Hey @ianmcampbell,

If I understand pos_weight correctly, there is no reason to prefer a plain weight over pos_weight in a multi-label setting.
If so, you can set the weights when creating the classifier and then overwrite the loss function after creating the text classifier:

self.model = TextClassifier(document_embeddings,
                            label_type=label_type,
                            label_dictionary=label_dict,
                            multi_label=True,
                            loss_weights={"label1": 3, "label2": 4, ....})
self.model.loss_function = torch.nn.BCEWithLogitsLoss(pos_weight=self.model.loss_weights)

If that makes your model work, I think a PR that changes the weight parameter would be appreciated.

@stale

stale bot commented Nov 23, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix This will not be worked on label Nov 23, 2022
@stale stale bot closed this as completed Dec 24, 2022