
Micro average precision and recall not listed for non-span sequence labelers #1934

Closed · alanakbik opened this issue Nov 2, 2020 · 0 comments · Fixed by #1935

alanakbik (Collaborator) commented:

The current SequenceTagger evaluation handles non-span sequence labeling by making a prediction for every word. This is correct in many cases, such as POS tagging, where each word must carry exactly one POS tag.

However, in some cases, such as word sense or frame disambiguation, only some words have a predicted sense tag and many words have no prediction at all. Our evaluation currently counts "no prediction" as a class of its own, and as a result no micro-averaged precision and recall are computed.
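For illustration, here is a minimal sketch of the effect (using scikit-learn rather than Flair's evaluation code; the tokens and labels below are made up): when the dominant "no prediction" label ('O' in the sketch) is counted as a regular class, the micro average degenerates to plain token accuracy, whereas restricting the average to the actual sense labels yields the precision and recall we actually want to report.

from sklearn.metrics import precision_recall_fscore_support

# gold and predicted frame labels for eight words; 'O' stands for "no prediction"
y_true = ['O', 'O', 'run.01', 'O', 'see.01', 'O', 'O', 'O']
y_pred = ['O', 'O', 'run.02', 'O', 'see.01', 'O', 'O', 'run.01']

# micro average over ALL labels, including 'O': collapses to accuracy (0.75 here)
print(precision_recall_fscore_support(y_true, y_pred, average='micro'))

# micro average over the sense labels only: precision 1/3, recall 1/2
sense_labels = ['run.01', 'run.02', 'see.01']
print(precision_recall_fscore_support(y_true, y_pred, labels=sense_labels, average='micro'))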

To reproduce:

# imports (Flair)
from flair.datasets import UP_ENGLISH
from flair.embeddings import WordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# load English Universal Proposition Bank (heavily downsampled for a quick repro)
corpus = UP_ENGLISH().downsample(0.001)

# make tag dictionary for the 'frame' tag type
tag_dictionary = corpus.make_tag_dictionary('frame')

# init simple tagger
tagger: SequenceTagger = SequenceTagger(
    hidden_size=256,
    embeddings=WordEmbeddings('glove'),
    tag_dictionary=tag_dictionary,
    tag_type='frame',
    use_crf=False,  # there are too many classes for a CRF
)

# train model
trainer = ModelTrainer(tagger, corpus)

trainer.train('resources/taggers/frame-test-output',
              max_epochs=50,
              mini_batch_size=8,
              )

The output then does not list micro-averaged precision and recall.

(Note that this only affects non-span sequence labeling, so NER, for instance, is not affected.)
