Format of input gold_label_dictionary for dependency parser #2575

FredericBlum · 2021-12-28T16:30:16Z

Hello,
I am currently trying to implement the DependencyParser for a corpus in the conllu format. It runs smoothly until it hits the evaluation function, where I receive the following error:
TypeError: unsupported format string passed to Tensor.__format__

This happens both when I leave the gold_label_dictionary empty (it is marked as "optional" in the class) and when I pass it a label dictionary. What does my input need to be for the parser to run?

I am also a bit unsure about the format of the input corpus. Did I understand correctly that the parser only takes the token and the deprel feature as input, leaving aside upos and "head"?

Looking forward to your help and exploring more about the dependency parser, thanks for implementing it!


alanakbik · 2021-12-28T17:29:31Z

@Tarotis can you post your training script?

FredericBlum · 2021-12-28T18:11:33Z

from flair.embeddings import StackedEmbeddings, FlairEmbeddings
from flair.models import DependencyParser
from flair.trainers import ModelTrainer

# conllu_to_flair (helper not shown here) creates a ColumnCorpus with
# 0:form, 1:upos, 2:head, 3:deprel, filtering out multi-word tokens
corpus, gold_dict = conllu_to_flair('converted.conllu')

label_type = 'deprel'
dependency_dictionary = corpus.make_label_dictionary(label_type=label_type)
flair_embedding_forward = FlairEmbeddings('models/resources/embeddings/sk_forward/best-lm.pt')
flair_embedding_backward = FlairEmbeddings('models/resources/embeddings/sk_backward/best-lm.pt')
embeddings = StackedEmbeddings(embeddings=[flair_embedding_forward, flair_embedding_backward])

tagger = DependencyParser(lstm_hidden_size=512,
                          token_embeddings=embeddings,
                          relations_dictionary=dependency_dictionary,
                          tag_type=label_type)

trainer = ModelTrainer(tagger, corpus)

trainer.train('models/resources/taggers/example-dependency',
              use_final_model_for_eval=True,
              learning_rate=0.1,
              mini_batch_size=8,
              max_epochs=20)

Some outputs from training:

2021-12-28 18:24:30,040 Corpus contains the labels: upos (#3716), head (#3716), deprel (#3716)
2021-12-28 18:59:06,761 Created (for label 'deprel') Dictionary with 31 tags: <unk>, nsubj, cop, root, punct, case, obj, aux:val, advmod, aux, nmod, amod, cc, obl, conj, xcomp, det, advcl, Lfcl, compound, nummod, x, ccomp, appos, iobj, discourse, vocative, acl, flat, marker, parataxis

Detailed error:

gold_label_dictionary=gold_label_dictionary_for_eval,
  File "anonymized/.environments/nlp/lib/python3.6/site-packages/flair/models/dependency_parser_model.py", line 334, in evaluate
    f"\nUAS : {parsing_metric.get_uas():.4f} - LAS : {parsing_metric.get_las():.4f}"
  File "anonymized/.environments/nlp/lib/python3.6/site-packages/torch/_tensor.py", line 572, in __format__
    return object.__format__(self, format_spec)
TypeError: unsupported format string passed to Tensor.__format__
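
For readers who hit the same message: the last frame of the traceback falls through to object.__format__, which in Python 3 raises this TypeError for any non-empty format spec such as .4f. The sketch below reproduces that mechanism with plain Python only. FakeTensor and try_format are hypothetical names made up for illustration; they are not part of torch or flair, and the class only imitates the fallback visible in the traceback.

```python
class FakeTensor:
    """Hypothetical stand-in for a tensor-like value (not the torch class)."""

    def __init__(self, value):
        self.value = value

    def item(self):
        # Return the wrapped value as a plain Python number.
        return self.value

    def __format__(self, format_spec):
        # Same fallback as the last frame of the traceback above.
        return object.__format__(self, format_spec)


def try_format(obj, spec):
    """Format obj with spec, returning the error text instead of raising."""
    try:
        return format(obj, spec)
    except TypeError as err:
        return f"TypeError: {err}"


score = FakeTensor(0.91234)
failed = try_format(score, ".4f")         # format spec applied to the wrapper
worked = try_format(score.item(), ".4f")  # format spec applied to a plain float

print(failed)
print(worked)
```

The point of the sketch is that the format spec itself is fine; it only needs to be applied to a plain number rather than to the wrapping object.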

alanakbik · 2021-12-29T16:37:41Z

Can you try using this corpus instead:

corpus = UD_ENGLISH()

dictionary = corpus.make_label_dictionary("dependency")

Does it work then?

FredericBlum · 2021-12-29T16:42:53Z

No, I receive the same error as with my own data.

alanakbik · 2021-12-29T16:46:58Z

I just tested this script on the current master branch and it runs:

from flair.datasets import UD_ENGLISH
from flair.embeddings import StackedEmbeddings, FlairEmbeddings
from flair.models import DependencyParser
from flair.trainers import ModelTrainer

corpus = UD_ENGLISH()

dependency_dictionary = corpus.make_label_dictionary("dependency")

embeddings = StackedEmbeddings(embeddings=[FlairEmbeddings('news-forward-fast'),
                                           FlairEmbeddings('news-backward-fast')])

tagger = DependencyParser(lstm_hidden_size=512,
                          token_embeddings=embeddings,
                          relations_dictionary=dependency_dictionary,
                          tag_type="dependency")

trainer = ModelTrainer(tagger, corpus)

trainer.train('models/resources/taggers/example-dependency',
              use_final_model_for_eval=True,
              learning_rate=0.1,
              mini_batch_size=8,
              max_epochs=20,
              )

alanakbik · 2021-12-29T17:21:30Z

Ah wait, I get this error during the evaluation. I'll check.

FredericBlum · 2021-12-29T17:21:41Z

First I did a reinstall, but neither the old nor the new script worked.
Then I commented out the following two lines (334, 335):

            f"\nUAS : {parsing_metric.get_uas():.4f} - LAS : {parsing_metric.get_las():.4f}"
            f"\neval loss rel : {eval_loss_rel:.4f} - eval loss arc : {eval_loss_arc:.4f}"

Now all the models run smoothly and the predictions work as well. I still think there could be a bug within those functions, but I wouldn't know why it appears only on my side.
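
An alternative to commenting the lines out is to convert each value to a plain float before the format spec is applied, so the log line is kept. This is a minimal sketch, assuming the metric values are either Python numbers or single-element tensor-like objects that expose item(). The names as_float, format_scores and OneValue are hypothetical and are not taken from flair.

```python
def as_float(value):
    """Return a plain float from a number or a single-element tensor-like object."""
    if hasattr(value, "item"):
        # Tensor-like objects: unwrap to a Python scalar first.
        return float(value.item())
    return float(value)


def format_scores(uas, las):
    """Build the score line with every value coerced to float before formatting."""
    return f"UAS : {as_float(uas):.4f} - LAS : {as_float(las):.4f}"


class OneValue:
    """Hypothetical stand-in for a single-element tensor (illustration only)."""

    def __init__(self, value):
        self.value = value

    def item(self):
        return self.value


print(format_scores(OneValue(0.5), 0.25))
```

Coercing at the point of formatting keeps the helper independent of whether the metric code returns tensors or plain numbers.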

FredericBlum added the "question" (Further information is requested) label Dec 28, 2021

alanakbik added a commit that referenced this issue Dec 30, 2021

GH-2575: fix loss printing error

b6b53d6

alanakbik mentioned this issue Dec 30, 2021

Dependency parser experiments #2579

Merged

alanakbik closed this as completed in #2579 Dec 30, 2021
