GH-921: fine-tune FlairEmbeddings #922

alanakbik · 2019-07-22T12:59:45Z

Closes #921

This PR makes FlairEmbeddings task-trainable. This allows users to (a) fine-tune an existing language model on task data and (b) train a new model only on task data.

You can fine-tune an existing LM by passing fine_tune=True to the FlairEmbeddings constructor, like this:

from flair.embeddings import FlairEmbeddings

embeddings = FlairEmbeddings('news-forward', fine_tune=True)

You can task-train a wholly new language model by passing an empty (untrained) LanguageModel to the FlairEmbeddings constructor together with fine_tune=True, like this:

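from flair.data import Dictionary
from flair.embeddings import FlairEmbeddings
from flair.models import LanguageModel

# make an empty (untrained) character-level language model
language_model = LanguageModel(
    Dictionary.load('chars'),
    is_forward_lm=True,
    hidden_size=256,
    nlayers=1)

# init FlairEmbeddings to task-train this model
embeddings = FlairEmbeddings(language_model, fine_tune=True)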

Also closes #880 (AttributeError: 'WordEmbeddings' object has no attribute 'embeddings') by fixing the embedding printout.
This PR also removes the FlairEmbeddings-specific disk-caching mechanism. In the future, a more general caching mechanism applicable to all embedding types should be added on a different level of logic, potentially as a feature in the ModelTrainer.

yosipk · 2019-07-22T13:53:40Z

👍

alanakbik · 2019-07-22T15:14:03Z

👍

jasminesjlee · 2019-07-30T14:52:44Z

For clarification -- what exactly does it mean to "task-train a wholly new language model"? Is this different from training a LM on this data from scratch?

alanakbik · 2019-07-30T15:09:33Z

Hello @jasminesjlee, yes, this would be different, since the training objective is different. A standard language model is trained to predict the next character given the previous characters. Here, however, we wrap the model in FlairEmbeddings and train it on a downstream task (such as NER). So we train a character-level RNN to produce features that are useful specifically for the downstream task (see the paper by Liu et al., who originally proposed this). Because of the different objective, such an LM cannot be used to generate text.
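
To make the difference between the two objectives concrete, here is a minimal sketch in plain Python. It makes no Flair calls; the function names and the example tags are hypothetical and only illustrate what each objective is asked to predict.

```python
# Illustrative only: contrasts the supervision used by the two objectives.
# Plain Python, no Flair API; all names here are hypothetical.

def next_char_pairs(text):
    """Language-model objective: predict each character from the ones before it."""
    return [(text[:i], text[i]) for i in range(1, len(text))]


def task_pairs(tokens, tags):
    """Downstream objective (e.g. NER): predict one label per token."""
    if len(tokens) != len(tags):
        raise ValueError("need exactly one tag per token")
    return list(zip(tokens, tags))


lm_examples = next_char_pairs("Berlin")
ner_examples = task_pairs(["Berlin", "is", "nice"], ["S-LOC", "O", "O"])

print(lm_examples[:2])   # [('B', 'e'), ('Be', 'r')]
print(ner_examples[0])   # ('Berlin', 'S-LOC')
```

In the task-trained setting only the second kind of supervision exists, so the character RNN is never asked to predict the next character. That is consistent with the point above that such a model cannot be used to generate text.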

jasminesjlee · 2019-07-30T15:38:22Z

Ah, I understand. Thank you @alanakbik ! :)

DecentMakeover · 2019-08-23T08:40:11Z

@alanakbik Which branch do I need to be on to access these changes? I am currently on:

  origin/GH-538-attention
  origin/GH-822-xlnet
  origin/GH-873-pytorch-transformers
  origin/HEAD -> remotes/origin/master
  origin/SimilarityLearner
  origin/master
  origin/release-0.4.1

Thanks

pommedeterresautee · 2019-08-23T08:45:31Z

As this PR is merged, it is available on master.

DecentMakeover · 2019-08-23T09:02:16Z

@pommedeterresautee Thanks.

aakbik added 2 commits July 22, 2019 14:47

GH-921: make FlairEmbeddings fine-tuneable

e668150

GH-921: add unit test

86b1fdc

alanakbik changed the title ~~Gh 921 fune tune flair~~ GH-921: fine-tune FlairEmbeddings Jul 22, 2019

GH-921: remove caching tests

ea07cff

alanakbik merged commit 9f49d09 into master Jul 22, 2019

alanakbik deleted the GH-921-fune-tune-flair branch July 22, 2019 15:14
