
Add Fine-Tunable Transformers to Flair #1492

Closed
2 tasks done
alanakbik opened this issue Mar 25, 2020 · 6 comments
Labels
feature A new feature

Comments

@alanakbik
Collaborator

alanakbik commented Mar 25, 2020

We currently support word embeddings from Huggingface's various transformer models (BERT, XLM, etc.), but two important features are missing: (1) we don't yet support sentence embeddings extracted directly from the transformer model using the [CLS] token, and (2) the transformers are currently not fine-tunable via Flair. This is a shame, since transformers really shine when sentence embeddings are extracted directly from a fine-tuned transformer.

So with this issue, we want to add

  • The ability to get sentence embeddings directly from transformers, by adding new DocumentEmbeddings classes
  • The ability to fine-tune all transformer word and document embeddings classes (a rough usage sketch of what this could look like follows below)
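
For illustration only, here is a minimal sketch of how such a document embeddings class might be used once it exists; the constructor arguments and the `fine_tune` flag are assumptions about the eventual API, not a final design:

```python
from flair.data import Sentence
from flair.embeddings import TransformerDocumentEmbeddings  # proposed class

# Hypothetical: a document embedding backed by a transformer, using the
# [CLS] token as the sentence representation, with gradients enabled so
# the transformer weights can be fine-tuned during downstream training.
document_embeddings = TransformerDocumentEmbeddings('bert-base-uncased', fine_tune=True)

sentence = Sentence('Transformers shine when fine-tuned .')
document_embeddings.embed(sentence)

# a single embedding for the whole sentence, taken from the [CLS] position
print(sentence.get_embedding().shape)
```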
@djstrong
Contributor

Supporting longer texts (more than 512 subtokens) would be helpful, at least for prediction. My research shows that processing paragraphs rather than sentences decreases error by 10%.

@alanakbik
Collaborator Author

Yes, good point - what is the 'standard' way of working around the 512-subtoken limitation of transformers? I guess the easiest would be to truncate the text to a max length of 512, but maybe there is a better way?

@djstrong
Contributor

djstrong commented Mar 29, 2020

I have sequence tagging in mind, so truncating in prediction mode is unacceptable. The text should be divided into splits with some overlapping context and then reconstructed.

For text classification there are some truncation strategies. However, in simple-transformers the text is divided and each part is predicted separately; the mode of the per-part predictions is then the final result.
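
To make the splitting idea concrete, here is a rough sketch; the helper name, window size, and stride are made up for illustration and are not taken from any existing library:

```python
def split_with_overlap(subtokens, max_len=512, stride=384):
    """Split a long subtoken sequence into overlapping windows.

    Each window is at most `max_len` subtokens long; consecutive windows
    advance by `stride`, so neighbouring windows share `max_len - stride`
    subtokens of context. The values here are illustrative, not tuned.
    """
    windows = []
    start = 0
    while start < len(subtokens):
        windows.append(subtokens[start:start + max_len])
        if start + max_len >= len(subtokens):
            break
        start += stride
    return windows

# When reconstructing tag predictions, each position in an overlapping region
# can take its label from the window where it lies furthest from a boundary.
```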

@alanakbik
Collaborator Author

Thanks - yes, for TransformerWordEmbeddings an overlapping-segment strategy should be doable and sounds like the best approach. For TransformerDocumentEmbeddings we need a strategy that outputs a single embedding for a text of arbitrary length, so truncation may be the way to go there.
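
For the document-embedding case, the truncation itself could simply be delegated to the Hugging Face tokenizer; a minimal sketch (model name and limit are just examples, not a design decision for Flair):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')

long_text = "A very long paragraph about something. " * 500  # well beyond 512 subtokens

# Truncate to the 512-subtoken limit; everything beyond it is dropped,
# so only the beginning of the document contributes to the embedding.
encoded = tokenizer(long_text, truncation=True, max_length=512, return_tensors='pt')
print(encoded['input_ids'].shape)  # torch.Size([1, 512])
```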

@alanakbik
Collaborator Author

Just for reference, some truncation strategies are evaluated in this paper.

@alanakbik
Collaborator Author

Fine-tuning is now part of Flair 0.5.
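
For anyone landing here later, a minimal fine-tuning sketch roughly along the lines of the 0.5 API; the dataset, model name, and hyperparameters are illustrative only, so check the release documentation for the exact interface:

```python
import torch
from flair.datasets import TREC_6
from flair.embeddings import TransformerDocumentEmbeddings
from flair.models import TextClassifier
from flair.trainers import ModelTrainer

# corpus and label dictionary (TREC_6 is just an example dataset)
corpus = TREC_6()
label_dict = corpus.make_label_dictionary()

# transformer document embeddings with fine-tuning enabled
document_embeddings = TransformerDocumentEmbeddings('bert-base-uncased', fine_tune=True)
classifier = TextClassifier(document_embeddings, label_dictionary=label_dict)

# AdamW and a small learning rate, as is typical for transformer fine-tuning
trainer = ModelTrainer(classifier, corpus, optimizer=torch.optim.AdamW)
trainer.train('resources/taggers/trec',
              learning_rate=3e-5,
              mini_batch_size=16,
              max_epochs=5)
```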
