
Premature end of training phase? #94

Closed
kyoungrok0517 opened this issue Aug 29, 2018 · 2 comments

kyoungrok0517 commented Aug 29, 2018

Hi, I'm trying to train my own character-level language model. The problem I'm experiencing is that the training phase ends prematurely.

This is the last output I see. The run stops at the end of split 10/200, which does not look right.

[screenshot: training log ending after split 10/200]

This is the code I'm using:

from flair.data import Dictionary
from flair.models import LanguageModel
from flair.trainers.language_model_trainer import LanguageModelTrainer, TextCorpus

# are you training a forward or backward LM?
is_forward_lm = True

# load the default character dictionary
dictionary: Dictionary = Dictionary.load('chars')

# get your corpus, process forward and at the character level
corpus = TextCorpus('../../data/TriviaQA/corpus/',
                    dictionary,
                    is_forward_lm,
                    character_level=True)

# instantiate your language model, set hidden size and number of layers
language_model = LanguageModel(dictionary,
                               is_forward_lm,
                               hidden_size=1024,
                               nlayers=1)

# train your language model
trainer = LanguageModelTrainer(language_model, corpus)

trainer.train('./language_model',
              sequence_length=250,
              mini_batch_size=100,
              max_epochs=10)

This is how my corpus is structured.

corpus/
[screenshot: contents of the corpus/ directory]

train/
[screenshot: contents of the train/ directory]

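For reference, `TextCorpus` reads the split files from the `train/` subdirectory of the corpus folder. If your training data starts as one large text file, a minimal sketch for producing split files looks like this (the file names, split count, and helper name are assumptions for illustration, not taken from the screenshots above):

```python
from pathlib import Path

def split_corpus(source_file: str, corpus_dir: str, n_splits: int = 20) -> int:
    """Split one large text file into roughly equal split files under
    corpus_dir/train/, and return the number of files written."""
    lines = Path(source_file).read_text(encoding="utf-8").splitlines(keepends=True)
    train_dir = Path(corpus_dir) / "train"
    train_dir.mkdir(parents=True, exist_ok=True)
    # ceiling division so every line lands in exactly one split
    chunk = max(1, len(lines) // n_splits + (len(lines) % n_splits > 0))
    written = 0
    for i in range(0, len(lines), chunk):
        out = train_dir / f"train_split_{written}.txt"
        out.write_text("".join(lines[i:i + chunk]), encoding="utf-8")
        written += 1
    return written
```

Keeping the number of splits in the 20-50 range avoids spending too much time on validation at the end of each split, as suggested later in this thread.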
@kyoungrok0517 kyoungrok0517 changed the title Premature end of training phase Premature end of training phase? Aug 29, 2018

iamyihwa commented Aug 29, 2018

@kyoungrok0517 I think the cause is the inconsistent terminology.
Here `max_epochs=10` does not use the usual definition of an epoch, i.e. one pass over the entire training set; it counts training splits instead.
Therefore, if you want 10 passes over your entire dataset, you should set `max_epochs = 10 * (number of training split files)`.

Below is the comment from @alanakbik on the previous issue I opened (#89):

Yes, you are right that notation is inconsistent here, since the parameter "epochs" is used to count training data splits which is not intuitive. So until we fix this you are correct: use 100 * number_of_train_files as max_epochs if you want to do 100 epochs.

Generally, our advice is to set the max epochs to an extremely high number and run the training until the learning rate has annealed twice. The learning rate starts annealing when training yields few improvements, so when it has annealed a few times the model is as good as it can get. Also, we would recommend grouping the training files so that you have about 20-50 files so that you do not lose too much time validating at the end of each split, and set patience to perhaps half the number of your training splits!
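In concrete numbers, the advice above works out like this (a sketch; the split count of 30 and the variable names are assumed examples, not values from this thread):

```python
# In this version of flair, one "epoch" = one training split,
# not one full pass over the corpus (see the quote above).
num_train_splits = 30        # assumed: number of files in corpus/train/
desired_full_passes = 10     # full passes over the corpus you actually want

# value to pass as max_epochs to trainer.train()
max_epochs = desired_full_passes * num_train_splits

# suggested patience: roughly half the number of training splits
patience = num_train_splits // 2

print(max_epochs, patience)  # 300 15
```

In practice the quoted advice is to set `max_epochs` much higher than you expect to need and stop once the learning rate has annealed a couple of times.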

@kyoungrok0517

I see. Thanks!

alanakbik pushed a commit that referenced this issue Sep 27, 2018