
Premature end of training phase? #94

Closed
kyoungrok0517 opened this issue Aug 29, 2018 · 2 comments

kyoungrok0517 commented Aug 29, 2018

Hi, I'm trying to train my own character-level language model. The problem I'm experiencing is that the training phase ends prematurely.

This is the last output I see. The run stops at the end of split 10/200, which does not look right.

[screenshot: training log ending after split 10/200]

This is the code I'm using:

from flair.data import Dictionary
from flair.models import LanguageModel
from flair.trainers.language_model_trainer import LanguageModelTrainer, TextCorpus

# are you training a forward or backward LM?
is_forward_lm = True

# load the default character dictionary
dictionary: Dictionary = Dictionary.load('chars')

# get your corpus, process forward and at the character level
corpus = TextCorpus('../../data/TriviaQA/corpus/',
                    dictionary,
                    is_forward_lm,
                    character_level=True)

# instantiate your language model, set hidden size and number of layers
language_model = LanguageModel(dictionary,
                               is_forward_lm,
                               hidden_size=1024,
                               nlayers=1)

# train your language model
trainer = LanguageModelTrainer(language_model, corpus)

trainer.train('./language_model',
              sequence_length=250,
              mini_batch_size=100,
              max_epochs=10)

This is how my corpus is structured.

corpus/
[screenshot: contents of the corpus/ directory]

train/
[screenshot: contents of the train/ directory]

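For reference, `TextCorpus` reads the split files from the `train/` subdirectory of the corpus folder. If your training data starts as one large text file, a minimal sketch for producing split files looks like this (the file names, split count, and helper name are assumptions for illustration, not taken from the screenshots above):

```python
from pathlib import Path

def split_corpus(source_file: str, corpus_dir: str, n_splits: int = 20) -> int:
    """Split one large text file into roughly equal split files under
    corpus_dir/train/, and return the number of files written."""
    lines = Path(source_file).read_text(encoding="utf-8").splitlines(keepends=True)
    train_dir = Path(corpus_dir) / "train"
    train_dir.mkdir(parents=True, exist_ok=True)
    # ceiling division so every line lands in exactly one split
    chunk = max(1, len(lines) // n_splits + (len(lines) % n_splits > 0))
    written = 0
    for i in range(0, len(lines), chunk):
        out = train_dir / f"train_split_{written}.txt"
        out.write_text("".join(lines[i:i + chunk]), encoding="utf-8")
        written += 1
    return written
```

Keeping the number of splits in the 20-50 range avoids spending too much time on validation at the end of each split, as suggested later in this thread.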
@kyoungrok0517 kyoungrok0517 changed the title Premature end of training phase Premature end of training phase? Aug 29, 2018

iamyihwa commented Aug 29, 2018

@kyoungrok0517 I think the cause is the inconsistent terminology.
Here `max_epochs=10` does not use the usual definition of an epoch, i.e. one pass over the entire training set; it counts training splits instead.
Therefore, if you want 10 passes over your entire dataset, you should set `max_epochs = 10 * (number of training split files)`.

Below is the comment from @alanakbik on the previous issue I opened (#89):

Yes, you are right that notation is inconsistent here, since the parameter "epochs" is used to count training data splits which is not intuitive. So until we fix this you are correct: use 100 * number_of_train_files as max_epochs if you want to do 100 epochs.

Generally, our advice is to set the max epochs to an extremely high number and run the training until the learning rate has annealed twice. The learning rate starts annealing when training yields few improvements, so when it has annealed a few times the model is as good as it can get. Also, we would recommend grouping the training files so that you have about 20-50 files so that you do not lose too much time validating at the end of each split, and set patience to perhaps half the number of your training splits!
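In concrete numbers, the advice above works out like this (a sketch; the split count of 30 and the variable names are assumed examples, not values from this thread):

```python
# In this version of flair, one "epoch" = one training split,
# not one full pass over the corpus (see the quote above).
num_train_splits = 30        # assumed: number of files in corpus/train/
desired_full_passes = 10     # full passes over the corpus you actually want

# value to pass as max_epochs to trainer.train()
max_epochs = desired_full_passes * num_train_splits

# suggested patience: roughly half the number of training splits
patience = num_train_splits // 2

print(max_epochs, patience)  # 300 15
```

In practice the quoted advice is to set `max_epochs` much higher than you expect to need and stop once the learning rate has annealed a couple of times.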

@kyoungrok0517

I see. Thanks!

alanakbik pushed a commit that referenced this issue Sep 27, 2018