inconsistency in the notation / term epoch #89

Closed
iamyihwa opened this issue Aug 23, 2018 · 6 comments
Labels: enhancement (Improving of an existing feature), language model (Related to language model)

Comments

@iamyihwa commented Aug 23, 2018

Hello,
I am training a language model.
Before testing on the whole text, I was running it on a smaller text.

From my understanding, epoch vs. batch vs. mini-batch are defined as follows (from a post on Stack Exchange):
[image: definitions of epoch, batch, and mini-batch from a Stack Exchange post]

However, when I train the language model with the following parameters,

# train your language model
trainer = LanguageModelTrainer(language_model, corpus)

trainer.train('resources/taggers/language_model_es_forward',
              sequence_length=250,
              mini_batch_size=100,
              max_epochs=5)
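
(For reference, language_model and corpus above are assumed to have been set up beforehand. A minimal sketch along the lines of the flair language-model tutorial is shown here; the corpus path and hidden size are placeholders, and the import paths may differ slightly between flair releases.)

from flair.data import Dictionary
from flair.models import LanguageModel
from flair.trainers.language_model_trainer import LanguageModelTrainer, TextCorpus

# forward, character-level language model
is_forward_lm = True

# default character dictionary shipped with flair
dictionary = Dictionary.load('chars')

# the corpus folder is expected to contain a train/ directory with split files,
# plus valid.txt and test.txt (path here is a placeholder)
corpus = TextCorpus('resources/corpora/es_wiki', dictionary, is_forward_lm, character_level=True)

# hidden_size and nlayers chosen arbitrarily for this sketch
language_model = LanguageModel(dictionary, is_forward_lm, hidden_size=1024, nlayers=1)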

I get the following output.
Currently there are 51 input files. (I see that mini_batch_size is larger than the number of training files, so this might be a problem.)
However, in the output, the epoch number doesn't change; it only says end of split (1/51), and training finishes after (5/51).
I wonder if this is due to a different use of terminology?
Or does it, in this case, only go through 5 files and stop?
If I want to go through my entire dataset 100 times, for example, do I have to set max_epochs to 100 * number_of_train_files?

(The screenshot might be confusing, but the training finished after 5/51.)
[image: screenshot of the training log ending after split 5/51]

The original dataset that I have (a Spanish Wikipedia dump) currently consists of 2300 files (each file about 1 MB).

I intend to put about 5 files together to make a validation set, and another 5 files together to make a test set.

The rest of the files (about 2290) I will use to train the model.

If I want to pass over the data multiple times, what value should I use for max_epochs?

What was a good number of passes over the data when you trained your language models for English and German?

@alanakbik (Collaborator)

Yes, you are right that the notation is inconsistent here, since the parameter "epochs" is used to count training data splits, which is not intuitive. So until we fix this, you are correct: use 100 * number_of_train_files as max_epochs if you want to do 100 epochs.
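
As a minimal sketch of that workaround (the variable names below are hypothetical, chosen only for illustration):

# max_epochs currently counts splits, not full passes over the data
desired_passes = 100            # how many true epochs you want
number_of_train_files = 51      # number of files in your train folder
max_epochs = desired_passes * number_of_train_files  # value to pass to trainer.train()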

Generally, our advice is to set max_epochs to an extremely high number and run the training until the learning rate has annealed twice. The learning rate starts annealing when training yields few improvements, so when it has annealed a few times the model is as good as it can get. Also, we would recommend grouping the training files so that you have about 20-50 splits, so that you do not lose too much time validating at the end of each split, and setting patience to perhaps half the number of your training splits (see the sketch below)!
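
A minimal sketch of that grouping step, assuming the raw files sit in a hypothetical es_wiki_raw/ folder and the grouped splits go into the train/ folder that flair's TextCorpus reads; the 50-files-per-split figure is just an example:

import os
from pathlib import Path

# roughly 2290 small raw files (hypothetical folder layout)
raw_files = sorted(Path('es_wiki_raw').glob('*.txt'))
os.makedirs('corpus/train', exist_ok=True)

files_per_split = 50  # ~2290 / 50 gives about 46 training splits
for i in range(0, len(raw_files), files_per_split):
    split_path = Path('corpus/train') / f'train_split_{i // files_per_split}.txt'
    with open(split_path, 'w', encoding='utf-8') as out:
        for f in raw_files[i:i + files_per_split]:
            out.write(f.read_text(encoding='utf-8'))

# with ~46 splits, a patience of roughly half that (e.g. patience=23) matches the advice above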

@iamyihwa (Author)

Thanks @alanakbik for the clarification.
I will do as you suggest!

At the moment I am seeing that the loss is not decreasing very quickly, and the perplexity (ppl) also seems to have gotten stuck.
Is there anything I should be doing?
[image: screenshot of a training log with stalled loss and perplexity]

@iamyihwa (Author)

I have run it even further, but ppl and loss seem to be stuck. (This happened after 1.3 epochs or so.)
loss: 1.22, ppl: 3.37
[image: screenshot of the training log showing the plateau]

@alanakbik (Collaborator)

Hi, yes, it looks like the learning rate has annealed too quickly; it is at 0.00 in the output.

This happens because your training splits are too small, giving the learning rate too many opportunities to anneal. Either increase the size of your training splits or increase the patience. Or, even better, both :)

Try:

trainer.train('resources/taggers/language_model_es_forward',
              sequence_length=250,
              mini_batch_size=100,
              max_epochs=2000, 
              patience=100)

@iamyihwa (Author)

@alanakbik Yes, I have tried it with the larger data size (and also increased the number of hidden neurons, just in case), and it already seems better!

[image: screenshot of the improved training log]

Thanks @alanakbik for the suggestion, I will try with patience = 100!

BTW, does the language model require specific input? I have used one sentence per line as input. In addition to that, do I need to separate each token or do any normalization?

@tabergma added the enhancement (Improving of an existing feature) label on Oct 1, 2018
@tabergma added the language model (Related to language model) and release-0.3 labels on Oct 11, 2018
alanakbik pushed a commit that referenced this issue Oct 11, 2018
@alanakbik (Collaborator)

Term/epoch notation fixed in release-0.3.
