I am not sure how best to frame this, but I am training the base PerceiverModel() (not pretrained) on a set of IMU data and other dynamics signals, with the model modified for a regression task so that it outputs through a final nn.Linear with an output dimension of 3. I am also using a cosine learning rate scheduler with warmup (found in another GitHub repo) that is stepped at the batch level, and I call nn_utils.clip_grad_norm_(reg_model.parameters(), max_norm=1.0) after loss.backward() and before optimizer.step().
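To make the setup concrete, here is a minimal sketch of the per-batch update order I described. The model, the warmup-plus-cosine lambda, and the dummy batches are stand-ins for my actual PerceiverRegressor, the scheduler from the other repo, and my IMU dataloader; only the ordering of backward, clipping, optimizer step, and scheduler step reflects what I actually do.

```python
import math
import torch
import torch.nn as nn
import torch.nn.utils as nn_utils

# Placeholder regressor so the snippet runs stand-alone; the real model is the
# (non-pretrained) PerceiverModel followed by an nn.Linear head with output dim 3.
reg_model = nn.Sequential(nn.Linear(128, 64), nn.GELU(), nn.Linear(64, 3))

optimizer = torch.optim.AdamW(reg_model.parameters(), lr=1e-4, weight_decay=1e-2)

# Stand-in for the warmup + cosine schedule borrowed from another repo,
# stepped once per batch rather than once per epoch.
def warmup_cosine(step, warmup=100, total=1000):
    if step < warmup:
        return step / max(1, warmup)
    progress = (step - warmup) / max(1, total - warmup)
    return 0.5 * (1.0 + math.cos(math.pi * progress))

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, warmup_cosine)
criterion = nn.MSELoss()

# Dummy tensors standing in for my IMU batches (inputs, 3-dim regression targets).
dummy_loader = [(torch.randn(32, 128), torch.rand(32, 3)) for _ in range(5)]

for inputs, targets in dummy_loader:
    optimizer.zero_grad()
    preds = reg_model(inputs)                     # (batch, 3) regression output
    loss = torch.sqrt(criterion(preds, targets))  # RMSE
    loss.backward()
    # Clip gradients after backward() and before the optimizer step
    nn_utils.clip_grad_norm_(reg_model.parameters(), max_norm=1.0)
    optimizer.step()
    scheduler.step()                              # batch-level LR update
```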
The model's average RMSE loss on both the training and validation sets reaches ~0.22-0.24 within the first 3 epochs; after that the improvements are marginal, eventually triggering early stopping. Please see the attached graph. I have tried different batch sizes and adjusted the weight decay of the AdamW optimizer (though not extensively), yet the model keeps converging to this loss level. I have also removed data points that could be classified as outliers and tried different train/val/test splits, but the losses remain in the same range.
The data have been normalized by iterating over all batches from the train_dataloader and computing their global min and max (min-max scaling is intentional here, rather than standardization). When loading batches in the train/val/test loops, I use these global values to normalize the batch data before feeding them to the PerceiverRegressor. I apply the same logic to the 3 output values, so both inputs and targets range between 0 and 1. A sketch of this is below.
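Roughly, the normalization looks like the following. The function names and the dummy loader are illustrative only; the point is that a single set of (min, max) statistics computed over the whole training set is reused in the train/val/test loops for both the inputs and the 3 targets.

```python
import torch

def compute_global_min_max(loader):
    """Single pass over the training loader to get global per-feature min/max."""
    x_min = x_max = y_min = y_max = None
    for inputs, targets in loader:
        x_min = inputs.amin(dim=0) if x_min is None else torch.minimum(x_min, inputs.amin(dim=0))
        x_max = inputs.amax(dim=0) if x_max is None else torch.maximum(x_max, inputs.amax(dim=0))
        y_min = targets.amin(dim=0) if y_min is None else torch.minimum(y_min, targets.amin(dim=0))
        y_max = targets.amax(dim=0) if y_max is None else torch.maximum(y_max, targets.amax(dim=0))
    return x_min, x_max, y_min, y_max

def min_max_scale(t, t_min, t_max, eps=1e-8):
    """Map to [0, 1] using training-set statistics (same values reused for val/test)."""
    return (t - t_min) / (t_max - t_min + eps)

# Usage with dummy data standing in for the IMU train_dataloader:
dummy_loader = [(torch.randn(32, 128), torch.rand(32, 3)) for _ in range(5)]
x_min, x_max, y_min, y_max = compute_global_min_max(dummy_loader)
inputs, targets = dummy_loader[0]
norm_inputs = min_max_scale(inputs, x_min, x_max)    # fed to the PerceiverRegressor
norm_targets = min_max_scale(targets, y_min, y_max)  # 3 output values, now in [0, 1]
```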
I would highly appreciate any help or insight on this! Perhaps it is something obvious that I am failing to notice, so your expert eyes could prove useful.