On the relationship between Result and Callback monitor #3286
I wonder how the |
How it currently works, these callbacks cannot know when ...

After my proposed changes, we could choose between:
|
sounds like a new |
May I also suggest keeping the checkpoint strategies separate from the model? I think what @carmocca is suggesting goes in that direction. I think we should compute metrics within the model, but we should let the |
@williamFalcon thoughts? |
In my opinion, would it be better to use ...? And we would use the callbacks as before with the |
ok guys, tracking this. sorry for the delay. focused on refactors today but will propose something tomorrow am (NY time). A lot of good ideas here that i need to think about overnight :) |
Adding a use case here which hopefully fits into the discussion. It was clear how to achieve it in v0.8.4; now it isn't anymore: I want to use the validation accuracy on the whole validation set for checkpointing. Since I have a different number of samples in each batch, I can't just compute the accuracy in each step and average on epoch end. So I compute the number of correct and total samples in each step and log both to ...

As mentioned in one of the comments above, the |
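For reference, a minimal sketch of this use case (the class name and metric keys are illustrative, and the final logging call is left open since exposing `val_acc` to `ModelCheckpoint` is exactly what this issue is about):

```python
import pytorch_lightning as pl
import torch


class ClassifierModule(pl.LightningModule):
    def validation_step(self, batch, batch_idx):
        x, y = batch
        preds = self(x).argmax(dim=-1)
        # Accumulate raw counts instead of per-batch accuracies,
        # because batches may have different sizes.
        return {"correct": (preds == y).sum(), "total": torch.tensor(y.numel())}

    def validation_epoch_end(self, outputs):
        correct = torch.stack([o["correct"] for o in outputs]).sum()
        total = torch.stack([o["total"] for o in outputs]).sum()
        val_acc = correct.float() / total
        # The open question in this thread: how to log `val_acc` so that
        # ModelCheckpoint(monitor="val_acc") can pick it up.
        ...
```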
Hi all |
thanks for the patience! almost done with the first pass at refactors, so i will likely tackle this over the weekend. Here's what i'm thinking:

```python
result = EvalResult()
result.log('giraffe', acc)

# then you need to monitor giraffe
es = EarlyStopping(monitor='giraffe')
```

When you log at both step and epoch, your metric key gets a 'step_' or 'epoch_' prefix (this is true today), since without that the value would be ambiguous:

```python
result.log('giraffe', x, on_step=True, on_epoch=True)
# results in 'step_giraffe', 'epoch_giraffe'
```

Now, if you want to monitor this metric you can pick which one you want:

```python
es = EarlyStopping(monitor='step_giraffe')
es = EarlyStopping(monitor='epoch_giraffe')
```

How does this sound to everyone? |
this won't allow changing the value you want to monitor on, right? I thought |
yeah, you can still change it... you just have to specify it now in the callback. I personally think it's cleaner to set early_stop_on and checkpoint_on... this proposed new approach seems a bit more annoying to me but it's something i think everyone is asking for? |
It still can have ...

Also, I think ...

If you go with this new flag ...

A healthy discussion is required here to cover all the possible use cases. |
Mind explaining why you find the proposed approach more annoying? It uses the already existing parameter
what about |
because not everyone has a "val_loss" keyword... because if i change my mind tomorrow, now i have to remember to change it in multiple places... and it also means that the trainer is no longer model agnostic... instead i have to add an if statement for every model i want to train to pick a different monitor value. it is very inconvenient lol.

Example: Without keyword:

```python
# model A
result.log('X', x)

# model B
result.log('Y', y)
```

```python
if model_is_A:  # pseudocode: pick the monitor per model
    es = EarlyStopping(monitor='X')
else:
    es = EarlyStopping(monitor='Y')
```

With keyword:

```python
# model A
EvalResult(early_stop_on=X)

# model B
EvalResult(early_stop_on=Y)
```

Example 2: What if i want to change the key for early stopping during training?

```python
if epoch % 2 == 0:
    EvalResult(checkpoint_on=X)
else:
    EvalResult(checkpoint_on=Y)
```

But without the keyword?? there's no way to do this... |
What if you don't know the checkpoint/early_stop metric in the LightningModule? You are forced to do this:

```python
# user_script.py
module = MyLightningModule(monitor=args.monitor)
```

```python
# my_lightning_module.py
def training_step(...):
    ...
    if self.monitor == "train_loss":
        result = pl.TrainResult(minimize=train_loss, early_stop_on=train_loss)
    elif self.monitor == "train_acc":
        result = pl.TrainResult(minimize=train_loss, early_stop_on=train_acc)
    elif ...:
        ...
    else:
        result = pl.TrainResult(minimize=train_loss)
    result.log("train_loss", train_loss)
    result.log("train_acc", train_acc)
    ...

def validation_step(...):
    ...
    if self.monitor == "val_loss":
        result = pl.EvalResult(early_stop_on=val_loss)
    elif self.monitor == "val_acc":
        result = pl.EvalResult(early_stop_on=val_acc)
    elif ...:
        ...
    else:
        result = pl.EvalResult()
    result.log("val_loss", val_loss)
    result.log("val_acc", val_acc)
    ...
```

The number of if statements would get much worse if I wanted to separate ...

Alternatively:

```python
# user_script.py
module = LightningModule()
es = EarlyStopping(monitor=args.monitor)
```

```python
# my_lightning_module.py
def training_step(...):
    ...
    result = pl.TrainResult(minimize=train_loss)
    result.log("train_loss", train_loss, callback=True)
    result.log("train_acc", train_acc, callback=True)
    ...

def validation_step(...):
    ...
    result = pl.EvalResult()
    result.log("val_loss", val_loss, callback=True)
    result.log("val_acc", val_acc, callback=True)
    ...
```
But I believe this is the most common case. You often want to change the monitor without having to modify the module code. It is also how other libraries (e.g. Keras) do it afaik
This is why I proposed letting monitor be a Callable, see original issue comment |
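For concreteness, a sketch of the kind of user script being described here, where the monitored keys are chosen on the command line and only the callbacks need to know about them (`MyLightningModule` stands in for the module from the examples above; argument names and the exact Trainer wiring are illustrative and depend on the Lightning version):

```python
import argparse

import pytorch_lightning as pl
from pytorch_lightning.callbacks import EarlyStopping, ModelCheckpoint

parser = argparse.ArgumentParser()
parser.add_argument("--monitor", default="val_loss")
parser.add_argument("--ckpt-monitor", default="val_acc")
args = parser.parse_args()

# The module only logs metrics; it never hardcodes what gets monitored.
module = MyLightningModule()

trainer = pl.Trainer(
    callbacks=[
        EarlyStopping(monitor=args.monitor),
        ModelCheckpoint(monitor=args.ckpt_monitor),
    ]
)
trainer.fit(module)
```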
but this is exactly why the results make more sense... because you CAN change it while training. This is an exact example of that. Otherwise you'd have to set the monitor at the callback level and that's now disconnected from the model... ie: your model can't be moved around and is not modular now because it has this other external dependency where you must also remember to set X value in the callback.

Currently with results you can say: ...

With your approach you'd say: ...

It's not very modular... |
This is redundant...
we can just make anything that is logged available as a callback (ie: always true) but you don't need to add another keyword |
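In other words, everything that gets logged would land in `trainer.callback_metrics` and any callback could look it up by key. A toy sketch of the reading side (this callback is illustrative, not an existing one):

```python
import pytorch_lightning as pl


class PrintMonitor(pl.Callback):
    """Toy callback that looks up an arbitrary logged key by name."""

    def __init__(self, monitor: str = "val_loss"):
        self.monitor = monitor

    def on_validation_end(self, trainer, pl_module):
        metrics = trainer.callback_metrics  # everything the module logged
        value = metrics.get(self.monitor)
        if value is not None:
            print(f"{self.monitor}: {value}")
```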
If the issue is about providing callbacks with the module, why not add a configure_callbacks() hook to the LightningModule? |
Sure, I don't mind that 😃

Example:

```python
class MyLightningModule(pl.core.LightningModule):
    def __init__(
        self,
        early_stop_monitor,
        checkpoint_monitor,
    ):
        super().__init__()
        self.early_stop_monitor = early_stop_monitor
        self.checkpoint_monitor = checkpoint_monitor

    def configure_callbacks(self):
        callbacks = []
        if self.early_stop_monitor:
            callbacks.append(EarlyStopping(monitor=self.early_stop_monitor))
        if self.checkpoint_monitor:
            # We have a reference to the trainer here so the following is possible
            # if we choose to allow Callables as callback monitors:
            monitor = lambda: self.checkpoint_monitor if self.trainer.current_epoch % 2 == 0 else "val_loss"
            callbacks.append(ModelCheckpoint(monitor=monitor))
        return callbacks

    def training_step(self, ...):
        ...
        result = pl.TrainResult(minimize=train_loss)
        result.log("train_loss", train_loss)
        result.log("train_acc", train_acc)

    def validation_step(self, ...):
        ...
        result = pl.EvalResult()
        result.log("val_loss", val_loss)
        result.log("val_acc", val_acc)
```
|
right, i think that's what i was originally hoping for:
but that means we have 2 ways of doing something. so it's why i then thought to just drop the keywords (ie: not do 1 and only 2) |
I love this callbacks hook idea... let's bring it up for vote from the community? (nice suggestion @carmocca!) What do we think? My take is that some models do philosophically NEED certain callbacks. And to keep models self-contained we need to make sure that the required callbacks are packaged as well. i’ve seen this myself in self supervised as well… for instance:
but it doesn’t make sense for people to run these models without those callbacks. so now we have this disconnect between stand-alone modules and training |
I would have both |
💬 Discussion

Since I started using Lightning, I have noticed many users, as well as myself, having trouble with the way metrics are saved for callbacks by the `Result` class.

The current design of having only `{early_stop_on,checkpoint_on}` forces different callbacks to use the same monitor. I believe that wanting different monitors for `EarlyStopping`, `ModelCheckpoint`, and `LearningRateLogger` is a very common use case. This is something we will also need to address if we ever want to support multiple `ModelCheckpoint` callbacks.

This is also unintuitive for new users since most callbacks include a `monitor` parameter which becomes completely useless when the `Result` object is used. On the other hand, using a `Result` object is the new recommended way to structure `{training,validation,test}_step()`.

Additionally, the approach is not future-proof since we would need to update lightning to include, say, `batchnorm_on` or whatever new technique that becomes widespread.

I believe this is an important issue that needs a solution.

Requirements:

We could use this issue to have a productive discussion about how the final design should look and what are the necessary requirements to fit most use cases. To start, I would say:

- `monitor` in `Callback`s and `{early_stop_on,checkpoint_on}` in `Result`
Own proposal

- `{early_stop_on,checkpoint_on}`
- Add `callback` to the `result.log()` function (or add `result.callback()` - both would have the same functionality) for users to save metrics on-the-run
- `monitor` type to be `Optional[Union[str, Callable[..., str]]]` so the following is possible (see the sketch below):

I am positive you guys have other great ideas.
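The code example that originally followed the last bullet is missing from this copy; a sketch of what a Callable `monitor` would enable (the call signature and metric names are assumptions for illustration):

```python
from pytorch_lightning.callbacks import EarlyStopping, ModelCheckpoint

# Today: monitor is a plain string.
es = EarlyStopping(monitor="val_loss")


# Proposed: monitor may also be a Callable resolved when the callback runs.
# Receiving the trainer here is assumed purely for illustration.
def pick_checkpoint_key(trainer) -> str:
    return "val_acc" if trainer.current_epoch % 2 == 0 else "val_loss"


ckpt = ModelCheckpoint(monitor=pick_checkpoint_key)
```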
cc: @williamFalcon @rohitgr7
#2976 (previous discussion)
#3243 (duplicate)
#2908 (related, would require these improvements)
#3254 (related, could use any callback_metric)
#3291
https://forums.pytorchlightning.ai/t/does-evalresults-also-work-with-early-stopping
Probably missing other related issues. Feel free to tag