
RewardTrainer fails with FSDP #1195

Closed
mgerstgrasser opened this issue Jan 9, 2024 · 1 comment · Fixed by #1196
@mgerstgrasser
Contributor

I've just run into an odd issue with FSDP & RewardTrainer. It seems that when using FSDP, the output of the (sequence classification) model's forward function isn't as expected.
Normally, it returns a SequenceClassifierOutputWithPast where logits contains a tensor with the logits, and loss is empty or contains some sort of generator object.
When using FSDP, I'm instead getting a dict inside the loss field (and oddly enough that dict again contains a single key logits, although that's not the issue).

Not sure why this happens, but the net effect is that when the RewardTrainer tries to get the logits through model(...)[0] (see here), in the non-FSDP case it gets the logits, while in the FSDP case it gets the dict from the now non-empty loss field, and then fails a few lines later.

Two questions:

  1. This is easily fixed by doing model(...)["logits"] instead (see the sketch below this list). Any problem with doing that?
  2. Purely out of curiosity, does anyone know why this behaves differently with FSDP?
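Concretely, here's a minimal sketch of what the fix could look like. The helper name, batch keys, and the pairwise loss are my assumptions from reading the reward modeling code, not a verbatim patch:

```python
import torch.nn as nn

# Hypothetical helper mirroring the first steps of RewardTrainer.compute_loss;
# the batch keys are assumptions based on the reward modeling example.
def compute_pairwise_loss(model, inputs):
    rewards_chosen = model(
        input_ids=inputs["input_ids_chosen"],
        attention_mask=inputs["attention_mask_chosen"],
    )["logits"]  # was: model(...)[0], which breaks under FSDP
    rewards_rejected = model(
        input_ids=inputs["input_ids_rejected"],
        attention_mask=inputs["attention_mask_rejected"],
    )["logits"]
    # pairwise log-sigmoid loss, as in the RewardTrainer
    return -nn.functional.logsigmoid(rewards_chosen - rewards_rejected).mean()
```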

To reproduce: Run examples/scripts/reward_modeling.py with accelerate + FSDP.

forward output in a single process:

```
SequenceClassifierOutputWithPast(loss=<generator object gather.<locals>.gather_map.<locals>.<genexpr> at 0x15360f993040>, logits=tensor([[...]], device='cuda:0', grad_fn=<GatherBackward>), past_key_values=None, hidden_states=None, attentions=None)
```

And in FSDP:

```
SequenceClassifierOutputWithPast(loss={'logits': tensor([[...]], device='cuda:1', grad_fn=<ToCopyBackward0>)}, logits=tensor([[...]], device='cuda:1', grad_fn=<ToCopyBackward0>), past_key_values=None, hidden_states=None, attentions=None)
```
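For the mechanics of the failure (separate from why FSDP populates loss in the first place, which I still don't know): ModelOutput's integer indexing goes through to_tuple(), which skips fields that are None, so [0] means "first populated field" rather than "logits". A small self-contained sketch with dummy tensors, mimicking the two outputs above:

```python
import torch
from transformers.modeling_outputs import SequenceClassifierOutputWithPast

# Non-FSDP-like case: loss unset, so [0] falls through to the logits tensor.
out = SequenceClassifierOutputWithPast(logits=torch.ones(1, 2))
print(out[0] is out["logits"])  # True

# FSDP-like case: loss populated (here with a dummy dict mimicking the
# output above), so [0] now returns the loss field instead of the logits.
out_fsdp = SequenceClassifierOutputWithPast(
    loss={"logits": torch.ones(1, 2)},
    logits=torch.ones(1, 2),
)
print(out_fsdp[0])         # {'logits': tensor([[1., 1.]])}
print(out_fsdp["logits"])  # tensor([[1., 1.]]) -- key lookup is unambiguous
```

So indexing by key gives the logits regardless of whether loss ends up populated, which is why it seems like the safer option.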
@younesbelkada
Contributor

Thanks for the deep dive! I left a suggestion on the PR, let me know what you think.
