I have no idea what this line (referenced below) is for, but it unwraps the DDP module, so training becomes unsynchronized: there is no gradient communication in multi-GPU training, and each node trains independently on its own part of the data.
I confirmed this by adding a sleep to one of the workers and observing that the main training process did not hang waiting for it. After deleting this line, the job synchronized properly.
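
To illustrate the reasoning behind the sleep test, here is a toy simulation using only the Python standard library. It does not use PyTorch or the actual training script: threads stand in for workers, and a `threading.Barrier` is an assumed stand-in for the gradient all-reduce that DDP performs on each step. The helper name `run_workers` and its parameters are hypothetical.

```python
import threading
import time


def run_workers(synchronized, delay=0.3, num_workers=2, steps=1):
    """Return per-worker elapsed seconds; worker 1 is artificially slow."""
    barrier = threading.Barrier(num_workers)
    elapsed = [0.0] * num_workers

    def worker(rank):
        start = time.perf_counter()
        for _ in range(steps):
            if rank == 1:
                # The injected sleep from the experiment.
                time.sleep(delay)
            if synchronized:
                # Stand-in (assumption) for the per-step gradient all-reduce:
                # every worker blocks until all workers reach this point.
                barrier.wait()
        elapsed[rank] = time.perf_counter() - start

    threads = [
        threading.Thread(target=worker, args=(rank,))
        for rank in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return elapsed


if __name__ == "__main__":
    print("synchronized:  ", run_workers(synchronized=True))
    print("unsynchronized:", run_workers(synchronized=False))
```

With synchronization, worker 0 is held up by the sleeping worker. Without it, worker 0 finishes almost immediately, which matches the "no hang" observation.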
sd-scripts/fine_tune.py
Line 247 in 2a23713