[SW-198498] Pass "lazy_mode" arg from GaudiTrainer to GaudiLlamaModel
Problem: the TrainingArguments.use_lazy_mode setting is never seen by GaudiLlamaModel.

Cause: GaudiTrainer did not pass the lazy_mode argument in the model inputs.

Solution: add the missing "lazy_mode" entry to inputs in
GaudiTrainer._inner_training_loop.
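
For context, a minimal sketch of how a Gaudi model's forward can consume the
flag (this is not the actual GaudiLlamaModel source; embed_tokens and layers
are placeholders). On HPU, lazy mode defers op execution until
htcore.mark_step() is called, which is why the model needs to know whether
lazy mode is active:

# Hypothetical sketch, not the actual GaudiLlamaModel code: it shows how a
# model forward might consume the "lazy_mode" entry that GaudiTrainer now
# places in `inputs`. `embed_tokens` and `layers` are illustrative.
from typing import Optional

import torch
import habana_frameworks.torch.core as htcore  # real HPU lazy-mode API


class SketchDecoder(torch.nn.Module):
    def __init__(self, embed_tokens, layers):
        super().__init__()
        self.embed_tokens = embed_tokens
        self.layers = torch.nn.ModuleList(layers)

    def forward(self, input_ids, lazy_mode: Optional[bool] = True, **kwargs):
        hidden_states = self.embed_tokens(input_ids)
        for layer in self.layers:
            hidden_states = layer(hidden_states, **kwargs)
            if lazy_mode:
                # In lazy mode, queued HPU ops are compiled and launched at
                # each mark_step(); stepping per layer keeps graphs bounded.
                htcore.mark_step()
        return hidden_states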

Change-Id: I956023956af3d7962b24be53ec74d20e6bb56bd6
mlapinskix authored and astachowiczhabana committed Sep 27, 2024
1 parent b6a2f68 commit 6f41803
Showing 1 changed file with 3 additions and 1 deletion.
optimum/habana/transformers/trainer.py:

@@ -982,7 +982,9 @@ def hpu_deepspeed_checkpointing(function, *checkpoint_args, use_reentrant: Optional
                     inputs["flash_attention_recompute"] = True
                 if self.model.generation_config.flash_attention_causal_mask:
                     inputs["flash_attention_causal_mask"] = True
-
+                if self.model.config is not None:
+                    if self.model.config.model_type in ["llama", "qwen2", "mistral", "starcoder2"]:
+                        inputs["lazy_mode"] = args.use_lazy_mode
                 # TODO: keep syncs for fast DDP?
                 with self.accelerator.accumulate(model):
                     tr_loss_step = self.training_step(model, inputs)
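
After this change, the setting is picked up end to end for the listed model
types (llama, qwen2, mistral, starcoder2). A usage sketch from the user side,
where model and train_dataset are placeholders:

# Usage sketch; `model` and `train_dataset` are placeholders, and a Gaudi
# configuration (gaudi_config / gaudi_config_name) is also needed in
# practice but omitted here for brevity.
from optimum.habana import GaudiTrainer, GaudiTrainingArguments

args = GaudiTrainingArguments(
    output_dir="./out",
    use_habana=True,     # run training on HPU
    use_lazy_mode=True,  # the flag this commit forwards to the model
)

trainer = GaudiTrainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()  # each step now sends inputs["lazy_mode"]=True to the model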
