# Finetuning Mistral-7B using LoRA and DeepSpeed

In this example we finetune Mistral-7B using LoRA and DeepSpeed, running on two 40 GB A100 GPUs.

To get started, first install Determined on your local machine:

pip install determined

Then launch the LoRA finetuning experiment (`det e` is shorthand for `det experiment`; the trailing `.` uploads the current directory as the experiment context):

det e create lora.yaml . 

You can view the actual training code in `finetune.py`.
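For orientation, here is a minimal sketch of what a LoRA setup with Hugging Face `transformers` and `peft` typically looks like. The checkpoint name, target modules, and hyperparameter values below are illustrative assumptions, not necessarily what `finetune.py` uses.

```python
# Minimal LoRA sketch (illustrative; see finetune.py for the actual training code).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "mistralai/Mistral-7B-Instruct-v0.1"  # assumed checkpoint; finetune.py may differ
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Wrap the base model with low-rank adapters; only the adapter weights are trained.
lora_config = LoraConfig(
    r=8,                                   # rank of the low-rank update matrices (assumed)
    lora_alpha=16,                         # scaling factor (assumed)
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt (assumed)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # shows how few parameters LoRA actually trains
```

Only the low-rank adapter weights receive gradients, which keeps the trainable parameter count and optimizer state small.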

## Configuration

Change configuration options in `lora.yaml`. Some important options are:

- `slots_per_trial`: the number of GPUs to use.
- `dataset_subset`: the difficulty subset to train on.
- `per_device_train_batch_size`: the batch size per GPU.

The results in our blog post were obtained with `per_device_train_batch_size: 1` and `per_device_eval_batch_size: 4`.
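For reference, a Determined experiment configuration wiring these options together might look roughly like the sketch below. Apart from the batch sizes above and the two-GPU `slots_per_trial`, the field values here are assumptions, so treat the actual `lora.yaml` in this directory as the source of truth.

```yaml
# Illustrative sketch only -- see lora.yaml for the real configuration.
name: mistral-7b-lora            # hypothetical experiment name
entrypoint: python3 finetune.py  # assumed entrypoint
resources:
  slots_per_trial: 2             # two A100 GPUs, as in our runs
hyperparameters:
  dataset_subset: easy           # hypothetical subset value
  per_device_train_batch_size: 1
  per_device_eval_batch_size: 4
searcher:
  name: single
  metric: eval_loss              # assumed metric name
```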

DeepSpeed configuration files are in the `ds_configs` folder.
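The exact contents vary per file, but a DeepSpeed ZeRO configuration is built from standard keys like the ones below; the stage and precision settings shown are assumptions for illustration, not a copy of any file in `ds_configs`. The `"auto"` values let the Hugging Face Trainer fill in settings from its own arguments.

```json
{
  "zero_optimization": { "stage": 3 },
  "bf16": { "enabled": true },
  "gradient_accumulation_steps": "auto",
  "train_micro_batch_size_per_gpu": "auto"
}
```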