We finetune Mistral-7B using LoRA and DeepSpeed. We ran the LoRA finetuning on two 40 GB A100 GPUs.
To get started, first install Determined on your local machine:
pip install determined
Then finetune with LoRA:
det e create lora.yaml .
You can view the actual training code in `finetune.py`.
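For orientation, here is a minimal, hypothetical sketch of how LoRA adapters can be attached to Mistral-7B with Hugging Face `peft`. The base checkpoint name, rank, and target modules below are illustrative assumptions; the actual `finetune.py`, which integrates with Determined and DeepSpeed, may differ.

```python
# Hypothetical sketch: attach LoRA adapters to Mistral-7B with Hugging Face peft.
# Hyperparameters and target modules are illustrative, not this repo's actual values.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "mistralai/Mistral-7B-v0.1"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

lora_config = LoraConfig(
    r=8,                                  # LoRA rank (illustrative)
    lora_alpha=16,                        # scaling factor (illustrative)
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()        # only the adapter weights are trainable
```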
Change configuration options in `lora.yaml`. Some important options are:

- `slots_per_trial`: the number of GPUs to use.
- `dataset_subset`: the difficulty subset to train on.
- `per_device_train_batch_size`: the batch size per GPU.
The results in our blog post were obtained using `per_device_train_batch_size: 1` and `per_device_eval_batch_size: 4`, as shown in the configuration sketch below.
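As a rough guide, a Determined experiment configuration with these options might look like the following. The entrypoint, searcher settings, and the `dataset_subset` value are assumptions for illustration; check the actual `lora.yaml` in this repo for the authoritative layout.

```yaml
# Hypothetical sketch of a Determined experiment config; the real lora.yaml may differ.
name: mistral-7b-lora
entrypoint: python3 finetune.py    # assumed entrypoint
resources:
  slots_per_trial: 2               # number of GPUs to use
hyperparameters:
  dataset_subset: easy             # assumed value; the difficulty subset to train on
  per_device_train_batch_size: 1   # values used for the blog post results
  per_device_eval_batch_size: 4
searcher:
  name: single
  metric: eval_loss                # assumed metric name
  max_length:
    batches: 1000                  # illustrative training length
```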
DeepSpeed configuration files are in the `ds_configs` folder.
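For reference, a typical DeepSpeed ZeRO configuration used with the Hugging Face Trainer looks like the sketch below, with `"auto"` values filled in from the training arguments. This is an assumption about the general shape of such files; the actual files in `ds_configs` may use different stages or options.

```json
{
  "bf16": { "enabled": "auto" },
  "zero_optimization": {
    "stage": 3,
    "offload_optimizer": { "device": "cpu", "pin_memory": true },
    "overlap_comm": true,
    "contiguous_gradients": true
  },
  "gradient_accumulation_steps": "auto",
  "train_micro_batch_size_per_gpu": "auto",
  "train_batch_size": "auto"
}
```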