# gft (general fine-tuning): gft_fit

Most gft programs are short (one-liners). Although gft supports most of the arguments found in the HuggingFace and PaddleNLP examples, a typical gft program specifies just four arguments: a model (`--model`), a dataset (`--data`), a metric (`--metric`), and an equation (`--eqn`).

The following example of gft_fit takes a pre-trained model, bert-base-cased (from HuggingFace), as input and writes a post-trained model to $outdir. ($outdir should point to a location with plenty of disk space, because output models are as large as the input pre-trained models.)

This example uses the glue dataset (from HuggingFace) to fine-tune (or fit) the pre-trained model for the qqp task.

```sh
gft_fit --model H:bert-base-cased \
    --data H:glue,qqp \
    --metric H:glue,qqp \
    --output_dir $outdir \
    --eqn 'classify: label ~ question1 + question2'
```

Many examples of gft_fit can be found here.

One of the design goals of gft is to make it easy to mix and match models and datasets from different suppliers. The following uses models and data from PaddleNLP (as opposed to HuggingFace).

```sh
gft_fit --model P:bert-base-cased \
    --data P:glue,qqp \
    --metric H:glue,qqp \
    --output_dir $outdir \
    --eqn 'classify: labels ~ sentence1 + sentence2'
```

The variables in the equations refer to columns in the datasets. The equations differ slightly between the two gft programs above because different suppliers of the glue data use different names for the same columns.
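To see which column names a given supplier uses, one can list them directly. The following one-liner is a minimal sketch, assuming the HuggingFace `datasets` Python package is installed:

```sh
# List the columns that the equation variables refer to
# (HuggingFace's copy of glue,qqp).
python -c "from datasets import load_dataset; print(load_dataset('glue', 'qqp', split='train').column_names)"
# -> ['question1', 'question2', 'label', 'idx']
```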

There are 4 examples of the glue tasks, one for each combination of supplier of models and supplier of datasets:

| Model | Data | Example |
|-------|------|---------|
| H     | H    | here    |
| H     | P    | here    |
| P     | H    | here    |
| P     | P    | here    |

The following will run one of these examples:

```sh
export datasets=$gft/datasets
outdir=/tmp/cola/cpkt
sh $gft/examples/fit_examples/model.HuggingFace/language/data.HuggingFace/glue/cola.sh $outdir
```
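Each of these scripts is essentially a one-line gft_fit call. As a sketch, cola.sh plausibly looks something like the following (the actual script lives in the repo; `$1` is the results directory passed on the command line):

```sh
# Plausible sketch of cola.sh (not the verbatim script):
# fine-tune a HuggingFace model on HuggingFace's glue,cola data.
gft_fit --model H:bert-base-cased \
    --data H:glue,cola \
    --metric H:glue,cola \
    --output_dir $1 \
    --eqn 'classify: label ~ sentence'
```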

To run all fit examples:

```sh
cd $gft/examples/fit_examples
find . -name '*.sh' |
while read f
do
    # $checkpoints should point to a directory with plenty of disk space
    b=$checkpoints/`dirname $f`/`basename $f .sh`
    sh $f $b/ckpt
done
```

One of the design goals of gft is to make fine-tuning accessible to as broad an audience as possible. It should be as easy to fine-tune a deep net as it is to fit a regression model.

gft equations are similar to glm (generalized linear model) equations in regression packages such as R.
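To make the analogy concrete, the qqp equation above corresponds roughly to the following R-style formula (illustrative only; gft does not call R):

```sh
# Illustrative comparison between a glm formula in R and a gft equation:
#   R:   glm(label ~ question1 + question2, family = binomial, data = qqp)
#   gft: --eqn 'classify: label ~ question1 + question2'
```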

All of the shell scripts under fit_examples take a single argument (a directory for the results).

The shell scripts under model.HuggingFace use models from HuggingFace, and shell scripts under model.PaddleHub use models from PaddleHub and/or PaddleNLP. Similarly, shell scripts under data.HuggingFace use datasets from HuggingFace, and shell scripts under data.PaddleHub use datasets from PaddleHub and/or PaddleNLP.
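Given this layout, the supplier combination for any script can be read off its path. For example, the following sketch (assuming the directory structure shown above) lists the glue examples that pair HuggingFace models with HuggingFace data:

```sh
# List glue fit examples that pair HuggingFace models with HuggingFace data.
find $gft/examples/fit_examples/model.HuggingFace \
    -path '*data.HuggingFace/glue/*' -name '*.sh'
```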