The code summarization dataset is located in `data/code_summarization_data.jsonl`. The fields of the data are explained below:
field | description |
---|---|
id | the local id of items in the dataset |
source_code | a code snippet that performs some functionality |
lang_cluster | the programming language of the source code |
human_summarization | a reference natural language summarization of the source code |
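To sanity-check the data, you can load the JSONL file and inspect these fields with a few lines of Python (a minimal sketch; the field names follow the table above):

```python
import json

# Each line of the JSONL file is one JSON object with the fields above.
with open("data/code_summarization_data.jsonl", "r", encoding="utf-8") as f:
    dataset = [json.loads(line) for line in f]

sample = dataset[0]
print(sample["id"], sample["lang_cluster"])
print(sample["source_code"][:200])    # first 200 characters of the snippet
print(sample["human_summarization"])  # reference summary
```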
To set up the environment:

```
cd code_summarization
```

- Install `python>=3.9` (we use `python==3.9`).
- Install `torch` (we suggest `torch==2.1.1`) based on your CUDA version.
- Run `pip install -r requirements.txt`.
Run the inference scripts to get the inference results of the targeted LLMs. The inference results `code_summ_infer_{model_name}.jsonl` will be saved under the `inference/results` folder. The inference logs `code_summ_infer_{model_name}.log` will be saved under the `inference/logs` folder.
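After a run finishes, you can quickly list what was produced (a minimal sketch relying only on the folder layout and file-name pattern described above):

```python
from pathlib import Path

# Count items in each inference result file.
for result in sorted(Path("inference/results").glob("code_summ_infer_*.jsonl")):
    n_items = sum(1 for _ in result.open(encoding="utf-8"))
    print(f"{result.name}: {n_items} items")

# List the matching log files.
for log in sorted(Path("inference/logs").glob("code_summ_infer_*.log")):
    print(log.name)
```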
We provide inference scripts for the following closed-source LLMs:
Model Name | Model Version | Script Name |
---|---|---|
PaLM 2 | text-bison-001 | run_palm.py |
GPT-4 | gpt-4-0613 | run_gpt.py |
GPT-3.5 | gpt-3.5-turbo-0613 | run_gpt.py |
For PaLM 2, you can run the following command, replacing `google_api_key` with your own PaLM API key:
```
python run_palm.py --api_key google_api_key --data_load_name data/code_summarization_data.jsonl --result_save_name code_summ_infer_palm.jsonl --log_file_name code_summ_infer_palm.log
```
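For reference, a call to the `text-bison-001` model with the legacy `google-generativeai` package might look like the sketch below; the prompt wording is purely illustrative and not necessarily what `run_palm.py` uses:

```python
import google.generativeai as palm

palm.configure(api_key="google_api_key")  # your PaLM API key

# Illustrative prompt; the actual prompt in run_palm.py may differ.
code = "def add(a, b):\n    return a + b"
response = palm.generate_text(
    model="models/text-bison-001",
    prompt=f"Summarize the following code in one sentence:\n\n{code}",
)
print(response.result)  # the generated summary text
```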
For GPT-4 and GPT-3.5, you can run the following commands, replacing `openai_api_key` with your own OpenAI API key and `model_version` with the specific model version:
```
python run_gpt.py --api_key openai_api_key --model gpt-4-0613 --data_load_name data/code_summarization_data.jsonl --result_save_name code_summ_infer_gpt4.jsonl --log_file_name code_summ_infer_gpt4.log
python run_gpt.py --api_key openai_api_key --model gpt-3.5-turbo-0613 --data_load_name data/code_summarization_data.jsonl --result_save_name code_summ_infer_gpt3.jsonl --log_file_name code_summ_infer_gpt3.log
```
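For reference, an inference call to these 0613-era models with the pre-1.0 `openai` Python package might look like this sketch; the prompt is illustrative and `run_gpt.py` may build it differently:

```python
import openai

openai.api_key = "openai_api_key"  # your OpenAI API key

code = "def add(a, b):\n    return a + b"
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-0613",
    messages=[{
        "role": "user",
        "content": f"Summarize the following code in one sentence:\n\n{code}",
    }],
    temperature=0,
)
print(response["choices"][0]["message"]["content"])  # the generated summary
```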
We provide inference scripts for the following open-source LLMs:
Model Name | Model Checkpoint | Script Name |
---|---|---|
Code LLaMA | codellama/CodeLlama-34b-Instruct-hf | run_codellama.py |
LLaMA 2 | meta-llama/Llama-2-70b-chat-hf | run_llama2.py |
StarCoder | HuggingFaceH4/starchat-beta | run_starcoder.py |
Vicuna | lmsys/vicuna-13b-v1.5-16k | run_vicuna.py |
WizardCoder | WizardLM/WizardCoder-15B-V1.0 | run_wizardcoder.py |
For the open-source HuggingFace models, you can run the following command, replacing `huggingface_access_token` with your own HuggingFace access token, `cache_dir` with the path to a directory in which the downloaded pretrained model and tokenizer should be cached, and `model_checkpoint` with the specific model checkpoint:
```
python inference/run_{model_name}.py --access_token huggingface_access_token --cache_dir cache_dir --checkpoint model_checkpoint --data_load_name data/code_summarization_data.jsonl --result_save_name code_summ_infer_{model_name}.jsonl --log_file_name code_summ_infer_{model_name}.log
```
An example of running the Code LLaMA model is:
```
python run_codellama.py --access_token huggingface_access_token --cache_dir cache_dir --checkpoint codellama/CodeLlama-34b-Instruct-hf --data_load_name data/code_summarization_data.jsonl --result_save_name code_summ_infer_codellama.jsonl --log_file_name code_summ_infer_codellama.log
```
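Conceptually, these scripts load the checkpoint via `transformers`; the sketch below shows how the `--access_token`, `--cache_dir`, and `--checkpoint` arguments would typically map onto that library (an illustration, not the repo's exact code):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "codellama/CodeLlama-34b-Instruct-hf"

# cache_dir sets where weights are cached; token unlocks gated checkpoints.
tokenizer = AutoTokenizer.from_pretrained(
    checkpoint, cache_dir="cache_dir", token="huggingface_access_token")
model = AutoModelForCausalLM.from_pretrained(
    checkpoint, cache_dir="cache_dir", token="huggingface_access_token",
    torch_dtype=torch.float16, device_map="auto")

prompt = "Summarize the following code in one sentence:\ndef add(a, b):\n    return a + b"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```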
```
cd ../evaluator
```

- Execute the command `python eval_code_summarization.py --llm_infer_result infer_file`, substituting `infer_file` with the name of the LLM's inference file (for example, `code_summ_infer_palm.jsonl`). This will generate scores for the targeted LLM's inference results, which will be saved in the `evaluator/summ_scores/` directory.
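The metrics themselves are defined in `eval_code_summarization.py`. Purely as an illustration of reference-based summary scoring, the sketch below compares generated summaries against the `human_summarization` references with corpus BLEU via `sacrebleu`; the metric choice, the relative paths, and the inference-output field name `llm_summarization` are all assumptions, not the repo's actual settings:

```python
import json

import sacrebleu

# Map dataset ids to their reference summaries.
with open("../data/code_summarization_data.jsonl", encoding="utf-8") as f:
    refs = {item["id"]: item["human_summarization"] for item in map(json.loads, f)}

# Load one model's inference results (hypothetical path and field name).
with open("../inference/results/code_summ_infer_palm.jsonl", encoding="utf-8") as f:
    preds = [json.loads(line) for line in f]

hypotheses = [p["llm_summarization"] for p in preds]  # hypothetical field
references = [[refs[p["id"]] for p in preds]]         # one reference per item

print(sacrebleu.corpus_bleu(hypotheses, references).score)
```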