
[LLM serving] Fix timeout setting bug #2398

Closed
wants to merge 75 commits
Commits
9e7ef07
add codee
Sep 25, 2023
4aa21bd
add copyright
jiangjiajun Sep 25, 2023
6fd06f7
fix some bugs
jiangjiajun Sep 25, 2023
9f569a7
Update prefix_utils.py
jiangjiajun Sep 25, 2023
bdf2748
Update triton_model.py
jiangjiajun Sep 25, 2023
52aaffb
Update triton_model.py
jiangjiajun Sep 25, 2023
0199cac
fix tokenizer
jiangjiajun Sep 25, 2023
fa151a7
Add check for prefix len
jiangjiajun Sep 25, 2023
800c6a9
Create README.md
jiangjiajun Sep 26, 2023
91ea8bb
Create test_client.py
jiangjiajun Sep 26, 2023
5e8221e
Update task.py
jiangjiajun Sep 26, 2023
9897924
add debug log and fix ptuning
jiangjiajun Sep 26, 2023
8d1e691
update version
jiangjiajun Sep 26, 2023
388eb9b
Update triton_model.py
jiangjiajun Oct 7, 2023
68c15b6
Update README.md
jiangjiajun Oct 8, 2023
30a3beb
Support chatglm-6b (#2223)
jiangjiajun Oct 10, 2023
b96a92b
Support bloom (#2232)
jiangjiajun Oct 11, 2023
80bb8ed
Support multicards (#2234)
jiangjiajun Oct 11, 2023
986b233
[LLM] Add prefix for chatglm (#2233)
rainyfly Oct 12, 2023
9fa04c3
Update engine.py
jiangjiajun Oct 12, 2023
e6a7d4e
[LLM] Fix P-Tuning difference (#2240)
jiangjiajun Oct 13, 2023
51d8697
[LLM] Support prefix for bloom (#2237)
rainyfly Oct 16, 2023
73c1507
Support bloom prefix (#2245)
rainyfly Oct 17, 2023
528e976
[LLM] Fix serving (#2246)
jiangjiajun Oct 18, 2023
1cbbaee
fix chatglm
jiangjiajun Oct 18, 2023
2f2c824
Update config.py
jiangjiajun Oct 18, 2023
66a4897
[LLM] Support bloom prefix (#2248)
rainyfly Oct 19, 2023
4d956d3
[LLM] Add simple client
jiangjiajun Oct 19, 2023
a5a261b
add requirements
jiangjiajun Oct 19, 2023
4c21588
[LLM] Support dynamic batching for chatglm (#2251)
jiangjiajun Oct 20, 2023
8ff24d6
[LLM] Support dybatch for bloom (#2255)
jiangjiajun Oct 20, 2023
3a4f8a9
remove +1 for chatglm
jiangjiajun Oct 20, 2023
e5da0f1
Update setup.py
jiangjiajun Oct 20, 2023
6da9555
Add check for prefix and compatible with lite
jiangjiajun Oct 24, 2023
10eefcb
add requires
jiangjiajun Oct 24, 2023
fb0f276
Support gpt
jiangjiajun Oct 24, 2023
7193337
Fix triton model problem
jiangjiajun Oct 25, 2023
70f8469
Update version
jiangjiajun Oct 25, 2023
b116e3e
Add some tools
jiangjiajun Oct 26, 2023
b86524e
test
Nov 6, 2023
2e6bc1a
Update triton_model.py
jiangjiajun Nov 7, 2023
cabebc3
Update setup.py
jiangjiajun Nov 7, 2023
cdc0ff2
Update README.md
jiangjiajun Nov 7, 2023
7c23864
test FastDeploy
Nov 7, 2023
cca470f
Merge branch 'llm' into llm
karagg Nov 7, 2023
fb5d5c5
test
Nov 8, 2023
2d2274c
[LLM] Add ci test scripts (#2272)
karagg Nov 9, 2023
a55837e
delete run.sh
Nov 14, 2023
f9c8581
Merge branch 'PaddlePaddle:llm' into llm
karagg Nov 14, 2023
1f76abf
delete run.sh
Nov 14, 2023
9c6b2de
update run.sh
Nov 14, 2023
ceb49a4
update run.sh ci.py
Nov 14, 2023
9499199
update ci.py
Nov 15, 2023
8bf70a1
update ci.py
Nov 15, 2023
6e15209
[LLM]update ci test script (#2285)
karagg Nov 15, 2023
be12232
debug
Nov 15, 2023
f884c1a
debug
Nov 15, 2023
57e7608
Merge pull request #2286 from karagg/llm
Zeref996 Nov 15, 2023
7b80d70
debug
Nov 15, 2023
bb68a7e
Merge pull request #2288 from karagg/llm
Zeref996 Nov 16, 2023
6cb1474
debug
Nov 16, 2023
71652e3
Merge pull request #2289 from karagg/llm
Zeref996 Nov 16, 2023
261e519
update run.sh
Nov 17, 2023
836d21f
add comment
Nov 20, 2023
87f53ea
do not merge
Nov 20, 2023
66c4563
Rename test_max_batch_size.sh to test_max_batch_size.py
jiangjiajun Nov 23, 2023
79e6a1e
update
Dec 4, 2023
3376284
Merge pull request #2291 from karagg/llm
Zeref996 Dec 5, 2023
fda8c37
Improve robustness for llm (#2321)
rainyfly Dec 14, 2023
cc89731
detail log for llm (#2325)
rainyfly Dec 14, 2023
67ca253
Fix a bug for llm serving (#2326)
rainyfly Dec 14, 2023
7bddc67
Add warning for server hangs (#2333)
rainyfly Dec 27, 2023
c18abc6
Add fastapi support (#2371)
rainyfly Feb 27, 2024
a843a3c
Add fastapi support (#2383)
rainyfly Feb 28, 2024
6b127d5
Fix timeout setting bug
rainyfly Mar 6, 2024
7 changes: 2 additions & 5 deletions llm/test/ci.py
@@ -70,8 +70,6 @@ def main():
# three cases: bs=1, bs=4, and bs=4 with stop=2
opts = ['bs1', 'bs4', 'bs4-dy']

# clear shared memory
os.system(command='rm -rf /dev/shm')
# create the res directory for storing results; if it already exists, delete the old results
res_path = f'{fastdeploy}/llm/res'
if os.path.exists(res_path):
@@ -126,7 +124,7 @@ def main():
no_PT.append(diff_rate)
os.system(command=f"rm -f {res_path}/*")
os.system(command=f"rm -f real_time_save.temp_ids_rank_0_step_*")
os.system(command="rm -rf /dev/shm/*")

with open(out_path, 'a+') as f:
#f.write(f"{noptuning_model_name[model_index]}\t\t{no_PT[0]}\t\t{no_PT[1]}\t\t{no_PT[2]}\n")
f.write('%-30s%-30s%-30s%-30s\n' %
@@ -166,7 +164,7 @@ def main():
PT.append(diff_rate)
os.system(command=f"rm -f {res_path}/*")
os.system(command=f"rm -f real_time_save.temp_ids_rank_0_step_*")
os.system(command="rm -rf /dev/shm/*")

with open(out_path, 'a+') as f:
#f.write(f"{ptuning_model_name[model_index]}\t\t否\t\t{PT[0]}\t\t{PT[1]}\t\t{PT[2]}\n")
f.write('%-30s%-30s%-30s%-30s%-30s\n' % (
@@ -193,7 +191,6 @@ def main():
pre_PT.append(diff_rate)
os.system(command=f"rm -f {res_path}/*")
os.system(command=f"rm -f real_time_save.temp_ids_rank_0_step_*")
os.system(command="rm -rf /dev/shm/*")

with open(out_path, 'a+') as f:
#f.write(f"{ptuning_model_name[model_index]}\t\t是\t\t{pre_PT[0]}\t\t{pre_PT[1]}\t\t{pre_PT[2]}\n")
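For reference, the per-run cleanup that ci.py performs through its `os.system` calls (emptying the `res` directory and deleting the intermediate step dumps) could be sketched in plain Python. This helper is illustrative only, not part of the PR; the directory name and the `real_time_save.temp_ids_rank_0_step_*` prefix follow the diff above:

```python
import glob
import os
import shutil


def clean_run_artifacts(res_path: str) -> None:
    """Remove per-run result files and intermediate step dumps,
    mirroring the os.system() cleanup calls in llm/test/ci.py."""
    # Empty the result directory between test runs, then recreate it.
    if os.path.exists(res_path):
        shutil.rmtree(res_path)
    os.makedirs(res_path, exist_ok=True)
    # Drop intermediate token-id dumps written during the serving test.
    for path in glob.glob("real_time_save.temp_ids_rank_0_step_*"):
        os.remove(path)
```

Using stdlib calls instead of shelling out to `rm` keeps the cleanup portable and avoids silently ignoring failures.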
7 changes: 4 additions & 3 deletions llm/test/run.sh
@@ -2,7 +2,8 @@

current_directory=$PWD

# environment setup, mainly installing the paddlenlp operators
# environment setup, mainly installing wget and the paddlenlp operators
pip install wget
cd ${paddlenlp}/csrc
${py_version} setup_cuda.py install --user

@@ -40,8 +41,8 @@ done
mkdir inference_model
cd ${paddlenlp}/llm
for((i=0;i<${#export_model_name[*]};i++));do
${py_version} export_model.py --model_name_path ${export_model_name[i]} --output_path ${current_directory}/inference_model/${noptuning_model_name[i]} --dtype float16 --inference_model
${py_version} export_model.py --model_name_path ${export_model_name[i]} --output_path ${current_directory}/inference_model/${ptuning_model_name[i]} --dtype float16 --inference_model --export_precache 1
${py_version} export_model.py --model_name_or_path ${export_model_name[i]} --output_path ${current_directory}/inference_model/${noptuning_model_name[i]} --dtype float16 --inference_model
${py_version} export_model.py --model_name_or_path ${export_model_name[i]} --output_path ${current_directory}/inference_model/${ptuning_model_name[i]} --dtype float16 --inference_model --export_precache 1
done
cd $current_directory
# start the tests
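The run.sh fix above renames `--model_name_path` to `--model_name_or_path`, the flag that the export script actually declares. With an argparse-style CLI, a misspelled long option that is not a prefix of any declared flag is simply rejected rather than matched, which is presumably why the old invocation failed. A minimal sketch (the parser here is a stand-in, not the real export_model.py):

```python
import argparse

# A stand-in parser declaring the flag the way export_model.py spells it.
parser = argparse.ArgumentParser()
parser.add_argument("--model_name_or_path")

# The corrected invocation parses cleanly.
ok = parser.parse_args(["--model_name_or_path", "chatglm-6b"])
assert ok.model_name_or_path == "chatglm-6b"

# "--model_name_path" is not a prefix of "--model_name_or_path", so argparse
# cannot abbreviation-match it; parse_known_args leaves it (and its value)
# in the unknown remainder, and parse_args would exit with an error.
args, unknown = parser.parse_known_args(["--model_name_path", "chatglm-6b"])
```

Silent flag typos like this are a common failure mode in shell loops, since the export simply errors out and later steps run against missing model directories.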