Infer failed: Unable to parse 'data': Shape does not match true shape of 'data' field in generate endpoint #369

bprus · 2024-03-08T13:54:54Z

System Info

CPU architecture: x86_64
GPU: NVIDIA A10 24GB
TensorRT-LLM: v0.8.0 (docker build via make -C docker release_build CUDA_ARCHS="86-real")
Triton Inference Server: r24.02 (docker from NGC)
OS: Ubuntu 22.04

Who can help?

No response

Information

The official example scripts
My own modified scripts

Tasks

An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)

Reproduction

I followed the official examples for the Llama model: https://github.com/NVIDIA/TensorRT-LLM/tree/v0.8.0/examples/llama
Setup works, and requests to the ensemble model succeed:

curl -X POST localhost:8000/v2/models/ensemble/generate -d '{"text_input": "What is", "max_tokens": 1000}'

and the response is:

{"context_logits":0.0,"cum_log_probs":0.0,"generation_logits":0.0,"model_name":"ensemble","model_version":"1","output_log_probs":[0.0,...,0.0],"sequence_end":false,"sequence_id":0,"sequence_start":false,"text_output":"the purpose of the meeting? What are the key issues to be discussed? What are the desired outcomes or decisions to be made?\n\n2. Identify the key stakeholders: Who are the key people that need to be involved in the meeting? What are their roles and responsibilities? What are their interests and perspectives?\n\n3. Determine the meeting format: Will the meeting be formal or informal? Will it be a presentation-style meeting or a discussion-style meeting? What is the appropriate level of formality and structure for the meeting?\n\n4. Choose a suitable location: Where will the meeting be held? Is the location easily accessible and comfortable for all attendees?\n\n5. Establish a clear agenda: What specific topics will be discussed during the meeting? What are the desired outcomes or decisions to be made? What are the key points to be covered?\n\n6. Set a time limit: How long will the meeting last? What is the appropriate length of time for the meeting?\n\n7. Identify any necessary materials: What materials or information will be needed during the meeting? Will any presentations or handouts be needed?\n\n8. Choose a suitable time: What is the best time for the meeting? Will all attendees be available at that time?\n\n9. Establish a clear communication plan: How will the meeting be conducted? Will it be in person, via video conference, or via phone? What is the appropriate communication method for the meeting?\n\n10. Identify any necessary follow-up actions: What actions need to be taken after the meeting? Who is responsible for taking these actions? What are the timelines for these actions?\n\nBy following these steps, you can ensure that your meetings are well-planned, productive, and effective."}

I'm also able to run the preprocessing model:

curl -X POST localhost:8000/v2/models/preprocessing/generate -d '{"QUERY": "What is", "REQUEST_OUTPUT_LEN": 1000}'

and the response is:

{"BAD_WORDS_IDS":[],"EMBEDDING_BIAS":[],"INPUT_ID":[1724,338],"OUT_END_ID":2,"OUT_PAD_ID":2,"REQUEST_INPUT_LEN":2,"REQUEST_OUTPUT_LEN":1000,"STOP_WORDS_IDS":[],"model_name":"preprocessing","model_version":"1"}
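The preprocessing output above carries the values that the tensorrt_llm request in this issue uses. As a rough sketch of that field mapping (the helper name `to_trtllm_generate_body` is hypothetical, and the mapping only mirrors the field names used in this issue, not a documented contract):

```python
import json

def to_trtllm_generate_body(pre: dict) -> dict:
    """Map a preprocessing response to a tensorrt_llm generate body.

    Hypothetical helper: the key names are copied from the requests and
    responses shown in this issue.
    """
    return {
        "input_ids": pre["INPUT_ID"],
        "input_lengths": pre["REQUEST_INPUT_LEN"],
        "request_output_len": pre["REQUEST_OUTPUT_LEN"],
    }

# The preprocessing response from above, trimmed to the fields that matter here.
pre = json.loads(
    '{"INPUT_ID":[1724,338],"REQUEST_INPUT_LEN":2,"REQUEST_OUTPUT_LEN":1000}'
)
body = to_trtllm_generate_body(pre)
print(json.dumps(body))
# {"input_ids": [1724, 338], "input_lengths": 2, "request_output_len": 1000}
```

The printed body is the same JSON that is sent to the tensorrt_llm generate endpoint in the next step.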

Then, when I try to query the tensorrt_llm model directly:

curl -i -X POST localhost:8000/v2/models/tensorrt_llm/generate -d '{"input_ids": [1724, 338], "input_lengths": 2, "request_output_len": 1000}'

I get an error:

{"error":"Unable to parse 'data': Shape does not match true shape of 'data' field"}

Triton is running with debug logging enabled, but the logs contain no further information:

I0308 13:45:35.029834 96 http_server.cc:4523] HTTP request: 2 /v2/models/tensorrt_llm/generate
I0308 13:45:35.029876 96 model_lifecycle.cc:336] GetModel() 'tensorrt_llm' version -1
I0308 13:45:35.029884 96 model_lifecycle.cc:294] VersionStates() 'tensorrt_llm'
I0308 13:45:35.029980 96 model_lifecycle.cc:336] GetModel() 'tensorrt_llm' version -1
I0308 13:45:35.030004 96 http_server.cc:3241] [request id: <id_unknown>] Infer failed: Unable to parse 'data': Shape does not match true shape of 'data' field

I tried many different request variants (wrapping values in lists, etc.) without success.

What I found is that it works if input_ids has a single element:

curl -i -X POST localhost:8000/v2/models/tensorrt_llm/generate -d '{"input_ids": [1724], "input_lengths": 1, "request_output_len": 1000}'

with response:

{"context_logits":0.0,"cum_log_probs":0.0,"generation_logits":0.0,"model_name":"tensorrt_llm","model_version":"1","output_ids":[1,13,13,29902,505,263,1108,411,590,6601,29889,739,29915,29879,263,360,514,512,1028,381,265,29871,29896,29945,29871,29945,29900,29900,29900,3652,19022,322,372,29915,29879,1048,29871,29906,2440,2030,29889,450,1108,338,393,278,12247,1250,4366,338,451,1985,29889,306,505,1898,278,9670,18835,1251,11427,6576,763,10092,1259,278,12247,11785,2264,29892,8454,278,12247,6055,29892,322,1584,15270,278,12247,7156,29892,541,3078,2444,304,664,29889,13,13,29902,505,884,1898,773,263,1422,12247,3578,7047,29892,541,393,2086,1838,29915,29873,664,29889,306,505,884,7120,278,350,25925,6055,322,278,12247,6055,297,278,350,25925,526,731,304,376,2951,1642,306,505,884,1898,773,263,1422,3081,2752,29892,541,393,2086,1838,29915,29873,664,29889,13,13,29902,626,472,590,12309,29915,29879,1095,322,306,1016,29915,29873,1073,825,304,437,29889,3529,1371,592,29889,13,13,5634,13,13,18567,727,29991,8221,304,8293,393,596,19022,29915,29879,12247,1250,4366,338,451,1985,29889,16564,373,278,2472,366,29915,345,4944,29892,372,2444,763,263,12837,2228,29889,2266,526,777,7037,6851,366,1033,1018,29901,13,13,29896,29889,5399,278,12247,1826,2801,29901,8561,1854,278,12247,1826,2801,338,11592,368,18665,3192,297,322,451,23819,29889,960,372,29915,29879,23819,29892,1018,18665,3460,372,297,1449,29889,13,29906,29889,5399,278,12247,8525,21387,29901,450,12247,8525,21387,338,14040,363,18750,5367,3081,322,848,304,278,12247,29889,960,278,8525,21387,338,5625,4063,470,23819,29892,372,508,4556,5626,411,278,12247,1250,4366,29889,5399,278,8525,21387,322,1207,1854,372,29915,29879,11592,368,6631,304,278,5637,3377,29889,13,29941,29889,5399,278,12247,1250,4366,11369,29901,450,12247,1250,4366,11369,338,5491,5982,373,278,5637,3377,29889,5399,278,11369,322,1207,1854,372,29915,29879,451,5625,4063,470,23819,29889,13,29946,29889,5399,363,19786,470,316,1182,275,29901,360,504,470,316,1182,275,508,18414,5987,373,278,12247,322
,4556,5626,411,278,1250,4366,29889,4803,419,13120,4799,304,5941,714,738,19786,470,316,1182,275,515,278,12247,322,278,12247,1250,4366,11369,29889,13,29945,29889,5399,363,23819,470,5625,4063,6611,29901,960,263,1820,338,23819,470,5625,4063,29892,372,508,4556,5626,411,278,12247,1250,4366,29889,5399,1269,1820,322,1207,1854,896,29915,276,599,11592,368,10959,304,278,12247,29889,13,29953,29889,3967,263,1422,12247,29901,960,5642,310,278,2038,6851,664,29892,372,29915,29879,1950,393,278,12247,3528,338,5625,4063,29889,3967,773,263,1422,12247,304,1074,565,278,2228,338,11527,29889,13,13,3644,5642,310,1438,6851,664,29892,372,29915,29879,1950,393,278,2228,338,411,278,19022,29915,29879,12837,322,451,411,278,12247,470,7047,29889,512,445,1206,29892,366,1122,817,304,6958,360,514,2304,363,4340,18872,29889,13,13,29902,4966,445,6911,29991,2803,592,1073,565,366,505,738,916,5155,29889,2,1,917,29901,274,6552,281,7810,29892,848,29899,19672,29892,25209,10855,13,13,16492,29901,1128,304,7868,385,20215,7196,304,263,422,17801,297,23678,29973,13,13,29902,505,385,20215,7196,310,3618,393,306,864,304,7868,304,263,422,17801,297,23678,29889,306,505,1898,773,278,421,6913,4435,29952,2875,310,278,422,17801,29892,541,278,4333,338,451,1641,4784,746,278,3618,526,2715,470,6206,29889,13,13,10605,338,590,1060,23956,775,29901,13,13,29905,463,29912,401,29913,13,29966,26628,25085,4435,10724,9270,1619,27928,7196,29913,1013,13,1678,529,26628,29889,2001,6733,29958,13,4706,529,1469,6733,29958,13,9651,529,1626,7445,3992,10724,9270,4408,29913,4681,13,4706,1533,1469,6733,29958,13,1678,1533,26628,29889,2001,6733,29958,13,829,26628,29958,13,29905,355,29912,401,29913,13,13,2855,1244,338,590,315,29937,775,29901,13,13,29905,463,29912,401,29913,13,3597,20215,7196,29966,3421,2061,29958,1619,27928,7196,426,679,29936,731,29936,500,13,13,458,2023,13,13,3597,1780,3462,2061,29898,3421,2061,5446,29897,13,29912,13,1678,1619,27928,7196,29889,2528,29898,5415,416,13,29913,13,13,3597,1780,15154,2061,29898,3421,2061,5446,29897,13,29912,13,1
678,1619,27928,7196,29889,15941,29898,5415,416,13,29913,13,29905,355,29912,401,29913,13,13,29902,505,884,1898,773,278,421,27928,7196,29952,297,278,421,1469,2677,29952,310,278,422,17801,29892,541,393,884,947,451,664,29889,13,13,6028,4856,3113,1371,592,4377,714,825,306,626,2599,2743,29973,13,13,22550,29901,887,817,304,731,278,421,6913,4435,29952,2875,310,278,421,26628,29952,304,278,421,27928,7196,29952,322,451,278,421,9270,1412,13,13,10605,338,278,24114,1060,23956,775,29901,13,13,29905,463,29912,401,29913,13,29966,26628,25085,4435,10724,9270,1619,27928,7196,29913,1013,13,1678,529,26628,29889,2001,6733,29958,13,4706,529,1469,6733,29958,13,9651,529,1626,7445,3992,10724,9270,4408,29913,4681,13,4706,1533,1469,6733,29958,13,1678,1533,26628,29889,2001,6733,29958,13,829,26628,29958,13,29905,355,29912,401,29913,13,13,17351,29892,1207,1854,393,596,421,3421,27928,7196,29952,338,6284,16601,322,393,366,526,13271,372,5149,29889,13],"output_log_probs":[0.0,...0.0],"sequence_length":1000}

Moreover, I'm able to query the infer endpoint successfully:

curl -i -X POST localhost:8000/v2/models/tensorrt_llm/infer -d \
'{"inputs": [{"name" : "input_ids", "shape" : [ 1, 2 ], "datatype" : "INT32", "data" : [1724,338] }, {"name" : "input_lengths", "shape" : [1, 1], "datatype" : "INT32", "data" : [2] }, {"name" : "request_output_len", "shape" : [1, 1], "datatype" : "INT32", "data" : [1000] }]}'

with response:

{"model_name":"tensorrt_llm","model_version":"1","outputs":[{"name":"output_ids","datatype":"INT32","shape":[1,1,1000],"data":[278,6437,310,278,11781,29973,1724,526,278,1820,5626,304,367,15648,29973,1724,526,278,7429,714,26807,470,1602,12112,304,367,1754,29973,13,13,29906,29889,13355,1598,278,1820,380,1296,8948,414,29901,11644,526,278,1820,2305,393,817,304,367,9701,297,278,11781,29973,1724,526,1009,16178,322,5544,747,9770,29973,1724,526,1009,20017,322,3736,1103,3145,29973,13,13,29941,29889,5953,837,457,278,11781,3402,29901,2811,278,11781,367,11595,470,1871,284,29973,2811,372,367,263,24329,29899,3293,11781,470,263,10679,29899,3293,11781,29973,1724,338,278,8210,3233,310,883,2877,322,3829,363,278,11781,29973,13,13,29946,29889,14542,852,263,13907,4423,29901,6804,674,278,11781,367,4934,29973,1317,278,4423,5948,15579,322,25561,363,599,472,841,311,267,29973,13,13,29945,29889,2661,370,1674,263,2821,946,8395,29901,1724,2702,23820,674,367,15648,2645,278,11781,29973,1724,526,278,7429,714,26807,470,1602,12112,304,367,1754,29973,1724,526,278,1820,3291,304,367,10664,29973,13,13,29953,29889,3789,263,931,4046,29901,1128,1472,674,278,11781,1833,29973,1724,338,278,8210,3309,310,931,363,278,11781,29973,13,13,29955,29889,13355,1598,738,5181,17279,29901,1724,17279,470,2472,674,367,4312,2645,278,11781,29973,2811,738,2198,800,470,1361,17718,367,4312,29973,13,13,29947,29889,14542,852,263,13907,931,29901,1724,338,278,1900,931,363,278,11781,29973,2811,599,472,841,311,267,367,3625,472,393,931,29973,13,13,29929,29889,2661,370,1674,263,2821,12084,3814,29901,1128,674,278,11781,367,18043,29973,2811,372,367,297,2022,29892,3025,4863,21362,29892,470,3025,9008,29973,1724,338,278,8210,12084,1158,363,278,11781,29973,13,13,29896,29900,29889,13355,1598,738,5181,1101,29899,786,8820,29901,1724,8820,817,304,367,4586,1156,278,11781,29973,11644,338,14040,363,5622,1438,8820,29973,1724,526,278,5335,24210,363,1438,8820,29973,13,13,2059,1494,1438,6576,29892,366,508,9801,393,596,5870,886,526,1532,29899,572,11310,2
9892,3234,573,29892,322,11828,29889,2,1,8778,1405,10130,1405,12157,719,4785,1575,1405,450,1528,280,310,278,2087,19188,297,4649,19211,1858,9450,13,1576,1528,280,310,278,2087,19188,297,4649,19211,1858,9450,13,1576,1494,338,385,13661,7314,515,2259,317,29889,438,433,600,29892,22760,11289,6673,322,9087,4978,12139,472,323,29902,6344,29899,22245,29943,29901,13,8015,19211,18987,338,263,4280,322,4100,1889,29892,322,18161,25228,943,1708,263,12187,6297,297,19912,1009,13154,6176,1009,3240,19211,14433,29889,13,2887,278,24354,1045,12392,12623,22170,3240,19211,5046,29892,278,817,363,6047,3240,19211,18987,756,2360,1063,901,24795,29889,7579,304,263,7786,18994,491,278,13377,2955,4649,19211,8907,29892,29871,29955,29945,29995,310,23035,505,451,7160,3307,363,3240,19211,29892,322,29871,29953,29900,29995,310,3240,533,267,19104,373,10307,14223,363,263,13638,310,1009,17869,29889,13,12881,273,1455,25228,943,508,1371,1009,13154,23624,278,4280,1907,310,3240,19211,18987,491,13138,7333,1891,9848,29892,27032,263,3464,310,13258,358,3987,29892,322,19912,13154,7952,373,5702,411,1009,1472,29899,8489,18161,14433,29889,13,10605,526,777,1820,5837,393,18161,25228,943,508,1371,1009,13154,411,3240,19211,18987,29901,13,29896,29889,4007,404,292,18161,1303,3335,29901,2087,1730,943,508,1371,13154,24809,1009,1857,18161,6434,29892,3704,1009,17869,29892,1518,11259,29892,21608,29892,322,2553,1372,29892,304,8161,565,896,526,373,5702,363,263,25561,3240,19211,29889,13,29906,29889,21605,1855,4695,14433,29901,2087,1730,943,508,1371,13154,731,1855,4695,3240,19211,14433,29892,1316,408,278,5046,896,864,304,3240,533,29892,1009,7429,301,7004,1508,297,3240,19211,29892,322,920,1568,896,817,304,4078,304,6176,1009,14433,29889,13,29941,29889,10682,292,263,3240,19211,17869,3814,29901,2087,1730,943,508,1371,13154,2693,263,15171,6270,3240,19211,17869,3814,393,7805,263,10296,310,8974,29892,1316,408,10307,14223,29892,282,5580,29892,3240,19211,15303,29892,322,916,13258,1860,29889,13,29946,29889,2315,6751,13258,358,12045,29901,2087,173
0,943,508,1371,13154,10933,13258,358,12045,491,13138,263,3464,310,13258,358,3987,393,526,8210,363,1009,5046,29892,12045,20341,749,29892,322,3240,19211,14433,29889,13,29945,29889,9133,4821,373,17696,2304,29901,2087,1730,943,508,3867,373,17696,2304,322,27323,304,1371,13154,7952,373,5702,411,1009,3240,19211,14433,29892,322,1207,10365,1860,408,4312,29889,13,12881,273,1455,25228,943,1708,263,12187,6297,297,19912,1009,13154,6176,1009,3240,19211,14433,29892,322,491,13138,7333,1891,9848,29892,27032,263,3464,310,13258,358,3987,29892,322,19912,13154,7952,373,5702,411,1009,1472,29899,8489,18161,14433,29892,25228,943,508,1371,1009,13154,6176,263,25561,322,11592,3240,19211,29889,13,8176,3192,29901,2087,19188,29892,4231,273,1455,2087,19188,29892,4649,19211,29892,4649,19211,1858,9450,13,30009,450,1528,280,310,278,2087,19188,297,4649,19211,1858,9450,2,1,8778,1405,10130,1405,12157,719,4785,1575,1405,450,1528,280,310,278,2087,19188,297,4649,19211,1858,9450,13,1576,1528,280,310,278,2087,19188,297,4649,19211,1858,9450,13,1576,1494,338,385,13661,7314,515,2259]},{"name":"sequence_length","datatype":"INT32","shape":[1,1],"data":[1000]},{"name":"context_logits","datatype":"FP32","shape":[1,1,1],"data":[0.0]},{"name":"generation_logits","datatype":"FP32","shape":[1,1,1,1],"data":[0.0]},{"name":"output_log_probs","datatype":"FP32","shape":[1,1,1000],"data":[0.0,...,0.0]},{"name":"cum_log_probs","datatype":"FP32","shape":[1,1],"data":[0.0]}]}
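Unlike the generate request, the infer request that succeeds above states a shape and datatype for every tensor. A minimal Python sketch that builds the same body is shown below. The helper names `int32_tensor` and `build_infer_body` are hypothetical. The element-count check only illustrates the rule for infer requests that `data` must hold as many elements as `shape` declares; whether that is the check that fails inside the generate endpoint is not established in this issue.

```python
import json
from math import prod

def int32_tensor(name: str, shape: list, data: list) -> dict:
    """Describe one INT32 input tensor for an infer request (hypothetical helper)."""
    # The flat data list must contain exactly prod(shape) elements.
    if prod(shape) != len(data):
        raise ValueError(f"{name}: shape {shape} does not match {len(data)} elements")
    return {"name": name, "shape": shape, "datatype": "INT32", "data": data}

def build_infer_body(input_ids: list, output_len: int) -> dict:
    """Build the infer body used in this issue for a single request (batch of 1)."""
    return {
        "inputs": [
            int32_tensor("input_ids", [1, len(input_ids)], input_ids),
            int32_tensor("input_lengths", [1, 1], [len(input_ids)]),
            int32_tensor("request_output_len", [1, 1], [output_len]),
        ]
    }

body = build_infer_body([1724, 338], 1000)
print(json.dumps(body, indent=2))
```

The resulting JSON can be passed to curl with `-d` exactly as in the infer command above.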

I assume it's something simple and that I'm querying the endpoint the wrong way, but I can't find a solution. Any help would be appreciated.

Expected behavior

The generate endpoint returns correct results.
The error message is more meaningful.

actual behavior

The generate endpoint returns an error.

additional notes

PKaralupov · 2024-08-07T17:34:45Z

@schetlur-nv I also have the same problem

v-shobhit · 2024-09-17T18:03:37Z

+1 on facing this problem

bprus added the bug label Mar 8, 2024

byshiue assigned schetlur-nv Mar 13, 2024

v-shobhit mentioned this issue Sep 18, 2024

fix: usage of ReadDataFromJson in array tensors triton-inference-server/server#7624 (merged)

pskiran1 closed this as completed in triton-inference-server/server#7624 Oct 7, 2024
