
Infer failed: Unable to parse 'data': Shape does not match true shape of 'data' field in generate endpoint #369

Closed
bprus opened this issue Mar 8, 2024 · 2 comments · Fixed by triton-inference-server/server#7624
Labels: bug (Something isn't working)


bprus commented Mar 8, 2024

System Info

  • CPU architecture: x86_64
  • GPU: NVIDIA A10 24GB
  • TensorRT-LLM: v0.8.0 (docker build via make -C docker release_build CUDA_ARCHS="86-real")
  • Triton Inference Server: r24.02 (docker from NGC)
  • OS: Ubuntu 22.04

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

I follow the official examples for the Llama model: https://github.com/NVIDIA/TensorRT-LLM/tree/v0.8.0/examples/llama
I'm able to set everything up, and everything runs smoothly when using the ensemble model:

curl -X POST localhost:8000/v2/models/ensemble/generate -d '{"text_input": "What is", "max_tokens": 1000}'

and the response is:

{"context_logits":0.0,"cum_log_probs":0.0,"generation_logits":0.0,"model_name":"ensemble","model_version":"1","output_log_probs":[0.0,...,0.0],"sequence_end":false,"sequence_id":0,"sequence_start":false,"text_output":"the purpose of the meeting? What are the key issues to be discussed? What are the desired outcomes or decisions to be made?\n\n2. Identify the key stakeholders: Who are the key people that need to be involved in the meeting? What are their roles and responsibilities? What are their interests and perspectives?\n\n3. Determine the meeting format: Will the meeting be formal or informal? Will it be a presentation-style meeting or a discussion-style meeting? What is the appropriate level of formality and structure for the meeting?\n\n4. Choose a suitable location: Where will the meeting be held? Is the location easily accessible and comfortable for all attendees?\n\n5. Establish a clear agenda: What specific topics will be discussed during the meeting? What are the desired outcomes or decisions to be made? What are the key points to be covered?\n\n6. Set a time limit: How long will the meeting last? What is the appropriate length of time for the meeting?\n\n7. Identify any necessary materials: What materials or information will be needed during the meeting? Will any presentations or handouts be needed?\n\n8. Choose a suitable time: What is the best time for the meeting? Will all attendees be available at that time?\n\n9. Establish a clear communication plan: How will the meeting be conducted? Will it be in person, via video conference, or via phone? What is the appropriate communication method for the meeting?\n\n10. Identify any necessary follow-up actions: What actions need to be taken after the meeting? Who is responsible for taking these actions? What are the timelines for these actions?\n\nBy following these steps, you can ensure that your meetings are well-planned, productive, and effective."}

I'm also able to run the preprocessing model:

curl -X POST localhost:8000/v2/models/preprocessing/generate -d '{"QUERY": "What is", "REQUEST_OUTPUT_LEN": 1000}'

and the response is:

{"BAD_WORDS_IDS":[],"EMBEDDING_BIAS":[],"INPUT_ID":[1724,338],"OUT_END_ID":2,"OUT_PAD_ID":2,"REQUEST_INPUT_LEN":2,"REQUEST_OUTPUT_LEN":1000,"STOP_WORDS_IDS":[],"model_name":"preprocessing","model_version":"1"}

Then, when I try to query the tensorrt_llm model directly:

curl -i -X POST localhost:8000/v2/models/tensorrt_llm/generate -d '{"input_ids": [1724, 338], "input_lengths": 2, "request_output_len": 1000}'

I get an error:

{"error":"Unable to parse 'data': Shape does not match true shape of 'data' field"}

Triton runs with debug logging enabled, and the logs contain no further information:

I0308 13:45:35.029834 96 http_server.cc:4523] HTTP request: 2 /v2/models/tensorrt_llm/generate
I0308 13:45:35.029876 96 model_lifecycle.cc:336] GetModel() 'tensorrt_llm' version -1
I0308 13:45:35.029884 96 model_lifecycle.cc:294] VersionStates() 'tensorrt_llm'
I0308 13:45:35.029980 96 model_lifecycle.cc:336] GetModel() 'tensorrt_llm' version -1
I0308 13:45:35.030004 96 http_server.cc:3241] [request id: <id_unknown>] Infer failed: Unable to parse 'data': Shape does not match true shape of 'data' field

I tried many different variations of the request, wrapping values in lists, etc., without any success.

What I found is that it works if input_ids contains only one element:

curl -i -X POST localhost:8000/v2/models/tensorrt_llm/generate -d '{"input_ids": [1724], "input_lengths": 1, "request_output_len": 1000}'

with response:

{"context_logits":0.0,"cum_log_probs":0.0,"generation_logits":0.0,"model_name":"tensorrt_llm","model_version":"1","output_ids":[1,13,13,29902,505,263,1108,411,590,6601,29889,739,29915,29879,263,360,514,512,1028,381,265,29871,29896,29945,29871,29945,29900,29900,29900,3652,19022,322,372,29915,29879,1048,29871,29906,2440,2030,29889,450,1108,338,393,278,12247,1250,4366,338,451,1985,29889,306,505,1898,278,9670,18835,1251,11427,6576,763,10092,1259,278,12247,11785,2264,29892,8454,278,12247,6055,29892,322,1584,15270,278,12247,7156,29892,541,3078,2444,304,664,29889,13,13,29902,505,884,1898,773,263,1422,12247,3578,7047,29892,541,393,2086,1838,29915,29873,664,29889,306,505,884,7120,278,350,25925,6055,322,278,12247,6055,297,278,350,25925,526,731,304,376,2951,1642,306,505,884,1898,773,263,1422,3081,2752,29892,541,393,2086,1838,29915,29873,664,29889,13,13,29902,626,472,590,12309,29915,29879,1095,322,306,1016,29915,29873,1073,825,304,437,29889,3529,1371,592,29889,13,13,5634,13,13,18567,727,29991,8221,304,8293,393,596,19022,29915,29879,12247,1250,4366,338,451,1985,29889,16564,373,278,2472,366,29915,345,4944,29892,372,2444,763,263,12837,2228,29889,2266,526,777,7037,6851,366,1033,1018,29901,13,13,29896,29889,5399,278,12247,1826,2801,29901,8561,1854,278,12247,1826,2801,338,11592,368,18665,3192,297,322,451,23819,29889,960,372,29915,29879,23819,29892,1018,18665,3460,372,297,1449,29889,13,29906,29889,5399,278,12247,8525,21387,29901,450,12247,8525,21387,338,14040,363,18750,5367,3081,322,848,304,278,12247,29889,960,278,8525,21387,338,5625,4063,470,23819,29892,372,508,4556,5626,411,278,12247,1250,4366,29889,5399,278,8525,21387,322,1207,1854,372,29915,29879,11592,368,6631,304,278,5637,3377,29889,13,29941,29889,5399,278,12247,1250,4366,11369,29901,450,12247,1250,4366,11369,338,5491,5982,373,278,5637,3377,29889,5399,278,11369,322,1207,1854,372,29915,29879,451,5625,4063,470,23819,29889,13,29946,29889,5399,363,19786,470,316,1182,275,29901,360,504,470,316,1182,275,508,18414,5987,373,278,12247,322,4556,5626,411,278,1250,4366,29889,4803,419,13120,4799,304,5941,714,738,19786,470,316,1182,275,515,278,12247,322,278,12247,1250,4366,11369,29889,13,29945,29889,5399,363,23819,470,5625,4063,6611,29901,960,263,1820,338,23819,470,5625,4063,29892,372,508,4556,5626,411,278,12247,1250,4366,29889,5399,1269,1820,322,1207,1854,896,29915,276,599,11592,368,10959,304,278,12247,29889,13,29953,29889,3967,263,1422,12247,29901,960,5642,310,278,2038,6851,664,29892,372,29915,29879,1950,393,278,12247,3528,338,5625,4063,29889,3967,773,263,1422,12247,304,1074,565,278,2228,338,11527,29889,13,13,3644,5642,310,1438,6851,664,29892,372,29915,29879,1950,393,278,2228,338,411,278,19022,29915,29879,12837,322,451,411,278,12247,470,7047,29889,512,445,1206,29892,366,1122,817,304,6958,360,514,2304,363,4340,18872,29889,13,13,29902,4966,445,6911,29991,2803,592,1073,565,366,505,738,916,5155,29889,2,1,917,29901,274,6552,281,7810,29892,848,29899,19672,29892,25209,10855,13,13,16492,29901,1128,304,7868,385,20215,7196,304,263,422,17801,297,23678,29973,13,13,29902,505,385,20215,7196,310,3618,393,306,864,304,7868,304,263,422,17801,297,23678,29889,306,505,1898,773,278,421,6913,4435,29952,2875,310,278,422,17801,29892,541,278,4333,338,451,1641,4784,746,278,3618,526,2715,470,6206,29889,13,13,10605,338,590,1060,23956,775,29901,13,13,29905,463,29912,401,29913,13,29966,26628,25085,4435,10724,9270,1619,27928,7196,29913,1013,13,1678,529,26628,29889,2001,6733,29958,13,4706,529,1469,6733,29958,13,9651,529,1626,7445,3992,10724,9270,4408,29913,4681,13,4706,1533,1469,6733,29958,13,1678,
1533,26628,29889,2001,6733,29958,13,829,26628,29958,13,29905,355,29912,401,29913,13,13,2855,1244,338,590,315,29937,775,29901,13,13,29905,463,29912,401,29913,13,3597,20215,7196,29966,3421,2061,29958,1619,27928,7196,426,679,29936,731,29936,500,13,13,458,2023,13,13,3597,1780,3462,2061,29898,3421,2061,5446,29897,13,29912,13,1678,1619,27928,7196,29889,2528,29898,5415,416,13,29913,13,13,3597,1780,15154,2061,29898,3421,2061,5446,29897,13,29912,13,1678,1619,27928,7196,29889,15941,29898,5415,416,13,29913,13,29905,355,29912,401,29913,13,13,29902,505,884,1898,773,278,421,27928,7196,29952,297,278,421,1469,2677,29952,310,278,422,17801,29892,541,393,884,947,451,664,29889,13,13,6028,4856,3113,1371,592,4377,714,825,306,626,2599,2743,29973,13,13,22550,29901,887,817,304,731,278,421,6913,4435,29952,2875,310,278,421,26628,29952,304,278,421,27928,7196,29952,322,451,278,421,9270,1412,13,13,10605,338,278,24114,1060,23956,775,29901,13,13,29905,463,29912,401,29913,13,29966,26628,25085,4435,10724,9270,1619,27928,7196,29913,1013,13,1678,529,26628,29889,2001,6733,29958,13,4706,529,1469,6733,29958,13,9651,529,1626,7445,3992,10724,9270,4408,29913,4681,13,4706,1533,1469,6733,29958,13,1678,1533,26628,29889,2001,6733,29958,13,829,26628,29958,13,29905,355,29912,401,29913,13,13,17351,29892,1207,1854,393,596,421,3421,27928,7196,29952,338,6284,16601,322,393,366,526,13271,372,5149,29889,13],"output_log_probs":[0.0,...0.0],"sequence_length":1000}

Moreover, I'm able to query the infer endpoint successfully, for example:

curl -i -X POST localhost:8000/v2/models/tensorrt_llm/infer -d \
'{"inputs": [{"name" : "input_ids", "shape" : [ 1, 2 ], "datatype" : "INT32", "data" : [1724,338] }, {"name" : "input_lengths", "shape" : [1, 1], "datatype" : "INT32", "data" : [2] }, {"name" : "request_output_len", "shape" : [1, 1], "datatype" : "INT32", "data" : [1000] }]}'

with response:

{"model_name":"tensorrt_llm","model_version":"1","outputs":[{"name":"output_ids","datatype":"INT32","shape":[1,1,1000],"data":[278,6437,310,278,11781,29973,1724,526,278,1820,5626,304,367,15648,29973,1724,526,278,7429,714,26807,470,1602,12112,304,367,1754,29973,13,13,29906,29889,13355,1598,278,1820,380,1296,8948,414,29901,11644,526,278,1820,2305,393,817,304,367,9701,297,278,11781,29973,1724,526,1009,16178,322,5544,747,9770,29973,1724,526,1009,20017,322,3736,1103,3145,29973,13,13,29941,29889,5953,837,457,278,11781,3402,29901,2811,278,11781,367,11595,470,1871,284,29973,2811,372,367,263,24329,29899,3293,11781,470,263,10679,29899,3293,11781,29973,1724,338,278,8210,3233,310,883,2877,322,3829,363,278,11781,29973,13,13,29946,29889,14542,852,263,13907,4423,29901,6804,674,278,11781,367,4934,29973,1317,278,4423,5948,15579,322,25561,363,599,472,841,311,267,29973,13,13,29945,29889,2661,370,1674,263,2821,946,8395,29901,1724,2702,23820,674,367,15648,2645,278,11781,29973,1724,526,278,7429,714,26807,470,1602,12112,304,367,1754,29973,1724,526,278,1820,3291,304,367,10664,29973,13,13,29953,29889,3789,263,931,4046,29901,1128,1472,674,278,11781,1833,29973,1724,338,278,8210,3309,310,931,363,278,11781,29973,13,13,29955,29889,13355,1598,738,5181,17279,29901,1724,17279,470,2472,674,367,4312,2645,278,11781,29973,2811,738,2198,800,470,1361,17718,367,4312,29973,13,13,29947,29889,14542,852,263,13907,931,29901,1724,338,278,1900,931,363,278,11781,29973,2811,599,472,841,311,267,367,3625,472,393,931,29973,13,13,29929,29889,2661,370,1674,263,2821,12084,3814,29901,1128,674,278,11781,367,18043,29973,2811,372,367,297,2022,29892,3025,4863,21362,29892,470,3025,9008,29973,1724,338,278,8210,12084,1158,363,278,11781,29973,13,13,29896,29900,29889,13355,1598,738,5181,1101,29899,786,8820,29901,1724,8820,817,304,367,4586,1156,278,11781,29973,11644,338,14040,363,5622,1438,8820,29973,1724,526,278,5335,24210,363,1438,8820,29973,13,13,2059,1494,1438,6576,29892,366,508,9801,393,596,5870,886,526,1532,29899,572,11310,29892,3234,573,29892,322,11828,29889,2,1,8778,1405,10130,1405,12157,719,4785,1575,1405,450,1528,280,310,278,2087,19188,297,4649,19211,1858,9450,13,1576,1528,280,310,278,2087,19188,297,4649,19211,1858,9450,13,1576,1494,338,385,13661,7314,515,2259,317,29889,438,433,600,29892,22760,11289,6673,322,9087,4978,12139,472,323,29902,6344,29899,22245,29943,29901,13,8015,19211,18987,338,263,4280,322,4100,1889,29892,322,18161,25228,943,1708,263,12187,6297,297,19912,1009,13154,6176,1009,3240,19211,14433,29889,13,2887,278,24354,1045,12392,12623,22170,3240,19211,5046,29892,278,817,363,6047,3240,19211,18987,756,2360,1063,901,24795,29889,7579,304,263,7786,18994,491,278,13377,2955,4649,19211,8907,29892,29871,29955,29945,29995,310,23035,505,451,7160,3307,363,3240,19211,29892,322,29871,29953,29900,29995,310,3240,533,267,19104,373,10307,14223,363,263,13638,310,1009,17869,29889,13,12881,273,1455,25228,943,508,1371,1009,13154,23624,278,4280,1907,310,3240,19211,18987,491,13138,7333,1891,9848,29892,27032,263,3464,310,13258,358,3987,29892,322,19912,13154,7952,373,5702,411,1009,1472,29899,8489,18161,14433,29889,13,10605,526,777,1820,5837,393,18161,25228,943,508,1371,1009,13154,411,3240,19211,18987,29901,13,29896,29889,4007,404,292,18161,1303,3335,29901,2087,1730,943,508,1371,13154,24809,1009,1857,18161,6434,29892,3704,1009,17869,29892,1518,11259,29892,21608,29892,322,2553,1372,29892,304,8161,565,896,526,373,5702,363,263,25561,3240,19211,29889,13,29906,29889,21605,1855,4695,14433,29901,2087,1730,943,508,1371,13154,731,1855,4695,3240,19211,14433,29892,1316,408
,278,5046,896,864,304,3240,533,29892,1009,7429,301,7004,1508,297,3240,19211,29892,322,920,1568,896,817,304,4078,304,6176,1009,14433,29889,13,29941,29889,10682,292,263,3240,19211,17869,3814,29901,2087,1730,943,508,1371,13154,2693,263,15171,6270,3240,19211,17869,3814,393,7805,263,10296,310,8974,29892,1316,408,10307,14223,29892,282,5580,29892,3240,19211,15303,29892,322,916,13258,1860,29889,13,29946,29889,2315,6751,13258,358,12045,29901,2087,1730,943,508,1371,13154,10933,13258,358,12045,491,13138,263,3464,310,13258,358,3987,393,526,8210,363,1009,5046,29892,12045,20341,749,29892,322,3240,19211,14433,29889,13,29945,29889,9133,4821,373,17696,2304,29901,2087,1730,943,508,3867,373,17696,2304,322,27323,304,1371,13154,7952,373,5702,411,1009,3240,19211,14433,29892,322,1207,10365,1860,408,4312,29889,13,12881,273,1455,25228,943,1708,263,12187,6297,297,19912,1009,13154,6176,1009,3240,19211,14433,29892,322,491,13138,7333,1891,9848,29892,27032,263,3464,310,13258,358,3987,29892,322,19912,13154,7952,373,5702,411,1009,1472,29899,8489,18161,14433,29892,25228,943,508,1371,1009,13154,6176,263,25561,322,11592,3240,19211,29889,13,8176,3192,29901,2087,19188,29892,4231,273,1455,2087,19188,29892,4649,19211,29892,4649,19211,1858,9450,13,30009,450,1528,280,310,278,2087,19188,297,4649,19211,1858,9450,2,1,8778,1405,10130,1405,12157,719,4785,1575,1405,450,1528,280,310,278,2087,19188,297,4649,19211,1858,9450,13,1576,1528,280,310,278,2087,19188,297,4649,19211,1858,9450,13,1576,1494,338,385,13661,7314,515,2259]},{"name":"sequence_length","datatype":"INT32","shape":[1,1],"data":[1000]},{"name":"context_logits","datatype":"FP32","shape":[1,1,1],"data":[0.0]},{"name":"generation_logits","datatype":"FP32","shape":[1,1,1,1],"data":[0.0]},{"name":"output_log_probs","datatype":"FP32","shape":[1,1,1000],"data":[0.0,...,0.0]},{"name":"cum_log_probs","datatype":"FP32","shape":[1,1],"data":[0.0]}]}

I guess it's something simple and I'm querying the endpoint in the wrong way, but I really can't find a solution. Any help would be appreciated.

Expected behavior

The generate endpoint returns correct results.
The error message is more meaningful.

Actual behavior

The generate endpoint throws an error.

Additional notes

bprus added the bug label on Mar 8, 2024
PKaralupov commented

@schetlur-nv I also have the same problem.

v-shobhit commented

+1 on facing this problem.
