
qwen2 vl gdq quantization fails with ERROR: Catch exception when Optimizing model: 'input' #159

Open
lijianxing123 opened this issue Jan 25, 2025 · 10 comments


@lijianxing123

lijianxing123 commented Jan 25, 2025

I built the input.json dataset following the tutorial and ran gdq quantization, but it fails with an error:
INFO: PyTorch version 2.4.0 available.
INFO: rkllm-toolkit version: 1.1.4
The argument `trust_remote_code` is to be used with Auto classes. It has no effect here and is ignored.
`Qwen2VLRotaryEmbedding` can now be fully parameterized by passing the model config through the `config` argument. All other arguments will be removed in v4.46
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 2.55it/s]
WARNING: rkllm-toolkit only exports the language model of Qwen2VL!
Building model: 100%|█████████████████████████████████████████████████████████████████████████████████████████| 399/399 [00:10<00:00, 37.93it/s]
ERROR: Catch exception when Optimizing model: 'input'
Build model failed!

Below is the quantization code:
```python
import os
from rkllm.api import RKLLM
from datasets import load_dataset
from transformers import AutoTokenizer
from tqdm import tqdm
import torch
from torch import nn

modelpath = "../../../Qwen2-VL-2B-Instruct"
savepath = './Qwen2-VL-2B-Instruct.rkllm'
llm = RKLLM()

ret = llm.load_huggingface(model=modelpath, device='cuda')
if ret != 0:
    print('Load model failed!')
    exit(ret)

dataset = 'data/inputs.json'

qparams = None
ret = llm.build(do_quantization=True, optimization_level=1, quantized_dtype='w4a16_g64',
                quantized_algorithm='gdq', target_platform='rk3576', num_npu_core=2,
                extra_qparams=qparams, dataset=dataset)

if ret != 0:
    print('Build model failed!')
    exit(ret)

ret = llm.export_rkllm(savepath)
if ret != 0:
    print('Export model failed!')
    exit(ret)
```

@waydong
Collaborator

waydong commented Feb 2, 2025

Hi, try changing quantized_algorithm to normal.
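
For reference, that would change only the `quantized_algorithm` argument in the build call from the script above:

```python
# Same build call as in the original script, with only the algorithm switched
ret = llm.build(do_quantization=True, optimization_level=1, quantized_dtype='w4a16_g64',
                quantized_algorithm='normal', target_platform='rk3576',
                num_npu_core=2, extra_qparams=qparams, dataset=dataset)
```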

@lijianxing123
Author

What's the difference between them? What's the principle behind gdq, and does gdq give better quantization results? Thanks.

@waydong
Collaborator

waydong commented Feb 10, 2025

> What's the difference between them? What's the principle behind gdq, and does gdq give better quantization results? Thanks.

gdq fine-tunes the quantization parameters through training, so the accuracy is higher.
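
A minimal sketch of that idea in plain PyTorch (illustrative only, not rkllm's actual implementation; `round_ste` and `fake_quant` are hypothetical helpers): the per-group quantization scale is treated as a learnable parameter and tuned by gradient descent to minimize reconstruction error, with a straight-through estimator carrying gradients past the rounding step.

```python
import torch

def round_ste(x):
    # straight-through estimator: round in the forward pass,
    # identity gradient in the backward pass
    return x + (x.round() - x).detach()

def fake_quant(w, scale):
    # symmetric 4-bit quantization (cf. w4a16), integer range [-8, 7]
    q = torch.clamp(round_ste(w / scale), -8, 7)
    return q * scale

w = torch.randn(64)                                   # one weight group (cf. g64)
scale = (w.abs().max() / 7).clone().requires_grad_()  # plain min-max init
opt = torch.optim.Adam([scale], lr=1e-3)

for _ in range(200):
    loss = (fake_quant(w, scale) - w).pow(2).mean()   # reconstruction error
    opt.zero_grad()
    loss.backward()
    opt.step()
```

A normal (min-max) scheme would stop at the initialization; gdq keeps training the quantization parameters, which is why its accuracy is typically higher.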

@lijianxing123
Author

Then why can't it be used right now?

@lijianxing123
Author

Is there a bug?

@waydong
Collaborator

waydong commented Feb 10, 2025

> Then why can't it be used right now?

Multimodal models use input_embed for the quantization dataset, which gdq does not support yet. You can run with normal for now.
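
To illustrate what that means (a hedged sketch: the `input_embed` field name follows this reply, and the exact JSON schema rkllm-toolkit expects may differ), multimodal calibration samples are embedding tensors rather than token ids, because image features are merged with text at the embedding layer:

```python
import json
import torch
from transformers import AutoTokenizer, Qwen2VLForConditionalGeneration

modelpath = "../../../Qwen2-VL-2B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(modelpath)
model = Qwen2VLForConditionalGeneration.from_pretrained(modelpath)

with torch.no_grad():
    ids = tokenizer("Describe this image.", return_tensors="pt").input_ids
    embeds = model.get_input_embeddings()(ids)  # [1, seq_len, hidden_size]

# Embedding-level samples like this are what the multimodal pipeline feeds
# the toolkit; gdq currently expects token-level inputs, hence the failure.
with open("data/inputs_embed.json", "w") as f:
    json.dump([{"input_embed": embeds.squeeze(0).tolist()}], f)
```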

@lijianxing123
Author

When is support expected?

@wohaiaini

Are there any other quantization algorithms that can improve response speed?

@waydong
Collaborator

waydong commented Feb 25, 2025

> Are there any other quantization algorithms that can improve response speed?

Fixing the clock frequencies can improve performance. Frequency-fixing scripts: https://github.com/airockchip/rknn-llm/tree/main/scripts

@wohaiaini

> > Are there any other quantization algorithms that can improve response speed?
>
> Fixing the clock frequencies can improve performance. Frequency-fixing scripts: https://github.com/airockchip/rknn-llm/tree/main/scripts

Thank you very much!
