[Model] Refactoring of MiniCPM-V and add MiniCPM-o-2.6 support for vLLM #12069
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can do one of these:
Really appreciate the effort you have planned for this PR!
It would be great if you could share some design decisions for these two items as an RFC (or two separate RFCs) first, before we proceed with implementation. We (the vLLM team) are also thinking about how we want to support multimodal output and a streaming/realtime API on vLLM, so it's probably the best time for us to discuss these items!
Thank you for the suggestion! I'll start these two RFCs tomorrow.
@DarkLight1337 I think I might need some help verifying LoRA support. Should I make any changes for it?
@jeejeelee can help with this. Please keep in mind though that currently LoRA is only supported for the language part of multi-modal models. |
Signed-off-by: hzh <[email protected]>
Signed-off-by: hzh <[email protected]>
…tended design (vllm-project#11672) Signed-off-by: Sungjae Lee <[email protected]> Signed-off-by: hzh <[email protected]>
…ect#11921) Signed-off-by: shaochangxu.scx <[email protected]> Co-authored-by: shaochangxu.scx <[email protected]> Signed-off-by: hzh <[email protected]>
…ject#11934) Signed-off-by: DarkLight1337 <[email protected]> Signed-off-by: hzh <[email protected]>
…#11951) Signed-off-by: DarkLight1337 <[email protected]> Signed-off-by: hzh <[email protected]>
Signed-off-by: NickLucche <[email protected]> Signed-off-by: hzh <[email protected]>
Signed-off-by: Isotr0py <[email protected]> Co-authored-by: Isotr0py <[email protected]> Signed-off-by: hzh <[email protected]>
Signed-off-by: Roger Wang <[email protected]> Signed-off-by: hzh <[email protected]>
Signed-off-by: Rafael Vasquez <[email protected]> Signed-off-by: hzh <[email protected]>
Signed-off-by: Isotr0py <[email protected]> Co-authored-by: Cyrus Leung <[email protected]> Signed-off-by: hzh <[email protected]>
…roject#11100) Signed-off-by: Akshat Tripathi <[email protected]> Signed-off-by: Oleg Mosalov <[email protected]> Signed-off-by: Jee Jee Li <[email protected]> Co-authored-by: Oleg Mosalov <[email protected]> Co-authored-by: Jee Jee Li <[email protected]> Co-authored-by: Isotr0py <[email protected]> Signed-off-by: hzh <[email protected]>
Signed-off-by: hzh <[email protected]>
Signed-off-by: [email protected] <[email protected]> Signed-off-by: hzh <[email protected]>
…m-project#9685) Signed-off-by: Isotr0py <[email protected]> Co-authored-by: Cyrus Leung <[email protected]> Signed-off-by: hzh <[email protected]>
…project#11973) Signed-off-by: [email protected] <[email protected]> Signed-off-by: hzh <[email protected]>
Signed-off-by: hzh <[email protected]>
…project#11979) Signed-off-by: hzh <[email protected]>
Signed-off-by: Sungjae Lee <[email protected]> Signed-off-by: hzh <[email protected]>
This pull request has merge conflicts that must be resolved before it can be |
Signed-off-by: DarkLight1337 <[email protected]>
Overall LGTM and I have left a comment! I think this version is good to be merged!
Finally got it merged; thanks for the help from @ywang96 and @DarkLight1337! Next, I'll work on V1 support for MiniCPMV(O).
…LM (vllm-project#12069) Signed-off-by: hzh <[email protected]> Signed-off-by: Sungjae Lee <[email protected]> Signed-off-by: shaochangxu.scx <[email protected]> Signed-off-by: DarkLight1337 <[email protected]> Signed-off-by: NickLucche <[email protected]> Signed-off-by: Isotr0py <[email protected]> Signed-off-by: Roger Wang <[email protected]> Signed-off-by: Rafael Vasquez <[email protected]> Signed-off-by: Akshat Tripathi <[email protected]> Signed-off-by: Oleg Mosalov <[email protected]> Signed-off-by: Jee Jee Li <[email protected]> Signed-off-by: [email protected] <[email protected]> Signed-off-by: Yida Wu <[email protected]> Signed-off-by: Chenguang Li <[email protected]> Signed-off-by: youkaichao <[email protected]> Signed-off-by: Alex-Brooks <[email protected]> Signed-off-by: Chen Zhang <[email protected]> Signed-off-by: Harry Mellor <[email protected]> Signed-off-by: Shanshan Shen <[email protected]> Signed-off-by: elijah <[email protected]> Signed-off-by: Yikun <[email protected]> Signed-off-by: mgoin <[email protected]> Signed-off-by: Woosuk Kwon <[email protected]> Signed-off-by: Konrad Zawora <[email protected]> Signed-off-by: tjtanaa <[email protected]> Signed-off-by: wangxiyuan <[email protected]> Signed-off-by: Rui Qiao <[email protected]> Co-authored-by: Sungjae Lee <[email protected]> Co-authored-by: shaochangxu <[email protected]> Co-authored-by: shaochangxu.scx <[email protected]> Co-authored-by: Cyrus Leung <[email protected]> Co-authored-by: Nicolò Lucchesi <[email protected]> Co-authored-by: sixgod <[email protected]> Co-authored-by: Isotr0py <[email protected]> Co-authored-by: Roger Wang <[email protected]> Co-authored-by: Rafael Vasquez <[email protected]> Co-authored-by: Isotr0py <[email protected]> Co-authored-by: Cyrus Leung <[email protected]> Co-authored-by: Akshat Tripathi <[email protected]> Co-authored-by: Oleg Mosalov <[email protected]> Co-authored-by: Jee Jee Li <[email protected]> Co-authored-by: Avshalom Manevich <[email 
protected]> Co-authored-by: Robert Shaw <114415538+robertgshaw2-neuralmagic@users.noreply.github.com> Co-authored-by: Yangcheng Li <[email protected]> Co-authored-by: Siyuan Li <[email protected]> Co-authored-by: Concurrensee <[email protected]> Co-authored-by: Chenguang Li <[email protected]> Co-authored-by: youkaichao <[email protected]> Co-authored-by: Alex Brooks <[email protected]> Co-authored-by: Chen Zhang <[email protected]> Co-authored-by: Harry Mellor <[email protected]> Co-authored-by: Shanshan Shen <[email protected]> Co-authored-by: elijah <[email protected]> Co-authored-by: Yikun Jiang <[email protected]> Co-authored-by: Steve Luo <[email protected]> Co-authored-by: mgoin <[email protected]> Co-authored-by: Woosuk Kwon <[email protected]> Co-authored-by: Konrad Zawora <[email protected]> Co-authored-by: TJian <[email protected]> Co-authored-by: tjtanaa <[email protected]> Co-authored-by: wangxiyuan <[email protected]> Co-authored-by: maang-h <[email protected]> Co-authored-by: Elfie Guo <[email protected]> Co-authored-by: Rui Qiao <[email protected]> Co-authored-by: Roger Wang <[email protected]>
@HwwwwwwwH |
The adaptation is timely, but the lack of support for streaming multimodal inputs currently results in slower vision inference speeds. |
Here is the transformer implementation https://github.com/thanhnienyeumeo/minicpm-o |
How is omni-mode support implemented in vLLM? Multimodal support is a crucial part of this model. With transformers, the part below implements it.
You need to split and concatenate the video frames and audio chunks yourself and use a prompt like: [(<video>./</video>)(<audio>./</audio>)] * length
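As a rough illustration of the advice above, here is a minimal sketch that cuts a waveform into fixed-length chunks, pairs each chunk with one video frame, and repeats the placeholder pair once per pair. The helper names `split_audio` and `build_omni_prompt`, the 1-second chunk length, and the choice to truncate to the shorter of the two sequences are assumptions made for this example; they are not part of vLLM or of this PR, and how the frames and chunks are then passed to vLLM is not shown here.

```python
from typing import Any, List, Sequence, Tuple

# Placeholder strings taken from the prompt pattern quoted in the comment above.
VIDEO_PLACEHOLDER = "(<video>./</video>)"
AUDIO_PLACEHOLDER = "(<audio>./</audio>)"


def split_audio(samples: Sequence[float], sample_rate: int,
                chunk_seconds: float = 1.0) -> List[List[float]]:
    """Cut a mono waveform into consecutive chunks (hypothetical helper).

    The last chunk may be shorter than the others.
    """
    step = int(sample_rate * chunk_seconds)
    if step <= 0:
        raise ValueError("chunk length must cover at least one sample")
    return [list(samples[i:i + step]) for i in range(0, len(samples), step)]


def build_omni_prompt(frames: Sequence[Any], audio_chunks: Sequence[Any]
                      ) -> Tuple[str, List[Any], List[Any]]:
    """Pair frames with audio chunks and build the interleaved prompt.

    Assumption: unmatched trailing frames or chunks are dropped so that
    every video placeholder has a matching audio placeholder.
    """
    length = min(len(frames), len(audio_chunks))
    prompt = (VIDEO_PLACEHOLDER + AUDIO_PLACEHOLDER) * length
    return prompt, list(frames[:length]), list(audio_chunks[:length])


# Example: three stand-in frames and 2.5 s of silence sampled at 16 kHz.
frames = ["frame0", "frame1", "frame2"]
chunks = split_audio([0.0] * 40000, sample_rate=16000)
prompt, used_frames, used_chunks = build_omni_prompt(frames, chunks)
```

Truncating to the shorter sequence keeps the number of placeholders equal to the number of frame and audio items supplied; padding the shorter sequence instead would be an equally reasonable choice.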
@HwwwwwwwH Great, so there will be corresponding multimodal data containing multiple images and audio clips (1 sec each). I got it working for a single image-audio pair via:
where |
same question |
This PR aims to adapt and support all the features of MiniCPM-V and MiniCPM-o. It is designed to be compatible with various modalities (image, video, audio), different model versions (2.0, 2.5, 2.6, o), and diverse input types (raw, embeddings), while maintaining support for LoRA, which might require significant effort.
Below is the roadmap for this PR:
MultiModalInputsV2 of vLLM.
This PR is still in development. Once I complete the support for audio, I will request a merge. I'll get this work done ASAP.
FIX #12162