Minimal changes to get Mistral-7B-Instruct-v0.1 working #986
Conversation
This is great, thanks for your contribution!
I heard that this model uses a new kind of attention (sliding window attention). And from huggingface/transformers#26447, this model doesn't seem to share the same architecture as Llama. So I wonder why using Mistral weights with the Llama model is expected to work?
IIUC, SWA is only needed for extrapolating to sequences longer than 4096 tokens. Otherwise the architecture is the same, except that Mistral-7B uses GQA, while the Llama 2 family only uses GQA for the 70B model (Llama 2 7B and 13B use vanilla MHA). The GQA support for Llama 2 models added in #567 handles that difference transparently.
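The GQA weight-sharing mentioned above can be sketched in a few lines: each group of query heads reuses a single KV head, which is usually implemented by broadcasting the KV tensor before the attention product. This is an illustrative NumPy sketch, not the actual code from #567; the helper name and shapes are assumptions.

```python
import numpy as np

def repeat_kv(kv, n_rep):
    """Expand KV heads so each group of `n_rep` query heads shares one KV head.
    Hypothetical helper for illustration of the GQA trick, not mlc-llm's code."""
    batch, n_kv_heads, seq, head_dim = kv.shape
    if n_rep == 1:
        return kv  # MHA case: nothing to expand
    kv = kv[:, :, None, :, :]  # (B, H_kv, 1, S, D)
    kv = np.broadcast_to(kv, (batch, n_kv_heads, n_rep, seq, head_dim))
    return kv.reshape(batch, n_kv_heads * n_rep, seq, head_dim)

# Mistral-7B: 32 query heads, 8 KV heads -> each KV head serves 4 query heads
k = np.random.randn(1, 8, 5, 128)
k_expanded = repeat_kv(k, 32 // 8)  # shape (1, 32, 5, 128)
```

With `n_rep = 1` this degenerates to plain MHA, which is why one code path can serve both Llama 2 7B/13B and Mistral-7B.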
I see, I guess that's what you meant by "minimal changes". Indeed, looking at huggingface/transformers#26447 more closely, their 900 lines …
Thanks for linking to the huggingface transformers PR. If it's as simple as removing causal masking, I'll take a stab at it over the weekend.
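For context on the masking change being discussed: sliding-window attention only changes the attention mask, restricting each token to the most recent `window` positions. A minimal NumPy sketch (the helper is hypothetical, for illustration only):

```python
import numpy as np

def sliding_window_causal_mask(seq_len, window):
    """True = may attend. Position i sees positions (i - window, i], i.e. at
    most `window` tokens including itself. Illustrative only."""
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    return (j <= i) & (j > i - window)

# For seq_len <= window this reduces to the ordinary causal (lower-triangular)
# mask, which is why plain causal attention suffices up to 4096 tokens.
mask = sliding_window_causal_mask(6, 3)
```

This matches the point made earlier in the thread: within the 4096-token window, SWA and vanilla causal attention are identical.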
I can successfully build it on a Mac Studio. In the last step, "mlc_chat_cli --model Mistral-7B-Instruct-v0.1-q4f16_1", I got these messages:
But as soon as I typed any prompt, the process was terminated with the following errors:
Any idea what is going wrong here? Thanks
Hi @ZebinYang, thanks for reporting the issue! I just tried Mistral Instruct on a Mac Studio and it worked fine. The problem you are seeing is probably due to #1087, where we updated …
Hi @CharlieFRuan, thanks for your response.
However, I have some further questions.
And it does not work even if I specify the full path of the compiled model.
Hi @ZebinYang, for question 1, there are two relevant things:
For question 2, did you go with updating to the latest nightly, or pulling the latest repo and building from source?
Thanks, it worked once I added "params" to the path.
To build the model on macOS:
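The actual build steps were cut off in this scrape. As a rough sketch of the mlc-llm flow from around the time of this PR (the module path and flag names are assumptions and may differ between versions, so check the docs for your release):

```shell
# Compile the weights and model library (flag names may vary by mlc-llm version)
python3 -m mlc_llm.build \
    --hf-path mistralai/Mistral-7B-Instruct-v0.1 \
    --quantization q4f16_1 \
    --target metal

# Chat with the compiled model, as in the comment above
mlc_chat_cli --model Mistral-7B-Instruct-v0.1-q4f16_1
```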