
Minimal changes to get Mistral-7B-Instruct-v0.1 working #986

Merged: 1 commit from jeethu/mistral into mlc-ai:main on Sep 28, 2023

Conversation

@jeethu (Contributor) commented Sep 27, 2023

To build the model on macOS:

git clone https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1
# in the mlc-llm dir, rebuild mlc_chat_cli and then:
python build.py --model <path_to_model_checkout> --quantization q4f16_1 --target metal
./build/mlc_chat_cli --model Mistral-7B-Instruct-v0.1-q4f16_1
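The same compiled artifact can also be driven from the Python API; a minimal sketch (assuming the mlc_chat Python package is installed and you run it from the mlc-llm directory so the relative dist/ path resolves):

import mlc_chat

# Minimal sketch: load the locally compiled model by name (mlc_chat resolves it
# under dist/ relative to the current working directory; alternatively, pass the
# full path to the params folder) and run a single prompt.
cm = mlc_chat.ChatModule(model='Mistral-7B-Instruct-v0.1-q4f16_1')
print(cm.generate('hi'))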

@junrushao (Member) left a comment


This is great, thanks for your contribution!

@junrushao merged commit 6598cec into mlc-ai:main on Sep 28, 2023
@masahi (Contributor) commented Sep 28, 2023

I heard that this model uses a new kind of attention (sliding window stuff). And from huggingface/transformers#26447, this model doesn't seem to share the same architecture as Llama. So I wonder why using the Mistral weights with the Llama model is supposed to work?

@jeethu deleted the jeethu/mistral branch on September 28, 2023 at 12:40
@jeethu (Contributor, Author) commented Sep 28, 2023

So I wonder why using the Mistral weights with the Llama model is supposed to work?

IIUC, SWA is only needed for extrapolating to sequences longer than 4096 tokens. Otherwise, the architecture is the same, except that Mistral-7B uses GQA, while the Llama 2 family only uses GQA for the 70B model (Llama 2 7B and 13B use vanilla MHA). The GQA support added for Llama 2 models in #567 handles that difference transparently.
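To make the GQA difference concrete, here is a toy PyTorch sketch (illustrative only, not mlc-llm code; the head counts are the commonly cited ones for Mistral-7B, 32 query heads sharing 8 KV heads, whereas Llama 2 7B/13B use an equal number of query and KV heads):

import torch

# Illustrative sizes; in GQA the KV heads are fewer than the query heads.
num_q_heads, num_kv_heads, head_dim, seq = 32, 8, 128, 16

q = torch.randn(1, num_q_heads, seq, head_dim)
k = torch.randn(1, num_kv_heads, seq, head_dim)
v = torch.randn(1, num_kv_heads, seq, head_dim)

# Each group of query heads shares one KV head, so the KV tensors are simply
# repeated along the head dimension before ordinary scaled-dot-product attention.
group = num_q_heads // num_kv_heads
k = k.repeat_interleave(group, dim=1)
v = v.repeat_interleave(group, dim=1)

attn = torch.softmax(q @ k.transpose(-1, -2) / head_dim ** 0.5, dim=-1) @ v
print(attn.shape)  # (1, 32, 16, 128)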

@masahi (Contributor) commented Sep 28, 2023

I see, I guess that's what you meant by "Minimal changes". Indeed, looking at huggingface/transformers#26447 more closely, their 900-line modeling_mistral.py is an exact copy of modeling_llama.py except for the causal mask creation 🤦‍♂️
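For anyone following along, a toy sketch of that mask difference (not Transformers or mlc-llm code, just an illustration, with a made-up window of 4 instead of Mistral's 4096):

import torch

seq_len, window = 8, 4  # toy sizes; Mistral's actual sliding window is 4096

i = torch.arange(seq_len).unsqueeze(1)  # query positions
j = torch.arange(seq_len).unsqueeze(0)  # key positions

# Plain causal mask: query i may attend to every key j <= i.
causal = j <= i

# Sliding-window causal mask: query i may only attend to keys in
# (i - window, i], i.e. the most recent `window` positions.
sliding = causal & (j > i - window)

print(sliding.int())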

@jeethu (Contributor, Author) commented Sep 29, 2023

Thanks for linking to the huggingface transformers PR. If it's as simple as removing causal masking, I'll take a stab at it over the weekend.

@ZebinYang commented

I can successfully build it on a Mac Studio. In the last step, "mlc_chat_cli --model Mistral-7B-Instruct-v0.1-q4f16_1", I got these messages:

Loading model...
Loading finished
Running system prompts...
System prompts finished
[INST]: 

But as soon as I type any prompt, the process terminates with the following errors:

[02:29:57] /Users/catalyst/Workspace/miniforge3/envs/mlc-llm-build/conda-bld/mlc-chat-cli-nightly-package_1698048403779/work/3rdparty/tvm/src/runtime/relax_vm/pooled_allocator.h:64: Warning: PooledAllocator got InternalError during allocation: InternalError: Check failed: (buf != nil) is false: 
[02:29:57] /Users/catalyst/Workspace/miniforge3/envs/mlc-llm-build/conda-bld/mlc-chat-cli-nightly-package_1698048403779/work/3rdparty/tvm/src/runtime/relax_vm/pooled_allocator.h:65: Warning: Trying to release all unused memory and reallocate...
libc++abi: terminating due to uncaught exception of type tvm::runtime::InternalError: [02:29:57] /Users/catalyst/Workspace/miniforge3/envs/mlc-llm-build/conda-bld/mlc-chat-cli-nightly-package_1698048403779/work/3rdparty/tvm/include/tvm/runtime/packed_func.h:1307: unknown type = 0
Stack trace:
  [bt] (0) 1   libtvm_runtime.dylib                0x0000000102ac595c tvm::runtime::detail::LogFatal::Entry::Finalize() + 68
  [bt] (1) 2   libtvm_runtime.dylib                0x0000000102ac5918 tvm::runtime::detail::LogFatal::Entry::Finalize() + 0
  [bt] (2) 3   libtvm_runtime.dylib                0x0000000102abfc20 __clang_call_terminate + 0
  [bt] (3) 4   libtvm_runtime.dylib                0x0000000102b8f48c tvm::runtime::relax_vm::MemoryManager::GetAllocator(DLDevice) + 640
  [bt] (4) 5   libtvm_runtime.dylib                0x0000000102b6cf34 tvm::runtime::SimpleObjAllocator::Handler<tvm::runtime::relax_vm::StorageObj>::Deleter_(tvm::runtime::Object*) + 28
  [bt] (5) 6   libtvm_runtime.dylib                0x0000000102b68f70 tvm::runtime::relax_vm::VMAllocStorage(void*, tvm::runtime::ShapeTuple, long long, DLDataType, tvm::runtime::String) + 980
  [bt] (6) 7   libtvm_runtime.dylib                0x0000000102b6dab0 void tvm::runtime::TypedPackedFunc<tvm::runtime::relax_vm::Storage (void*, tvm::runtime::ShapeTuple, long long, DLDataType, tvm::runtime::String)>::AssignTypedLambda<tvm::runtime::relax_vm::Storage (*)(void*, tvm::runtime::ShapeTuple, long long, DLDataType, tvm::runtime::String)>(tvm::runtime::relax_vm::Storage (*)(void*, tvm::runtime::ShapeTuple, long long, DLDataType, tvm::runtime::String), std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>)::'lambda'(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*)::operator()(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*) const + 284
  [bt] (7) 8   libtvm_runtime.dylib                0x0000000102ba5614 tvm::runtime::relax_vm::VirtualMachineImpl::InvokeClosurePacked(tvm::runtime::ObjectRef const&, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) + 96
  [bt] (8) 9   libtvm_runtime.dylib                0x0000000102ba7584 tvm::runtime::relax_vm::VirtualMachineImpl::RunInstrCall(tvm::runtime::relax_vm::VMFrame*, tvm::runtime::relax_vm::Instruction) + 1504

Any idea what is going wrong here? Thanks

@CharlieFRuan (Contributor) commented

Hi @ZebinYang, thanks for reporting the issue! I just tried Mistral Instruct on a Mac Studio and it worked fine. The problem you are seeing is probably due to #1087, where we updated llm_chat.cc. Therefore, you would need to either install the latest nightly or build from source from the latest repo.
(Screenshot: Mistral Instruct running successfully in mlc_chat_cli, Nov 6, 2023)
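For reference, the prebuilt nightly packages can be updated with the same command ZebinYang uses later in this thread:

pip install --pre --force-reinstall mlc-ai-nightly mlc-chat-nightly -f https://mlc.ai/wheels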

@ZebinYang commented

Hi @CharlieFRuan,

Thanks for your response.
After upgrading everything, I am able to run the following script:

import mlc_chat
cm = mlc_chat.ChatModule(model='Mistral-7B-Instruct-v0.1-q4f16_1')
cm.generate('hi')

However, I have some further questions.

  1. If I switch the working directory to another folder (other than mlc-llm), the script raises an error (screenshot of the error message).

     And it does not work even if I specify the full path of the compiled model.

  2. The CLI command still does not work:

     mlc_chat_cli --model Mistral-7B-Instruct-v0.1-q4f16_1

     with error messages (screenshot of the errors).

@CharlieFRuan (Contributor) commented

Hi @ZebinYang, for question 1, there are two relevant things:

  • If you compiled it locally, to pass the full path, you would need to supply the params folder, which might be a bit counter-intuitive (we are mainly looking for the mlc-chat-config.json, which resides in params).
    • e.g. cm = mlc_chat.ChatModule(model='/full/path/to/mlc-llm/dist/Mistral-7B-Instruct-v0.1-q4f16_1/params')
  • There is also an argument called model_lib_path which lets you point at a different model library file (see the sketch below).
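Putting both together, a minimal sketch (the paths and the library filename below are hypothetical examples, not the exact output of any particular build):

import mlc_chat

# Both paths are hypothetical; adjust them to wherever your local build
# placed the artifacts under dist/.
cm = mlc_chat.ChatModule(
    # `model` points at the params folder containing mlc-chat-config.json.
    model='/full/path/to/mlc-llm/dist/Mistral-7B-Instruct-v0.1-q4f16_1/params',
    # `model_lib_path` optionally points at the compiled model library file.
    model_lib_path='/full/path/to/mlc-llm/dist/Mistral-7B-Instruct-v0.1-q4f16_1/Mistral-7B-Instruct-v0.1-q4f16_1-metal.so',
)
print(cm.generate('hi'))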

For question 2, did you update to the latest nightly, or pull the latest repo and build from source?

@ZebinYang commented

Hi @CharlieFRuan

Thanks, it worked once I added "params" to the path.
I simply updated the dependencies using:

pip install --pre --force-reinstall mlc-ai-nightly mlc-chat-nightly -f https://mlc.ai/wheels
