Error loading model #9
Hi,

I get the following error when I try to load the model:

python3.10/site-packages/ctransformers/lib/basic/libctransformers.so: cannot open shared object file: No such file or directory

using:

llm = AutoModelForCausalLM.from_pretrained('/models/gpt-2-1558M-ggml/ggml-model-q4_1.bin', model_type='gpt2', lib='basic')

I am running this on an aarch64 Ubuntu 22.04 system. Please let me know how to fix this. Thank you.
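One quick way to see which precompiled variants an installed wheel actually contains is a standard-library-only sketch like the following; the lib directory layout is inferred from the error path above, and this is a diagnostic aid, not part of the ctransformers API:

# List the precompiled libraries that ship with the installed
# ctransformers wheel; on ARM the expected variant is simply absent.
import os
import ctransformers

lib_dir = os.path.join(os.path.dirname(ctransformers.__file__), "lib")
for root, _dirs, files in os.walk(lib_dir):
    for name in files:
        print(os.path.join(root, name))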
Hi, precompiled libs are not available for ARM processors. Can you please run the following command and let me know its output:

python3 -c 'import platform; print(platform.processor())'

You can build the library from source to make it work:

git clone --recurse-submodules https://github.com/marella/ctransformers
cd ctransformers
./scripts/build.sh

The compiled binary will be located at build/lib/libctransformers.so, which you can pass when loading the model:

llm = AutoModelForCausalLM.from_pretrained(..., lib='/path/to/ctransformers/build/lib/libctransformers.so')

Please let me know if this works.
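A minimal sketch tying these steps together — the build path is hypothetical (wherever you cloned and built ctransformers), and the model path is taken from the original report:

# Use the locally built library on ARM; fall back to the default
# precompiled variant elsewhere. Paths are illustrative.
import platform
from ctransformers import AutoModelForCausalLM

lib = None  # None lets ctransformers pick its default precompiled variant
if platform.machine() in ("aarch64", "arm64"):
    lib = "/path/to/ctransformers/build/lib/libctransformers.so"  # from ./scripts/build.sh

llm = AutoModelForCausalLM.from_pretrained(
    "/models/gpt-2-1558M-ggml/ggml-model-q4_1.bin",
    model_type="gpt2",
    lib=lib,
)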
Hi Marella, thanks for the reply. I installed from source as you advised. Thanks again for the good work you are doing and for helping people with AI and LLMs. Kudos to you :)
Hey @marella, you are awesome, man! I mean really... I tried so many FOSS projects — llama.cpp, the gpt4all library, rwkv.cpp — and nothing gave me the inference speed and low RAM usage that ctransformers gives. I wonder what the reason for this may be? Please don't stop improving this!
Hi Marella, just wondering if ctransformers can be used with the Nvidia Triton Inference Server (https://developer.nvidia.com/nvidia-triton-inference-server) for inference on both CPU and GPU. I guess a custom backend has to be created for this: https://github.com/triton-inference-server/backend/blob/main/examples/README.md has an example of a minimal backend, for your quick reference. I believe an enterprise-level inference server that can support GGML models and LangChain would benefit the community, enterprises, and the environment. Looking forward to your views on this. Thank you.
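For illustration, a sketch of what such a minimal custom backend could look like using Triton's Python backend (a model.py next to a hand-written config.pbtxt); the tensor names and model path here are made up, and this is only an outline, not an official or tested backend:

# model.py -- hypothetical Triton Python-backend wrapper around ctransformers.
# "PROMPT" and "COMPLETION" are invented tensor names that would have to
# match your own config.pbtxt.
import numpy as np
import triton_python_backend_utils as pb_utils
from ctransformers import AutoModelForCausalLM

class TritonPythonModel:
    def initialize(self, args):
        # Load the GGML model once per model instance.
        self.llm = AutoModelForCausalLM.from_pretrained(
            "/models/gpt-2-1558M-ggml/ggml-model-q4_1.bin", model_type="gpt2"
        )

    def execute(self, requests):
        responses = []
        for request in requests:
            prompt = pb_utils.get_input_tensor_by_name(request, "PROMPT")
            text = prompt.as_numpy()[0].decode("utf-8")
            completion = self.llm(text)  # ctransformers models are callable
            out = pb_utils.Tensor(
                "COMPLETION",
                np.array([completion.encode("utf-8")], dtype=np.object_),
            )
            responses.append(pb_utils.InferenceResponse(output_tensors=[out]))
        return responses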
Thanks @mateenmalik, I will add the build instructions to the README and will see if I can simplify this further.

Currently the GGML library is not ready for production use and changes rapidly, so integrating it with an enterprise-level server might not be feasible. Also, its main goal is to run large models on CPU on consumer hardware without the need for a GPU. However, this may change over time as the GGML library evolves.
Thanks @TheFaheem. One reason for the performance could be that most libraries don't enable AVX2 by default, whereas this library enables it by default and asks users to switch to the 'avx' or 'basic' versions if AVX2 doesn't work.
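Concretely, the variant is chosen through the same lib parameter shown earlier; a small sketch, with the model path reused from the original report:

# 'avx2' is the default; switch to 'avx' or 'basic' if the default
# crashes on a CPU without AVX2 support.
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained(
    "/models/gpt-2-1558M-ggml/ggml-model-q4_1.bin",
    model_type="gpt2",
    lib="avx",  # or "basic" on CPUs without AVX
)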
For some reason this solution did not work for me. I tried all the steps mentioned by @marella:

(venv) (base) ajayvenkatesan@Ajays-Air sample % python3 -c 'import platform; print(platform.processor())'

Is there anything I could do to solve this?
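In the meantime, a slightly fuller diagnostic sketch (standard library only) prints everything relevant in one go; on Apple Silicon machines like this one, platform.processor() typically reports just 'arm':

# Print the details the maintainer needs: architecture, processor
# string, OS, and Python version.
import platform

print("machine  :", platform.machine())    # e.g. 'arm64' on Apple Silicon
print("processor:", platform.processor())  # e.g. 'arm'
print("system   :", platform.system(), platform.release())
print("python   :", platform.python_version())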