Error loading model #9
Hi,

I get the following error when I try to load the model:

python3.10/site-packages/ctransformers/lib/basic/libctransformers.so: cannot open shared object file: No such file or directory

using:

llm = AutoModelForCausalLM.from_pretrained('/models/gpt-2-1558M-ggml/ggml-model-q4_1.bin', model_type='gpt2', lib='basic')

I am running this on an aarch64 Ubuntu 22.04 system. Please let me know how to fix this. Thank you.
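One quick way to see which precompiled variants an installed wheel actually contains is a standard-library-only sketch like the following; the lib directory layout is inferred from the error path above, and this is a diagnostic aid, not part of the ctransformers API:

# List the precompiled libraries that ship with the installed
# ctransformers wheel; on ARM the expected variant is simply absent.
import os
import ctransformers

lib_dir = os.path.join(os.path.dirname(ctransformers.__file__), "lib")
for root, _dirs, files in os.walk(lib_dir):
    for name in files:
        print(os.path.join(root, name))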
Hi, precompiled libs are not available for ARM processors. Can you please run the following command and let me know its output:

python3 -c 'import platform; print(platform.processor())'

You can build the library from source to make it work:

git clone --recurse-submodules https://github.com/marella/ctransformers
cd ctransformers
./scripts/build.sh

The compiled binary will be located at build/lib/libctransformers.so, which you can pass when loading the model:

llm = AutoModelForCausalLM.from_pretrained(..., lib='/path/to/ctransformers/build/lib/libctransformers.so')

Please let me know if this works.
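A minimal sketch tying these steps together — the build path is hypothetical (wherever you cloned and built ctransformers), and the model path is taken from the original report:

# Use the locally built library on ARM; fall back to the default
# precompiled variant elsewhere. Paths are illustrative.
import platform
from ctransformers import AutoModelForCausalLM

lib = None  # None lets ctransformers pick its default precompiled variant
if platform.machine() in ("aarch64", "arm64"):
    lib = "/path/to/ctransformers/build/lib/libctransformers.so"  # from ./scripts/build.sh

llm = AutoModelForCausalLM.from_pretrained(
    "/models/gpt-2-1558M-ggml/ggml-model-q4_1.bin",
    model_type="gpt2",
    lib=lib,
)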
Hi Marella, thanks for the reply. I installed from source as you advised. Thanks again for the good work you are doing and for helping people with AI and LLMs. Kudos to you :)
Hey @marella, you are awesome, man! I mean really... I tried so many FOSS projects — llama.cpp, the gpt4all library, rwkv.cpp — and nothing gave me the inference speed and low RAM usage that ctransformers gives. I wonder what the reason for this may be? Please don't stop improving this!
Hi Marella, just wondering if ctransformers can be used with the Nvidia Triton Inference Server (https://developer.nvidia.com/nvidia-triton-inference-server) for inference on both CPU and GPU. I guess a custom backend has to be created for this: https://github.com/triton-inference-server/backend/blob/main/examples/README.md has an example of a minimal backend, for your quick reference. I believe an enterprise-level inference server that can support GGML models and LangChain would benefit the community, enterprises, and the environment. Looking forward to your views on this. Thank you.
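For illustration, a sketch of what such a minimal custom backend could look like using Triton's Python backend (a model.py next to a hand-written config.pbtxt); the tensor names and model path here are made up, and this is only an outline, not an official or tested backend:

# model.py -- hypothetical Triton Python-backend wrapper around ctransformers.
# "PROMPT" and "COMPLETION" are invented tensor names that would have to
# match your own config.pbtxt.
import numpy as np
import triton_python_backend_utils as pb_utils
from ctransformers import AutoModelForCausalLM

class TritonPythonModel:
    def initialize(self, args):
        # Load the GGML model once per model instance.
        self.llm = AutoModelForCausalLM.from_pretrained(
            "/models/gpt-2-1558M-ggml/ggml-model-q4_1.bin", model_type="gpt2"
        )

    def execute(self, requests):
        responses = []
        for request in requests:
            prompt = pb_utils.get_input_tensor_by_name(request, "PROMPT")
            text = prompt.as_numpy()[0].decode("utf-8")
            completion = self.llm(text)  # ctransformers models are callable
            out = pb_utils.Tensor(
                "COMPLETION",
                np.array([completion.encode("utf-8")], dtype=np.object_),
            )
            responses.append(pb_utils.InferenceResponse(output_tensors=[out]))
        return responses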
Thanks @mateenmalik, I will add the build instructions to the README and will see if I can simplify this further.

Currently the GGML library is not ready for production use and changes rapidly, so integrating it with an enterprise-level server might not be feasible. Also, its main goal is to run large models on CPU on consumer hardware without the need for a GPU. However, this may change over time as the GGML library evolves.
Thanks @TheFaheem. One reason for the performance could be that most libraries don't enable AVX2 by default, whereas this library enables it by default and asks users to switch to the 'avx' or 'basic' versions if AVX2 doesn't work.
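Concretely, the variant is chosen through the same lib parameter shown earlier; a small sketch, with the model path reused from the original report:

# 'avx2' is the default; switch to 'avx' or 'basic' if the default
# crashes on a CPU without AVX2 support.
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained(
    "/models/gpt-2-1558M-ggml/ggml-model-q4_1.bin",
    model_type="gpt2",
    lib="avx",  # or "basic" on CPUs without AVX
)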
For some reason this solution did not work for me. I tried all the steps mentioned by @marella:

(venv) (base) ajayvenkatesan@Ajays-Air sample % python3 -c 'import platform; print(platform.processor())'

Is there anything I could do to solve this?
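In the meantime, a slightly fuller diagnostic sketch (standard library only) prints everything relevant in one go; on Apple Silicon machines like this one, platform.processor() typically reports just 'arm':

# Print the details the maintainer needs: architecture, processor
# string, OS, and Python version.
import platform

print("machine  :", platform.machine())    # e.g. 'arm64' on Apple Silicon
print("processor:", platform.processor())  # e.g. 'arm'
print("system   :", platform.system(), platform.release())
print("python   :", platform.python_version())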