
GPT-2 segfaults when used through the CLI #167

Closed
philpax opened this issue May 1, 2023 · 8 comments · Fixed by #260
Labels: app:cli (App: the `llm` CLI), issue:bug (Something isn't working)

Comments


philpax commented May 1, 2023

Trying any GPT-2 GGML model through the CLI appears to cause an immediate segfault:

llama-rs # cargo run --bin llm gpt2 infer -m models/gpt2/cerebras-2.7b-q4_0.bin -p "Now, this is a story all about how"

[...]
[2023-05-01T23:43:17Z INFO  llm::cli_args] Model fully loaded! Elapsed: 75ms
zsh: segmentation fault  cargo run --bin llm gpt2 infer -m models/gpt2/cerebras-2.7b-q4_0.bin -p 

This appears to happen regardless of the model (both Cerebras and base GPT-2 are affected).

This doesn't happen when run through the GPT-2 example.

philpax added the issue:bug and app:cli labels on May 1, 2023
danforbes commented:

I wonder if this has to do w/ loading through the snapshot.

danforbes commented May 3, 2023

I am not able to reproduce this problem:

llama-rs: ./target/release/llm gpt2 infer -m ~/.ggml-models/cerebras-gpt-13b.bin -p "Hello my name is"
[2023-05-03T17:55:57Z INFO  llm::cli_args] ggml ctx size = 7857.04 MB
[2023-05-03T17:55:57Z INFO  llm::cli_args] Loaded tensor 8/485
...
[2023-05-03T17:56:02Z INFO  llm::cli_args] Loaded tensor 480/485
[2023-05-03T17:56:02Z INFO  llm::cli_args] Loading of model complete
[2023-05-03T17:56:02Z INFO  llm::cli_args] Model size = 0.00 MB / num tensors = 485
[2023-05-03T17:56:02Z INFO  llm::cli_args] Model fully loaded! Elapsed: 5008ms
"Hello my name is 'Celest,' and you're looking for a guy named..." "Marius." ""I'm looking for Marius^C


philpax commented May 3, 2023

How weird... is that q4 or f16?

danforbes commented:

q4? I'm honestly not sure 😅 I think I'm testing w/ this model, which appears to have been taken down 🤷🏻: https://huggingface.co/mongolian-basket-weaving/cerebras-gpt-13b-ggml-q4_0

danforbes mentioned this issue May 6, 2023

philpax commented May 6, 2023

Ok, just tested with https://huggingface.co/xzuyn/GPT-2-124M-ggml-q4_1/blob/main/ggml-model-q4_1.bin on macOS:

# cargo run --bin llm gpt2 infer -m models/gpt2/GPT-2-124M-ggml-q4_1.bin -p "1 + 2 = "  
    Finished dev [unoptimized + debuginfo] target(s) in 0.08s
     Running `target/debug/llm gpt2 infer -m models/gpt2/GPT-2-124M-ggml-q4_1.bin -p '1 + 2 = '`
✓ Loaded 149 tensors (125.8 MB) after 153ms
zsh: segmentation fault  cargo run --bin llm gpt2 infer -m models/gpt2/GPT-2-124M-ggml-q4_1.bin -p 


philpax commented May 6, 2023

Is this wrong?

https://github.com/rustformers/llm/blob/be56c36/crates/models/gpt2/src/lib.rs#L314-L316

Aha - I think you've figured it out...

Running with --num-ctx-tokens 1024 doesn't segfault for me. Our default of 2048 doesn't work for all models. Oops.


Or maybe not.

# cargo run --release --bin llm gpt2 infer -m models/gpt2/cerebras-2.7b-q4_1.bin -p "Fred looked at his hand and wondered: "  --num-ctx-tokens 512 
    Finished release [optimized] target(s) in 0.08s
     Running `target/release/llm gpt2 infer -m models/gpt2/cerebras-2.7b-q4_1.bin -p 'Fred looked at his hand and wondered: ' --num-ctx-tokens 512`
✓ Loaded 389 tensors (5.6 GB) after 91ms
zsh: segmentation fault  cargo run --release --bin llm gpt2 infer -m models/gpt2/cerebras-2.7b-q4_1.bi
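
For reference, here's a minimal sketch of how a GPT-2-style KV cache scales with the context length. This is not the repository's code; the helper name and the shape values (12 layers, 768-dim embeddings, i.e. GPT-2 124M) are assumptions for illustration only.

// Hypothetical sketch: estimate the bytes needed for the memory_k and
// memory_v tensors, assuming both are stored as f32. Not the actual
// llm/ggml allocation code.
fn kv_cache_bytes(n_layer: usize, n_ctx: usize, n_embd: usize) -> usize {
    let elements_per_tensor = n_layer * n_ctx * n_embd;
    2 * elements_per_tensor * std::mem::size_of::<f32>() // keys + values
}

fn main() {
    // Assumed GPT-2 124M shape: 12 layers, 768-dim embeddings.
    for n_ctx in [512, 1024, 2048] {
        let mib = kv_cache_bytes(12, n_ctx, 768) as f64 / (1024.0 * 1024.0);
        println!("n_ctx = {n_ctx:>4} -> KV cache ~ {mib:.0} MiB");
    }
}

Doubling --num-ctx-tokens doubles this cache, so if the model's fixed ggml context buffer was sized for a smaller n_ctx than the CLI's default of 2048, the extra allocations could run past the buffer. That would be consistent with the 1024 run surviving while 2048 crashes, though the 512 crash above suggests it's not the whole story.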


philpax commented May 6, 2023

Quick findings with a debugger:

This is definitely something we should investigate and fix, but not a showstopper for now, I think.

danforbes added a commit to danforbes/llm that referenced this issue May 20, 2023