
GPT-2 segfaults when used through the CLI #167

Closed
philpax opened this issue May 1, 2023 · 8 comments · Fixed by #260
Labels: app:cli (App: the `llm` CLI), issue:bug (Something isn't working)

Comments


philpax commented May 1, 2023

Trying any GPT-2 GGML model through the CLI appears to cause an immediate segfault:

llama-rs # cargo run --bin llm gpt2 infer -m models/gpt2/cerebras-2.7b-q4_0.bin -p "Now, this is a story all about how"

[...]
[2023-05-01T23:43:17Z INFO  llm::cli_args] Model fully loaded! Elapsed: 75ms
zsh: segmentation fault  cargo run --bin llm gpt2 infer -m models/gpt2/cerebras-2.7b-q4_0.bin -p 

This appears to happen regardless of the model (both Cerebras and base GPT-2 are affected).

This doesn't happen when run through the GPT-2 example.

philpax added the issue:bug and app:cli labels on May 1, 2023
danforbes commented:

I wonder if this has to do w/ loading through the snapshot.

danforbes commented May 3, 2023

I am not able to reproduce this problem:

llama-rs: ./target/release/llm gpt2 infer -m ~/.ggml-models/cerebras-gpt-13b.bin -p "Hello my name is"
[2023-05-03T17:55:57Z INFO  llm::cli_args] ggml ctx size = 7857.04 MB
[2023-05-03T17:55:57Z INFO  llm::cli_args] Loaded tensor 8/485
...
[2023-05-03T17:56:02Z INFO  llm::cli_args] Loaded tensor 480/485
[2023-05-03T17:56:02Z INFO  llm::cli_args] Loading of model complete
[2023-05-03T17:56:02Z INFO  llm::cli_args] Model size = 0.00 MB / num tensors = 485
[2023-05-03T17:56:02Z INFO  llm::cli_args] Model fully loaded! Elapsed: 5008ms
"Hello my name is 'Celest,' and you're looking for a guy named..." "Marius." ""I'm looking for Marius^C


philpax commented May 3, 2023

How weird... is that q4 or f16?

danforbes commented:

q4? I'm honestly not sure 😅 I think I'm testing w/ this model, which appears to have been taken down 🤷🏻: https://huggingface.co/mongolian-basket-weaving/cerebras-gpt-13b-ggml-q4_0

danforbes mentioned this issue May 6, 2023

philpax commented May 6, 2023

Ok, just tested with https://huggingface.co/xzuyn/GPT-2-124M-ggml-q4_1/blob/main/ggml-model-q4_1.bin on macOS:

# cargo run --bin llm gpt2 infer -m models/gpt2/GPT-2-124M-ggml-q4_1.bin -p "1 + 2 = "  
    Finished dev [unoptimized + debuginfo] target(s) in 0.08s
     Running `target/debug/llm gpt2 infer -m models/gpt2/GPT-2-124M-ggml-q4_1.bin -p '1 + 2 = '`
✓ Loaded 149 tensors (125.8 MB) after 153ms
zsh: segmentation fault  cargo run --bin llm gpt2 infer -m models/gpt2/GPT-2-124M-ggml-q4_1.bin -p 


philpax commented May 6, 2023

Is this wrong?

https://github.com/rustformers/llm/blob/be56c36/crates/models/gpt2/src/lib.rs#L314-L316

Aha - I think you've figured it out...

Running with --num-ctx-tokens 1024 doesn't segfault for me. Our default of 2048 doesn't work for all models. Oops.


Or maybe not.

# cargo run --release --bin llm gpt2 infer -m models/gpt2/cerebras-2.7b-q4_1.bin -p "Fred looked at his hand and wondered: "  --num-ctx-tokens 512 
    Finished release [optimized] target(s) in 0.08s
     Running `target/release/llm gpt2 infer -m models/gpt2/cerebras-2.7b-q4_1.bin -p 'Fred looked at his hand and wondered: ' --num-ctx-tokens 512`
✓ Loaded 389 tensors (5.6 GB) after 91ms
zsh: segmentation fault  cargo run --release --bin llm gpt2 infer -m models/gpt2/cerebras-2.7b-q4_1.bi
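
For reference, here's a minimal sketch of how a GPT-2-style KV cache scales with the context length. This is not the repository's code; the helper name and the shape values (12 layers, 768-dim embeddings, i.e. GPT-2 124M) are assumptions for illustration only.

// Hypothetical sketch: estimate the bytes needed for the memory_k and
// memory_v tensors, assuming both are stored as f32. Not the actual
// llm/ggml allocation code.
fn kv_cache_bytes(n_layer: usize, n_ctx: usize, n_embd: usize) -> usize {
    let elements_per_tensor = n_layer * n_ctx * n_embd;
    2 * elements_per_tensor * std::mem::size_of::<f32>() // keys + values
}

fn main() {
    // Assumed GPT-2 124M shape: 12 layers, 768-dim embeddings.
    for n_ctx in [512, 1024, 2048] {
        let mib = kv_cache_bytes(12, n_ctx, 768) as f64 / (1024.0 * 1024.0);
        println!("n_ctx = {n_ctx:>4} -> KV cache ~ {mib:.0} MiB");
    }
}

Doubling --num-ctx-tokens doubles this cache, so if the model's fixed ggml context buffer was sized for a smaller n_ctx than the CLI's default of 2048, the extra allocations could run past the buffer. That would be consistent with the 1024 run surviving while 2048 crashes, though the 512 crash above suggests it's not the whole story.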


philpax commented May 6, 2023

Quick findings with a debugger:

This is definitely something we should investigate and fix, but not a showstopper for now, I think.

danforbes added a commit to danforbes/llm that referenced this issue May 20, 2023