
Add support for GLM-Edge and GLM-Edge-V series models #10573

Merged: 34 commits, Feb 2, 2025

Conversation

piDack (Contributor) commented on Nov 29, 2024:

This pull request adds support for the GLM-Edge-Chat (1.5B & 4B) and GLM-Edge-V (2B & 5B) series of models in llama.cpp.

Note: converting the pretrained models to GGUF currently requires transformers version 4.47.0.dev0.
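
For readers who want to try this, a minimal conversion sketch is below. The checkpoint path, output filenames, and quantization type are illustrative rather than taken from the PR; convert_hf_to_gguf.py and llama-quantize are the standard llama.cpp tools.

# Pin the pre-release transformers noted above (a .dev version may require
# installing from source rather than PyPI).
pip install "transformers==4.47.0.dev0"
# Convert the HF checkpoint to GGUF, then quantize it.
python convert_hf_to_gguf.py ./glm-edge-1.5b-chat --outtype f16 --outfile glm-edge-chat-f16.gguf
./llama-quantize glm-edge-chat-f16.gguf glm-edge-chat-Q4_K_M.gguf Q4_K_M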

The github-actions bot added the labels testing, examples, and python (Nov 29, 2024).
[Review thread on src/llama.cpp — resolved]
arch-btw (Contributor) commented:

Works great.

./llama-llava-cli -m ggml-model-Q4_K_M.gguf --mmproj mmproj-model-f16.gguf --temp 0.1 --image bee.jpg -p "<|system|>\n You are a helpful AI assistant. <image><|user|>\n What is in the image? <|assistant|>\n"

[screenshot: glm]

piDack (Contributor, Author) commented on Dec 19, 2024:

Is there anyone available to review the code?

arch-btw (Contributor) commented:

@piDack Since this adds GlmForCausalLM support for these vision models, I'm curious whether we could create a more modular or generic implementation that could also be used for the other GlmForCausalLM model(s)?

I'm asking because glm-4-9b-chat-hf is currently broken with the new transformers-only implementation:

python convert_hf_to_gguf.py /home/Models/glm-4-9b-chat-hf --outtype f32
INFO:hf-to-gguf:Loading model: glm-4-9b-chat-hf
ERROR:hf-to-gguf:Model GlmForCausalLM is not supported

The version with the custom Python files still works, but if we're moving away from that (see the related discussion), it might be best to support GlmForCausalLM in general.

Are there any parts of this PR that could be refactored or generalized for broader applicability so that we can support both and maybe upcoming models? Thank you.
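
The converter dispatches on the architectures field of the model's config.json, which is where the "not supported" error above comes from. A quick way to check whether a given architecture name is registered (assuming a llama.cpp checkout; the pattern below is illustrative):

# Look for the architecture name among the converter's registered models.
grep -n "GlmForCausalLM" convert_hf_to_gguf.py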

piDack (Contributor, Author) commented on Jan 7, 2025:

> Since this adds GlmForCausalLM support for these vision models, I'm curious whether we could create a more modular or generic implementation […]

I will try to do it.

piDack (Contributor, Author) commented on Jan 26, 2025:

> Since this adds GlmForCausalLM support for these vision models, I'm curious whether we could create a more modular or generic implementation […]

Done.

[Screenshot 2025-01-26 15:59:45]
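
With the generalization in place, the invocation that previously failed should now pass the converter's architecture check (same path as in the report above; output not reproduced here):

python convert_hf_to_gguf.py /home/Models/glm-4-9b-chat-hf --outtype f32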

arch-btw (Contributor) commented:

Perfect! Thank you @piDack
@ngxson could you please take a look?

[Review threads on src/llama-chat.cpp, src/llama-model.cpp, src/llama.cpp, examples/llava/clip.cpp, and src/llama-arch.cpp — resolved]
piDack requested a review from ngxson on January 30, 2025.
ngxson (Collaborator) left a review:

Many places still have inconsistent style.

[Review threads on src/llama-model.cpp, src/llama.cpp, and examples/llava/clip.cpp — resolved]
ngxson (Collaborator) commented on Jan 30, 2025:

@ggerganov Could you take a quick look at llama.cpp? I've approved the rest of the changes.

[Review threads on src/llama-model.cpp and src/llama.cpp — resolved]
piDack requested a review from ggerganov on February 1, 2025.
piDack (Contributor, Author) commented on Feb 2, 2025:

I believe it's ready to be merged into the master branch.

[Review thread on src/llama-model.cpp — resolved]
ggerganov merged commit 0cec062 into ggerganov:master on Feb 2, 2025.
47 checks passed