Tags: ngxson/llama.cpp

b4854

Verified

This commit was created on github.com and signed with GitHub’s verified signature.
ggml : skip intermediate .air file when compiling .metallib (ggml-org#12247)

This commit updates the compilation of default.metallib to skip the
intermediate .air (Apple Intermediate Representation) file.

The motivation for this change is to simplify the custom command a
little and avoid generating and then removing the .air file.

b4853

sync : ggml

ggml-ci

b4851

Verified

ggml-cpu: faster AVX2 variant for IQ1_M (ggml-org#12216)

b4849

Verified

server : Log original chat template parsing error (ggml-org#12233)

b4848

Verified

sync: minja - support QwQ-32B (ggml-org#12235)

google/minja@8a76f78

b4847

Verified

metal : simplify kernel arguments using a struct (ggml-org#3229) (ggml-org#12194)

* metal : refactor im2col parameters into a struct

* metal: Change im2col offset types from int32_t to uint64_t to support larger memory offsets

* metal : refactor sum_rows parameters into a struct

* metal : refactor soft_max parameters into a struct

* metal : refactor diag_mask_inf parameters into a struct

* metal : refactor ssm_conv parameters into a struct

* metal : refactor ssm_scan parameters into a struct

* metal : refactor get_rows parameters into a struct

* metal : refactor group_norm parameters into a struct

* metal : refactor conv_transpose_1d parameters into a struct

* metal : refactor upscale parameters into a struct

* metal : refactor pad parameters into a struct

* metal : refactor pad_reflect_1d parameters into a struct

* metal : refactor arange parameters into a struct

* metal : refactor timestep_embedding parameters into a struct

* metal : refactor argsort parameters into a struct

* metal : refactor leaky_relu parameters into a struct

* metal : refactor pool_2d parameters into a struct

* metal : fix trailing whitespace

---------

Co-authored-by: alexju <[email protected]>
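
As a rough illustration of the refactor pattern listed above, the sketch below groups kernel parameters into a single struct and stores offsets as 64-bit values. The struct name, field names, and helper function are hypothetical and are not taken from the ggml Metal sources; this is host-side C++ rather than Metal shader code.

```cpp
#include <cstdint>

// Hypothetical argument block: one struct replaces a long list of
// individually bound kernel parameters.
struct example_kargs_im2col {
    uint64_t ofs0;  // byte offset per batch element (64-bit, so it cannot wrap at 2^31)
    uint64_t ofs1;  // byte offset per channel
    int32_t  IW;    // input width
    int32_t  IH;    // input height
};

// Computes a source byte offset from the argument block.
// All arithmetic is done in uint64_t, so large tensors do not overflow.
constexpr uint64_t example_src_offset(const example_kargs_im2col & args,
                                      uint64_t batch, uint64_t channel) {
    return batch * args.ofs0 + channel * args.ofs1;
}

// Illustrative values only: a 4 GiB per-batch stride.
constexpr example_kargs_im2col example_args = { 1ull << 32, 4096, 64, 64 };

// 2 * 2^32 + 3 * 4096 = 8589946880, which does not fit in int32_t.
static_assert(example_src_offset(example_args, 2, 3) == 8589946880ull,
              "64-bit offsets hold values beyond the int32_t range");
```

Passing one struct keeps the host-side call and the kernel signature in step, since adding a field changes a single type instead of every argument list.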

b4846

Verified

HIP: fix rocWMMA build flags under Windows (ggml-org#12230)

b4837

Verified

HIP/CUDA: set the parameter value in maintain_cuda_graph instead of replacing it. (ggml-org#12209)

This avoids conflicts with the internal memory management behavior of the CUDA/HIP runtimes.

b4835

Verified

opencl : fix buffer alignment (ggml-org#12197)

Fix the following error:

```
ggml-alloc.c:99: not enough space in the buffer
ggml_tallocr_alloc: not enough space in the buffer to allocate blk.17.ffn_down.weight (needed 27525120, available 27521024)
```

which occurs when `ggml_backend_opencl_context::alignment` is larger
than `cl_ptr_base` (hard-coded to `0x1000`).

Also fix `ggml_backend_opencl_context::alignment`, which was set from
`CL_DEVICE_MEM_BASE_ADDR_ALIGN` and treated as bytes even though the
value is reported in bits.
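
The two points above can be sketched with plain arithmetic. The helper names below are hypothetical and this is not the code from the commit; it assumes only that the device reports its alignment in bits and that a base address of `0x1000` is rounded up to that alignment. The 4096-byte loss in the second check is consistent with the shortfall in the log (27525120 - 27521024 = 4096), though the sketch does not claim to reproduce the exact failure path.

```cpp
#include <cstddef>
#include <cstdint>

// CL_DEVICE_MEM_BASE_ADDR_ALIGN is reported in bits; convert to bytes before use.
constexpr size_t example_align_bits_to_bytes(uint32_t bits) {
    return bits / 8;
}

// Round an offset up to the next multiple of `alignment` (alignment > 0).
constexpr size_t example_align_up(size_t offset, size_t alignment) {
    return (offset + alignment - 1) / alignment * alignment;
}

// A device reporting 1024 bits needs 128-byte alignment, not 1024-byte.
static_assert(example_align_bits_to_bytes(1024) == 128, "bits to bytes");

// If the alignment exceeds the 0x1000 base, rounding up skips 0x1000 bytes
// of the buffer, so less space is available than was reserved.
static_assert(example_align_up(0x1000, 0x2000) - 0x1000 == 4096, "space lost to padding");

// With an alignment that divides the base, nothing is lost.
static_assert(example_align_up(0x1000, 128) == 0x1000, "base already aligned");
```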

b4833

Verified

opencl : fix profile-related errors (ggml-org#12095)

Co-authored-by: ubuntu <[email protected]>