gguf-py : fix and simplify quantized shape round-trip #7483

compilade · 2024-05-23T04:06:50Z

#7234 has broken quantized tensor copy in gguf-new-metadata.py. (thanks @CISC for finding this! ref: #7234 (comment))

This was originally reported for IQ4_NL, but I think it affects all quantized tensor types.

(Converted models are fine, no worries. This fixes a crash of gguf-new-metadata.py)

Summary of changes

- Add quant_shape_from_byte_shape and quant_shape_to_byte_shape to convert between element shapes and byte shapes.
- GGUFReader now reshapes the NumPy array in each ReaderTensor to the shape GGUFWriter expects to receive.
- The shape field of each ReaderTensor is left unchanged to avoid changing the behavior of gguf-dump.py.
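
As a rough illustration of the two shape helpers named above, here is a minimal self-contained sketch. The function names come from this PR, but the bodies are a reconstruction and not necessarily the merged code; the `QUANT_SIZES` table is a hypothetical stand-in for the size table in gguf-py, and it assumes shapes are in NumPy order, so only the last dimension is packed into blocks.

```python
# Hypothetical stand-in for gguf-py's per-type size table:
# type name -> (elements per block, bytes per block).
QUANT_SIZES = {
    "Q8_0": (32, 34),    # 32 int8 values + one 2-byte fp16 scale
    "IQ4_NL": (32, 18),  # 32 4-bit indices (16 bytes) + one 2-byte fp16 scale
}


def quant_shape_to_byte_shape(shape, quant_type):
    """Element shape -> shape of the raw uint8 array holding the blocks."""
    block_size, type_size = QUANT_SIZES[quant_type]
    if shape[-1] % block_size != 0:
        raise ValueError(f"last dim {shape[-1]} is not a multiple of {block_size}")
    return (*shape[:-1], shape[-1] // block_size * type_size)


def quant_shape_from_byte_shape(shape, quant_type):
    """Shape of the raw uint8 array -> element shape (inverse of the above)."""
    block_size, type_size = QUANT_SIZES[quant_type]
    if shape[-1] % type_size != 0:
        raise ValueError(f"last dim {shape[-1]} is not a multiple of {type_size}")
    return (*shape[:-1], shape[-1] // type_size * block_size)


byte_shape = quant_shape_to_byte_shape((4096, 4096), "Q8_0")
elem_shape = quant_shape_from_byte_shape(byte_shape, "Q8_0")
```

Under these assumptions, a 4096 x 4096 Q8_0 tensor is stored as a 4096 x 4352 byte array, and converting back recovers the element shape.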

Testing

Q8_0
- @compilade I've tested a round-trip of a Q8_0 bloom model when adding general.description with gguf-new-metadata.py, then removing it, and the resulting model file has the same checksum as the original model.
IQ4_NL
- @compilade Again, a round-trip of bloom, but this time with IQ4_NL. The checksums match.
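
The checksum comparison used in these tests could be scripted roughly as follows. This is only a sketch: the helper names and the file paths in the usage note are hypothetical, and SHA-256 is an assumption, since the PR does not say which checksum tool was used.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_checksum(path_a, path_b):
    """True when both files have identical contents (by SHA-256)."""
    return file_sha256(path_a) == file_sha256(path_b)
```

For example, `same_checksum("bloom-q8_0.gguf", "bloom-q8_0-roundtrip.gguf")` would compare the original model with the file produced after adding and then removing general.description (both file names are placeholders).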

compilade · 2024-05-23T04:14:38Z

gguf-py/gguf/gguf_reader.py

@@ -251,6 +253,7 @@ def _build_tensors(self, start_offs: int, fields: list[ReaderField]) -> None:
            tensor_names.add(tensor_name)
            ggml_type = GGMLQuantizationType(raw_dtype[0])
            n_elems = int(np.prod(dims))
+            np_dims = tuple(reversed(dims.tolist()))


.tolist() is necessary here to avoid an error when reshaping afterwards, something about the shape being of type np.float64 for some reason.
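
A small sketch of what the `.tolist()` call buys, assuming `dims` is a NumPy array of `uint64` as read from the file. The values below are made up for illustration. The likely cause of the error (a guess, not something confirmed in this comment) is that under older NumPy promotion rules, arithmetic between an `np.uint64` scalar and a Python int can produce `float64`, which `reshape` does not accept as a dimension.

```python
import numpy as np

# Dimensions in GGUF order, as a NumPy array (illustrative values).
dims = np.array([4096, 32], dtype=np.uint64)

# .tolist() yields plain Python ints, so later integer arithmetic on the
# dimensions stays in Python ints regardless of the NumPy version.
np_dims = tuple(reversed(dims.tolist()))

# Q8_0 packs 32 elements into 34 bytes (assumed here for the example).
block_size, type_size = 32, 34
byte_shape = (*np_dims[:-1], np_dims[-1] // block_size * type_size)

# Reshape a flat byte buffer of the matching size into the byte shape.
raw = np.zeros(byte_shape[0] * byte_shape[1], dtype=np.uint8)
data = raw.reshape(byte_shape)
```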

gguf-py : fix and simplify quantized shape round-trip

2ff601f

compilade added the bugfix, Review Complexity : Low, and python labels May 23, 2024

compilade mentioned this pull request May 23, 2024

convert-hf : support direct Q8_0 conversion #7234

Merged

13 tasks

gguf-py : remove unused import

c5fe1d6

compilade commented May 23, 2024


ggerganov approved these changes May 23, 2024


compilade added the merge ready label May 23, 2024

mofosyne merged commit b83bab1 into master May 25, 2024
11 of 22 checks passed

compilade mentioned this pull request Jul 27, 2024

Embed files #8121

Open

4 tasks
