Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gguf-py : fix and simplify quantized shape round-trip #7483

Merged
merged 2 commits into from
May 25, 2024

Conversation

compilade
Copy link
Collaborator

@compilade compilade commented May 23, 2024

#7234 has broken quantized tensor copy in gguf-new-metadata.py. (thanks @CISC for finding this! ref: #7234 (comment))

This was originally reported for IQ4_NL, but I think it affects all quantized tensor types.

(Converted models are fine, no worries. This fixes a crash of gguf-new-metadata.py)

Summary of changes

  • add quant_shape_from_byte_shape and quant_shape_to_byte_shape to convert between shapes
  • GGUFReader reshapes the Numpy array in each ReaderTensor to the shape GGUFWriter expects to receive
  • The shape of the ReaderTensors are left unchanged to avoid changing the behavior of gguf-dump.py

Testing

  • Q8_0
    • @compilade I've tested a round-trip of a Q8_0 bloom model when adding general.description with gguf-new-metadata.py, then removing it, and the resulting model file has the same checksum as the original model.
  • IQ4_NL
    • @compilade Again, a round-trip of bloom, but this time with IQ4_NL. The checksums match.

@compilade compilade added bugfix fixes an issue or bug Review Complexity : Low Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix python python script changes labels May 23, 2024
@@ -251,6 +253,7 @@ def _build_tensors(self, start_offs: int, fields: list[ReaderField]) -> None:
tensor_names.add(tensor_name)
ggml_type = GGMLQuantizationType(raw_dtype[0])
n_elems = int(np.prod(dims))
np_dims = tuple(reversed(dims.tolist()))
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.tolist() is necessary here to avoid an error when reshaping afterwards, something about the shape being of type np.float64 for some reason.

@compilade compilade added the merge ready indicates that this may be ready to merge soon and is just holding out in case of objections label May 23, 2024
@mofosyne mofosyne merged commit b83bab1 into master May 25, 2024
11 of 22 checks passed
@compilade compilade mentioned this pull request Jul 27, 2024
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix fixes an issue or bug merge ready indicates that this may be ready to merge soon and is just holding out in case of objections python python script changes Review Complexity : Low Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants