
ggml-cuda : add TQ2_0 kernels, for ternary inference on GPU #11183

Open · wants to merge 7 commits into base: master
ggml-metal : supports_op returns false for ternary types
Maybe not the cleanest way, but hopefully temporary.
compilade committed Jan 12, 2025
commit b6fc9f03ab532e948458a93db841e6b87727bd9d
12 changes: 12 additions & 0 deletions ggml/src/ggml-metal/ggml-metal.m
@@ -1081,6 +1081,18 @@ static bool ggml_metal_supports_op(const struct ggml_backend_metal_device_context
}
}
}
    // TODO: remove once proper support is added.
    for (size_t i = 0, n = 3; i < n; ++i) {
        if (op->src[i] != NULL) {
            switch (op->src[i]->type) {
                case GGML_TYPE_TQ1_0:
                case GGML_TYPE_TQ2_0:
                    return false;
                default:
                    break;
            }
        }
    }

switch (op->op) {
case GGML_OP_UNARY: