Hello, I'm trying to create a project like llama.cpp or whisper.cpp for a different neural network. The first part of the network is a Linear layer, and my input is a 2D tensor.
On the GGML side, I imitated the whisper.cpp pt-to-ggml script to convert the PyTorch .pth files to the ggml format. Then I imitated whisper.cpp to load the weights from that file into a tensor. Saving to a ggml bin and loading it back both seem to work.

Converting pth to ggml format:
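The conversion script itself didn't survive here; below is a minimal sketch of the whisper.cpp-style on-disk layout (magic number, then per-tensor records of header ints, dims, name, and raw f32 data). The `fc1.weight` name and the `state_dict` shape are hypothetical, and a real script would pull the tensors out of `torch.load(...)` on the .pth checkpoint rather than a plain dict; whisper.cpp's real converter also writes hparams and vocab first, which this sketch omits.

```python
import struct

GGML_MAGIC = 0x67676D6C  # the "ggml" magic number written at the start of the file

def write_tensor(fout, name, shape, data):
    # whisper.cpp-style record: (n_dims, len(name), ftype=0 for f32),
    # then the dims (innermost first, as ggml stores them),
    # the UTF-8 name bytes, and the raw little-endian f32 values
    fout.write(struct.pack("iii", len(shape), len(name), 0))
    for dim in reversed(shape):
        fout.write(struct.pack("i", dim))
    fout.write(name.encode("utf-8"))
    fout.write(struct.pack(f"<{len(data)}f", *data))

def convert(state_dict, fout):
    # state_dict maps tensor names to (shape, flat float list); in a real
    # script these would come from the PyTorch checkpoint's state_dict
    fout.write(struct.pack("i", GGML_MAGIC))
    for name, (shape, data) in state_dict.items():
        write_tensor(fout, name, shape, data)
```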
Loading ggml format in my main program:
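The loading code block is also missing; whisper.cpp does this in C++ with `ggml_new_tensor_*` and `fin.read(...)`, but as an illustration of the same record layout, here is a Python sketch that parses the format the converter above emits (a reference for checking byte offsets, not the real loader):

```python
import struct

def read_tensor(fin):
    # read one tensor record: header (n_dims, name length, ftype),
    # then dims, name, and the raw f32 payload
    head = fin.read(12)
    if len(head) < 12:
        return None  # end of file
    n_dims, name_len, ftype = struct.unpack("iii", head)
    dims = struct.unpack(f"{n_dims}i", fin.read(4 * n_dims))
    name = fin.read(name_len).decode("utf-8")
    n_elems = 1
    for d in dims:
        n_elems *= d
    data = struct.unpack(f"<{n_elems}f", fin.read(4 * n_elems))
    return name, dims, data

def load_model(fin):
    # verify the magic, then read tensor records until EOF
    magic, = struct.unpack("i", fin.read(4))
    assert magic == 0x67676D6C, "invalid ggml magic"
    tensors = {}
    while (t := read_tensor(fin)) is not None:
        name, dims, data = t
        tensors[name] = (dims, data)
    return tensors
```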
Now I reach the part of my code where I want to run inference for this Linear layer + BatchNorm1d:
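The inference code block is missing here; as a numerical reference for what the ggml graph should compute, this is a pure-Python sketch of a Linear layer followed by inference-mode BatchNorm1d (the layer names, shapes, and parameter order follow PyTorch conventions and are assumptions, not the original code):

```python
import math

def linear(W, b, x):
    # x: list of frames, each a list of n_in features
    # W: n_out rows of n_in weights; b: n_out biases (PyTorch Linear layout)
    return [[sum(w * v for w, v in zip(row, frame)) + bias
             for row, bias in zip(W, b)]
            for frame in x]

def batchnorm1d(y, gamma, beta, mean, var, eps=1e-5):
    # inference-mode BatchNorm1d: normalize each feature with the stored
    # running mean/var, then scale by gamma and shift by beta
    return [[(v - m) / math.sqrt(s + eps) * g + bt
             for v, m, s, g, bt in zip(frame, mean, var, gamma, beta)]
            for frame in y]
```

Running a few frames through this and comparing against `ggml_get_data_f32(...)` on the graph output is a cheap way to confirm the ggml side is wired up correctly.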
The print statements of the number of elements are:
How can I multiply the input tensor by the Linear layer's weights?
Replies: 1 comment
Change the order of the `input` dimensions like this:

```c
struct ggml_tensor *input = ggml_new_tensor_2d(model.ctx, GGML_TYPE_F32, 2974, nb_frames);
```

Use `ggml_mul_mat()` like this:

```c
x = ggml_mul_mat(model.ctx, model.fc1_w[0], x);
```

This is a bit unusual compared to normal frameworks: `ggml_mul_mat()` technically computes zᵀ = x · yᵀ.

Hope this helps.
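To see why the weight tensor goes first, here is a tiny pure-Python model of `ggml_mul_mat()`'s convention (an illustrative sketch, not the real kernel): both operands must share their innermost length `ne[0]`, and each output row is built by dotting a row of the second operand against every row of the first.

```python
def mul_mat(a, b):
    # mirrors ggml_mul_mat(ctx, a, b): a and b share their innermost
    # length; out[i][j] = dot(b row i, a row j), i.e. out = b · aᵀ
    assert len(a[0]) == len(b[0])
    return [[sum(x * y for x, y in zip(a_row, b_row)) for a_row in a]
            for b_row in b]

# For a Linear layer: W has n_out rows of n_in weights (PyTorch layout)
# and x holds nb_frames rows of n_in features, so mul_mat(W, x) yields
# nb_frames rows of n_out outputs -- matching ggml_mul_mat(ctx, fc1_w, x)
W = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # n_out = 3, n_in = 2
x = [[1.0, 1.0]]                          # nb_frames = 1
print(mul_mat(W, x))  # -> [[3.0, 7.0, 11.0]]
```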