Hello, I'm trying to create a project like llama.cpp or whisper.cpp for a different neural network. The first part of the network is a Linear layer, and my input is a 2D tensor.
On the GGML side, I imitated the whisper.cpp pt-to-ggml script to convert the PyTorch .pth files to the ggml format. Then I imitated whisper.cpp to load the weights from that file into a tensor. Saving to a ggml bin and loading it back both seem to work.

Converting pth to ggml format:
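The conversion script itself didn't survive here; below is a minimal sketch of the whisper.cpp-style on-disk layout (magic number, then per-tensor records of header ints, dims, name, and raw f32 data). The `fc1.weight` name and the `state_dict` shape are hypothetical, and a real script would pull the tensors out of `torch.load(...)` on the .pth checkpoint rather than a plain dict; whisper.cpp's real converter also writes hparams and vocab first, which this sketch omits.

```python
import struct

GGML_MAGIC = 0x67676D6C  # the "ggml" magic number written at the start of the file

def write_tensor(fout, name, shape, data):
    # whisper.cpp-style record: (n_dims, len(name), ftype=0 for f32),
    # then the dims (innermost first, as ggml stores them),
    # the UTF-8 name bytes, and the raw little-endian f32 values
    fout.write(struct.pack("iii", len(shape), len(name), 0))
    for dim in reversed(shape):
        fout.write(struct.pack("i", dim))
    fout.write(name.encode("utf-8"))
    fout.write(struct.pack(f"<{len(data)}f", *data))

def convert(state_dict, fout):
    # state_dict maps tensor names to (shape, flat float list); in a real
    # script these would come from the PyTorch checkpoint's state_dict
    fout.write(struct.pack("i", GGML_MAGIC))
    for name, (shape, data) in state_dict.items():
        write_tensor(fout, name, shape, data)
```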
Loading ggml format in my main program:
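The loading code block is also missing; whisper.cpp does this in C++ with `ggml_new_tensor_*` and `fin.read(...)`, but as an illustration of the same record layout, here is a Python sketch that parses the format the converter above emits (a reference for checking byte offsets, not the real loader):

```python
import struct

def read_tensor(fin):
    # read one tensor record: header (n_dims, name length, ftype),
    # then dims, name, and the raw f32 payload
    head = fin.read(12)
    if len(head) < 12:
        return None  # end of file
    n_dims, name_len, ftype = struct.unpack("iii", head)
    dims = struct.unpack(f"{n_dims}i", fin.read(4 * n_dims))
    name = fin.read(name_len).decode("utf-8")
    n_elems = 1
    for d in dims:
        n_elems *= d
    data = struct.unpack(f"<{n_elems}f", fin.read(4 * n_elems))
    return name, dims, data

def load_model(fin):
    # verify the magic, then read tensor records until EOF
    magic, = struct.unpack("i", fin.read(4))
    assert magic == 0x67676D6C, "invalid ggml magic"
    tensors = {}
    while (t := read_tensor(fin)) is not None:
        name, dims, data = t
        tensors[name] = (dims, data)
    return tensors
```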
Now I reach the part of my code where I want to run inference for this Linear layer + BatchNorm1d:
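The inference code block is missing here; as a numerical reference for what the ggml graph should compute, this is a pure-Python sketch of a Linear layer followed by inference-mode BatchNorm1d (the layer names, shapes, and parameter order follow PyTorch conventions and are assumptions, not the original code):

```python
import math

def linear(W, b, x):
    # x: list of frames, each a list of n_in features
    # W: n_out rows of n_in weights; b: n_out biases (PyTorch Linear layout)
    return [[sum(w * v for w, v in zip(row, frame)) + bias
             for row, bias in zip(W, b)]
            for frame in x]

def batchnorm1d(y, gamma, beta, mean, var, eps=1e-5):
    # inference-mode BatchNorm1d: normalize each feature with the stored
    # running mean/var, then scale by gamma and shift by beta
    return [[(v - m) / math.sqrt(s + eps) * g + bt
             for v, m, s, g, bt in zip(frame, mean, var, gamma, beta)]
            for frame in y]
```

Running a few frames through this and comparing against `ggml_get_data_f32(...)` on the graph output is a cheap way to confirm the ggml side is wired up correctly.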
The print statements of the number of elements are:
How can I multiply the input tensor by the Linear layer's weights?
Replies: 1 comment
Change the order of the `input` dimensions like this:

```c
struct ggml_tensor *input = ggml_new_tensor_2d(model.ctx, GGML_TYPE_F32, 2974, nb_frames);
```

Use `ggml_mul_mat()` like this:

```c
x = ggml_mul_mat(model.ctx, model.fc1_w[0], x);
```

This is a bit unusual compared to normal frameworks: `ggml_mul_mat()` technically computes zᵀ = x · yᵀ.

Hope this helps.
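To see why the weight tensor goes first, here is a tiny pure-Python model of `ggml_mul_mat()`'s convention (an illustrative sketch, not the real kernel): both operands must share their innermost length `ne[0]`, and each output row is built by dotting a row of the second operand against every row of the first.

```python
def mul_mat(a, b):
    # mirrors ggml_mul_mat(ctx, a, b): a and b share their innermost
    # length; out[i][j] = dot(b row i, a row j), i.e. out = b · aᵀ
    assert len(a[0]) == len(b[0])
    return [[sum(x * y for x, y in zip(a_row, b_row)) for a_row in a]
            for b_row in b]

# For a Linear layer: W has n_out rows of n_in weights (PyTorch layout)
# and x holds nb_frames rows of n_in features, so mul_mat(W, x) yields
# nb_frames rows of n_out outputs -- matching ggml_mul_mat(ctx, fc1_w, x)
W = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # n_out = 3, n_in = 2
x = [[1.0, 1.0]]                          # nb_frames = 1
print(mul_mat(W, x))  # -> [[3.0, 7.0, 11.0]]
```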