Update readme with API and batching changes
skeskinen committed May 1, 2023
1 parent 96514fb commit 6ca5c6c
Showing 3 changed files with 3 additions and 4 deletions.
6 changes: 2 additions & 4 deletions README.md
@@ -1,7 +1,7 @@
# bert.cpp

[ggml](https://github.com/ggerganov/ggml) inference of BERT neural net architecture with pooling and normalization from [SentenceTransformers (sbert.net)](https://sbert.net/).
-High quality sentence embeddings in pure C++ (or C).
+High quality sentence embeddings in pure C++ (with C API).

## Description
The main goal of `bert.cpp` is to run the BERT model using 4-bit integer quantization on CPU
@@ -15,12 +15,10 @@ The main goal of `bert.cpp` is to run the BERT model using 4-bit integer quantiz

## Limitations & TODO
* Tokenizer doesn't correctly handle asian writing (CJK, maybe others)
-* Inputs longer than ctx size are not truncated. If you are trying to make embeddings for longer texts make sure to truncate.
* bert.cpp doesn't respect tokenizer, pooling or normalization settings from the model card:
    * All inputs are lowercased and trimmed
    * All outputs are mean pooled and normalized
-* The API is in C++ (uses things from std::)
-* Doesn't support batching, which means it's slower than it could be in usecases where you have multiple sentences
+* Batching support is WIP. Lack of real batching means that this library is slower than it could be in usecases where you have multiple sentences
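
The README says that all outputs are mean pooled and normalized. As a rough illustration of what those two steps mean, here is a minimal sketch. The helper names `mean_pool` and `l2_normalize` are hypothetical and are not taken from the bert.cpp sources. The sketch assumes every token vector has the same length and that "normalized" means scaling to unit L2 norm, as is common for sentence embeddings.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of bert.cpp): average the per-token vectors
// into a single sentence vector. Assumes all token vectors share one length.
std::vector<float> mean_pool(const std::vector<std::vector<float>>& token_embeddings) {
    if (token_embeddings.empty()) return {};
    std::vector<float> pooled(token_embeddings.front().size(), 0.0f);
    for (const auto& tok : token_embeddings) {
        assert(tok.size() == pooled.size());  // guard the equal-length assumption
        for (std::size_t i = 0; i < pooled.size(); ++i) pooled[i] += tok[i];
    }
    for (float& v : pooled) v /= static_cast<float>(token_embeddings.size());
    return pooled;
}

// Hypothetical helper: scale a vector in place to unit L2 norm.
// A zero vector is left unchanged to avoid dividing by zero.
void l2_normalize(std::vector<float>& v) {
    double sum_sq = 0.0;
    for (float x : v) sum_sq += static_cast<double>(x) * x;
    if (sum_sq == 0.0) return;
    const float inv_norm = static_cast<float>(1.0 / std::sqrt(sum_sq));
    for (float& x : v) x *= inv_norm;
}
```

With unit-length embeddings, the cosine similarity of two sentences reduces to a plain dot product, which is one reason to normalize at this stage.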

## Usage

Binary file removed examples/a.out
Binary file not shown.
1 change: 1 addition & 0 deletions examples/dylib.cpp
@@ -41,6 +41,7 @@ class BertModel {
 int main() {
     BertModel model("../models/all-MiniLM-L6-v2/ggml-model-f16.bin");
     /*
+    Potential api, not implemented:
     auto embeddings = model.encode("siikahan se siellä");
     for (auto embedding : embeddings) {
         std::cout << embedding << " ";
