diff --git a/README.md b/README.md
index 63bb8f5..d49f479 100644
--- a/README.md
+++ b/README.md
@@ -1,7 +1,7 @@
 # bert.cpp
 
 [ggml](https://github.com/ggerganov/ggml) inference of BERT neural net architecture with pooling and normalization from [SentenceTransformers (sbert.net)](https://sbert.net/).
-High quality sentence embeddings in pure C++ (or C).
+High quality sentence embeddings in pure C++ (with C API).
 
 ## Description
 The main goal of `bert.cpp` is to run the BERT model using 4-bit integer quantization on CPU
@@ -15,12 +15,10 @@ The main goal of `bert.cpp` is to run the BERT model using 4-bit integer quantiz
 
 ## Limitations & TODO
 * Tokenizer doesn't correctly handle asian writing (CJK, maybe others)
-* Inputs longer than ctx size are not truncated. If you are trying to make embeddings for longer texts make sure to truncate.
 * bert.cpp doesn't respect tokenizer, pooling or normalization settings from the model card:
   * All inputs are lowercased and trimmed
   * All outputs are mean pooled and normalized
-* The API is in C++ (uses things from std::)
-* Doesn't support batching, which means it's slower than it could be in usecases where you have multiple sentences
+* Batching support is WIP. Lack of real batching means that this library is slower than it could be in usecases where you have multiple sentences
 
 ## Usage
 
diff --git a/examples/a.out b/examples/a.out
deleted file mode 100755
index 26ac19c..0000000
Binary files a/examples/a.out and /dev/null differ
diff --git a/examples/dylib.cpp b/examples/dylib.cpp
index b786814..047ad9f 100644
--- a/examples/dylib.cpp
+++ b/examples/dylib.cpp
@@ -41,6 +41,7 @@ class BertModel {
 int main() {
     BertModel model("../models/all-MiniLM-L6-v2/ggml-model-f16.bin");
     /*
+    Potential api, not implemented:
     auto embeddings = model.encode("siikahan se siellä");
     for (auto embedding : embeddings) {
         std::cout << embedding << " ";
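
The README hunk above notes that all outputs are "mean pooled and normalized", the SentenceTransformers convention. A minimal standalone sketch of what that post-processing step means (the function name `mean_pool_normalize` is hypothetical; bert.cpp's actual pooling happens inside its ggml compute graph, not via this helper):

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Collapse per-token embeddings into one sentence vector by averaging
// across tokens, then L2-normalize so the result has unit length.
// Illustrative only: a sketch of the pooling SentenceTransformers describes,
// not bert.cpp's real implementation.
std::vector<float> mean_pool_normalize(const std::vector<std::vector<float>>& tokens) {
    if (tokens.empty()) return {};
    const size_t dim = tokens[0].size();
    std::vector<float> out(dim, 0.0f);

    // Mean pooling: element-wise average over all token embeddings.
    for (const auto& tok : tokens)
        for (size_t i = 0; i < dim; ++i)
            out[i] += tok[i];

    float norm_sq = 0.0f;
    for (size_t i = 0; i < dim; ++i) {
        out[i] /= static_cast<float>(tokens.size());
        norm_sq += out[i] * out[i];
    }

    // L2 normalization: divide by the vector's Euclidean norm.
    const float norm = std::sqrt(norm_sq);
    if (norm > 0.0f)
        for (size_t i = 0; i < dim; ++i)
            out[i] /= norm;
    return out;
}
```

Unit-normalized outputs mean cosine similarity between two sentence embeddings reduces to a plain dot product, which is why this convention is common for retrieval use cases.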