Idea: Use array type for embedding speed-up #2059
Comments
I have not tested with the array type, but I'm also curious whether it could provide similar improvements. Would you be able to share some examples of what the changes we'd need to make would look like?
Looking at what numpy is used for in the embeddings type: it takes the base64-decoded bytes object as a buffer (a reference, not a copy), interprets it as a compact one-dimensional float32 array in numpy, and then converts that back out to a list of native floats. You can do the same with the builtin array type:
embedding.embedding = array.array("f", base64.b64decode(data)).tolist()
I have submitted this in a draft PR.
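To make the decode step concrete, here is a minimal round-trip sketch. The sample values and the way the payload is built are assumptions for illustration only; the decode line is the one proposed in the comment above.

```python
import array
import base64
import struct

# Hypothetical sample values; each one is exactly representable as float32.
values = [0.5, -1.25, 3.0, 0.0]

# Build a stand-in payload: native-order float32 bytes, base64-encoded.
payload = base64.b64encode(struct.pack(f"={len(values)}f", *values)).decode("ascii")

# Decode: base64 text -> bytes -> float32 array -> list of Python floats.
decoded = array.array("f", base64.b64decode(payload)).tolist()
print(decoded)  # [0.5, -1.25, 3.0, 0.0]
```

Note that `array.array("f", ...)` reads the bytes in the machine's native byte order, which is also what `np.frombuffer(..., dtype="float32")` does.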
Benchmark:
```python
import array
import base64
import numpy as np
import json

# Sample data (base64-encoded float32 embedding; left empty in the original post)
data = ''
as_json = json.dumps(array.array("f", base64.b64decode(data)).tolist())


def bench_standard():
    # Baseline: parse the embedding from a JSON list of floats
    for _ in range(1000):
        json.loads(as_json)


def bench_array():
    # Builtin array type: decode base64, reinterpret bytes as float32
    for _ in range(1000):
        array.array("f", base64.b64decode(data)).tolist()


def bench_numpy():
    # numpy: same decode via frombuffer
    for _ in range(1000):
        np.frombuffer(  # type: ignore[no-untyped-call]
            base64.b64decode(data), dtype="float32"
        ).tolist()


__benchmarks__ = [
    (bench_standard, bench_array, "Standard vs array"),
    (bench_standard, bench_numpy, "Standard vs numpy"),
    (bench_numpy, bench_array, "Array vs numpy"),
]
```
Results show this array approach is equivalent to the numpy one (10-20% faster) and is significantly faster than the standard approach (10x).
Sorry, I was reading my own benchmark data wrong. It's faster than numpy.
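Since the sample data is empty in the benchmark as posted, a comparison along the same lines can be sketched with synthetic data and the standard library's `timeit`. The embedding length of 1536 and the random values are assumptions chosen for illustration, and numpy is left out so the sketch has no third-party dependency. Timings will vary by machine, so none are shown.

```python
import array
import base64
import json
import random
import timeit

random.seed(0)
DIMENSIONS = 1536  # assumed embedding length, for illustration only

# Synthetic float32 embedding, serialized both ways.
floats = array.array("f", (random.uniform(-1.0, 1.0) for _ in range(DIMENSIONS)))
data = base64.b64encode(floats.tobytes()).decode("ascii")
as_json = json.dumps(floats.tolist())


def decode_json():
    # Baseline: parse a JSON list of floats
    return json.loads(as_json)


def decode_array():
    # Builtin array type: base64 -> bytes -> float32 -> list of floats
    return array.array("f", base64.b64decode(data)).tolist()


# Both paths should yield identical values before we compare their speed.
assert decode_json() == decode_array()

for name, func in (("json", decode_json), ("array", decode_array)):
    seconds = timeit.timeit(func, number=1000)
    print(f"{name}: {seconds:.4f} s per 1000 decodes")
```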
Benchmark comparing the pydantic parser, which openai uses, to the array and numpy approaches:
```python
import array
import base64
import numpy as np
import json
import pydantic

# Sample data (base64-encoded float32 embedding; left empty in the original post)
data = ''


class Float32Array(pydantic.BaseModel):
    data: list[float]


as_json = json.dumps({"data": array.array("f", base64.b64decode(data)).tolist()})


def bench_standard():
    # Baseline: validate the JSON payload with pydantic
    for _ in range(1000):
        Float32Array.model_validate_json(as_json)


def bench_array():
    # Builtin array type: decode base64, reinterpret bytes as float32
    for _ in range(1000):
        array.array("f", base64.b64decode(data)).tolist()


def bench_numpy():
    # numpy: same decode via frombuffer
    for _ in range(1000):
        np.frombuffer(  # type: ignore[no-untyped-call]
            base64.b64decode(data), dtype="float32"
        ).tolist()


__benchmarks__ = [
    (bench_standard, bench_array, "Standard vs array"),
    (bench_standard, bench_numpy, "Standard vs numpy"),
    (bench_numpy, bench_array, "numpy vs array"),
]
```
If this approach were the new default, it would be 4x faster than the current pydantic parser and 20% faster than the numpy decoder.
Closing this as the PR was merged, thanks!
Confirm this is a feature request for the Python library and not the underlying OpenAI API.
Describe the feature or improvement you're requesting
The SDK currently uses numpy to speed up embedding:
It does seem to improve performance, based on our tests, but we were wondering whether similar gains could be made without numpy, using the built-in array type. Have you tried that already?
https://docs.python.org/3/library/array.html
We're having some pains with the numpy dependency for our Azure samples and are looking for ways to move off it without affecting performance.
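As a rough sketch of what a numpy-free decoder could look like, the helper below uses only the standard library. The name `decode_embedding` is hypothetical, and the code assumes the payload is base64-encoded little-endian float32; neither is confirmed by the SDK source here.

```python
import array
import base64
import struct
import sys


def decode_embedding(b64_data: str) -> list[float]:
    """Hypothetical helper: decode base64 float32 data without numpy."""
    raw = base64.b64decode(b64_data)
    floats = array.array("f")
    # A float32 payload must be a whole number of 4-byte items.
    if len(raw) % floats.itemsize != 0:
        raise ValueError("payload length is not a multiple of the float32 size")
    floats.frombytes(raw)
    # array uses native byte order; swap on big-endian hosts
    # (assumes the payload is little-endian).
    if sys.byteorder == "big":
        floats.byteswap()
    return floats.tolist()


# Usage with a small little-endian float32 payload built for illustration.
payload = base64.b64encode(struct.pack("<2f", 1.5, -2.0)).decode("ascii")
print(decode_embedding(payload))  # [1.5, -2.0]
```

The length check is there because `frombytes` would otherwise raise its own, less descriptive error on a truncated payload.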
Additional context
No response