Add concurrency.iterator_to_async, tests and docs #457
Conversation
Then you can use it with a `StreamingResponse`.
This is especially useful for synchronous [file-like](https://docs.python.org/3/glossary.html#term-file-like-object) or streaming objects, like those provided by cloud storage providers.
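(A minimal sketch of the documented usage above, assuming the `iterator_to_async` helper this PR proposes; the route, file name, and media type are only illustrative.)

```python
from starlette.applications import Starlette
from starlette.responses import StreamingResponse
from starlette.concurrency import iterator_to_async  # helper proposed in this PR

app = Starlette()


@app.route("/report")
async def report(request):
    # A plain, synchronous file-like object: iterating it directly would block
    # the event loop, so it is wrapped into an async iterator first.
    file_like = open("large-report.csv", "rb")
    return StreamingResponse(iterator_to_async(file_like), media_type="text/csv")
```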
@tomchristie This line might be too verbose, let me know if you want me to remove it (or anything else).
Okay - that's really interesting. At this point in time I'd like to close it off - we don't use it internally, and it's not yet sufficiently clear to me exactly what kind of third party libraries I'd expect it to integrate against. I'd be a bit wary of folks not understanding the use-case for it. Do you have something in particular that you're integrating against that you're using this approach with?
Yes, there are two specific use cases that I'm already using it in:

I'm integrating it with Minio (AWS S3 like): I get an object that I can iterate over, but it is not `async`. I checked the docs for Google Cloud Storage, AWS S3, and Azure; all of them allow having the stream be saved to a file-like object. Then that file-like object could be returned as a stream, without having to save it on disk. But these file-like objects are standard iterators, not async iterators. I think S3, Google Cloud Storage, and Azure Storage are probably going to be the main use cases for this functionality. So, returning one of these iterators (or file-like objects) in a `StreamingResponse` is the first use case.

The second use case is doing computer vision over images and sending them as a stream. But all these computer vision tools work with standard (non-async) streams/iterators, so I have to convert them to async iterators. In this case, I can't even do the "hack" of saving the file locally to return it later, as it's an infinite stream of frames in memory.

A third use case that I have thought about (but haven't implemented in a real project yet): storing some binary data in a database, maybe a NoSQL one like Couchbase or Mongo, or even Redis, then getting that binary data and streaming it directly. As the drivers/SDKs/connections for these tools still tend to be synchronous, it wouldn't be possible to take the stream in memory and return it in a streaming response directly.

Let me suggest/ask/propose: what if I implement a … Then, if you prefer, I can just document … This would also work for returning file-like objects. About …, I'm using Starlette's … I see this …

Let me know what you think.
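To make the Minio use case concrete, here is a rough sketch of the kind of endpoint described above, assuming the `iterator_to_async` helper from this PR; the endpoint, credentials, bucket, and object names are placeholders, and connection cleanup and error handling are omitted:

```python
from minio import Minio
from starlette.responses import StreamingResponse
from starlette.concurrency import iterator_to_async  # helper proposed in this PR

client = Minio("play.min.io", access_key="...", secret_key="...")


async def download(request):
    # get_object returns a synchronous, file-like HTTP response; its .stream()
    # method yields chunks, but only through a regular (blocking) iterator.
    obj = client.get_object("my-bucket", "my-object.bin")
    return StreamingResponse(
        iterator_to_async(obj.stream(32 * 1024)),
        media_type="application/octet-stream",
    )
```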
Okay, so something I'd be more inclined to accept would be... Have … (I'd also be interested in whether there are any nice ways we could use something like this and drop …)
It makes more sense to me to have it in, if we're also using it within …
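Assuming the internal use under discussion is `StreamingResponse` (the documentation line earlier in this thread pairs the helper with it), here is a sketch of what accepting plain iterators there could look like; this is a guess at the design, not the PR's actual code:

```python
import typing

from starlette import responses
from starlette.concurrency import iterator_to_async  # helper proposed in this PR


class StreamingResponse(responses.StreamingResponse):
    """Sketch: accept plain (sync) iterators in addition to async iterators."""

    def __init__(self, content: typing.Any, **kwargs: typing.Any) -> None:
        if not isinstance(content, typing.AsyncIterable):
            # A regular iterator/generator: iterate it in the threadpool
            # so the event loop is never blocked by a slow next() call.
            content = iterator_to_async(content)
        super().__init__(content, **kwargs)
```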
Awesome! I'm on it 😁 🤓
Add `concurrency.iterator_to_async`.

This is another utility function similar to `run_in_threadpool` (and uses it inside). It takes a normal (non-async) iterator and returns an async generator.

There are some libraries that return a stream object that can be iterated normally but not with `async`. Wrapping those objects with `iterator_to_async` would allow returning them in `StreamingResponse`s.

For example, file-like and similar objects, especially when they don't have a physical path that can be passed to `FileResponse`s. A notable example is the kinds of objects from cloud storage providers (Minio, Amazon S3, Google Cloud Storage, Azure Storage).
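For reference, a minimal sketch of how such a helper could be built on top of `run_in_threadpool`, as the description says; the exact signature and the sentinel-based handling of iterator exhaustion are assumptions rather than the PR's verbatim implementation:

```python
import typing

from starlette.concurrency import run_in_threadpool

_SENTINEL = object()  # marks iterator exhaustion across the thread boundary


def _next_or_sentinel(iterator: typing.Iterator) -> typing.Any:
    # Runs inside the threadpool. StopIteration can't propagate cleanly out of
    # a coroutine, so exhaustion is translated into a sentinel value instead.
    try:
        return next(iterator)
    except StopIteration:
        return _SENTINEL


async def iterator_to_async(iterable: typing.Iterable) -> typing.AsyncGenerator:
    # Pull items one at a time, each via the threadpool, and re-yield them
    # asynchronously so the result can feed a StreamingResponse.
    iterator = iter(iterable)
    while True:
        item = await run_in_threadpool(_next_or_sentinel, iterator)
        if item is _SENTINEL:
            break
        yield item
```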