Out of memory when running az storage blob upload #1105
Comments
Hi @sebasmannem, thank you for the issue report!
It appears this is a bug in the Python storage SDK: Azure/azure-storage-python#190. I've referenced your issue in that thread.
@troydai since you are working on the upload command, we should probably implement a workaround until this is fixed in the SDK: set max_connections to 1 if the file being uploaded is large, and possibly log a warning when we do so.
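A rough sketch of the kind of guard being proposed, assuming a hypothetical helper and size threshold (neither is the CLI's actual code):

```python
import logging

logger = logging.getLogger(__name__)

# Hypothetical threshold -- illustrative only, not what the CLI actually uses.
LARGE_FILE_THRESHOLD = 64 * 1024 * 1024  # 64 MiB

def effective_max_connections(file_size, requested_connections):
    """Clamp max_connections to 1 for large files so every chunk is not buffered in memory at once."""
    if file_size > LARGE_FILE_THRESHOLD and requested_connections > 1:
        logger.warning(
            "File is %d bytes; forcing max_connections=1 to work around "
            "Azure/azure-storage-python#190 (out-of-memory on large uploads).",
            file_size)
        return 1
    return requested_connections
```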
Updating sprint.
Workaround is in, so I'm moving this to the backlog. When the SDK is fixed, the workaround can be removed and this can be closed.
@tjprescott Any ETA on when the fix for this will be public on pypi? Also, won't it negatively impact performance significantly to only use edit: I just realized I confused myself a bit -- the workaround is in |
@mattchr the fix will be in for the January 14th release. It will slow performance for large files, but that's better than crashing due to running out of memory. When the storage SDK is fixed, we will remove the workaround.
This should be fixed; I'll take a look to make sure.
Was fixed with Azure/azure-storage-python#190.
I'm using az to upload an image to Azure.
I'm using python3, but have seen the same issue with python2.
The image is 30 G (although only 1.5 G sparse).
Eventually I managed to upload using --max-connections 1.
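For reference, the CLI flag maps onto the storage SDK's max_connections parameter. A hedged sketch of the equivalent direct SDK call, assuming the legacy BlockBlobService API and placeholder account details:

```python
from azure.storage.blob import BlockBlobService

# Placeholder credentials and names -- adjust for your own account.
service = BlockBlobService(account_name='myaccount', account_key='<key>')
service.create_blob_from_path(
    container_name='images',
    blob_name='disk.vhd',
    file_path='/data/disk.vhd',
    max_connections=1)  # one connection avoids the out-of-memory behaviour described below
```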
I traced the issue down to this little piece of code:
file: storage/blob/_upload_chunking.py, line:70
The problem is with the line that calls executor.map() with uploader.get_chunk_streams() as its argument: this creates a list of all the elements yielded by get_chunk_streams().
That list holds all 30 G of file data and is built in memory before the chunks are actually uploaded.
So, if you want to upload with max_connections > 1, you basically need (memory + swap) larger than the file you wish to upload...
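This follows from how concurrent.futures.Executor.map() works in the standard library: it submits a task for every item of its input iterable up front, so a generator of chunk buffers is fully consumed (and kept alive) before any upload completes. Below is a minimal sketch of a bounded alternative, with hypothetical read_chunks/upload_chunk helpers standing in for the SDK's internals:

```python
import concurrent.futures
import itertools

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MiB

def read_chunks(path):
    """Yield the file one chunk at a time instead of materialising it all."""
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(CHUNK_SIZE), b''):
            yield chunk

def upload_chunk(chunk):
    """Placeholder for the real per-chunk upload call; here it just reports the size."""
    return len(chunk)

def upload_bounded(path, max_connections=4):
    """Keep at most max_connections chunks in flight so memory use stays bounded."""
    chunks = read_chunks(path)
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_connections) as executor:
        # Prime the pool with one chunk per worker, then submit a new chunk only
        # as an old one finishes -- unlike executor.map(), which would consume
        # the whole generator (and therefore the whole file) immediately.
        in_flight = {executor.submit(upload_chunk, c)
                     for c in itertools.islice(chunks, max_connections)}
        while in_flight:
            done, in_flight = concurrent.futures.wait(
                in_flight, return_when=concurrent.futures.FIRST_COMPLETED)
            for finished in done:
                finished.result()  # propagate any upload error
                next_chunk = next(chunks, None)
                if next_chunk is not None:
                    in_flight.add(executor.submit(upload_chunk, next_chunk))
```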