Running requests in parallel from a zip archive can create race condition when unpacking the cacerts.pem file #5223
Comments
This can be reproduced by running pip in parallel too.
Genuinely, it'd be best if there was a way to not have to extract
To be more specific:
What about using an application data folder to extract this (no cleanup), and doing the write in a parallel-safe manner (mkdir with exist_ok)? This is how virtualenv handles this, and no one has complained yet about adding a few KB of files to the application data folder.
We've had multiple complaints just about the fact that we write to a directory. I think previously we used a more predictable (between Python interpreters) directory, and that got people riled up too. In general, people don't want us creating files at all. The 100% best solution is the standard library
Also, can you clarify your definition of "application data folder" here? I think it's what we've done before, and it opens us up to another utility in the user's space changing the file, thus allowing something to corrupt the trust store (which is very bad).
A user-writable application data folder is almost as easy to exploit as the user-writable temp folder we have now. That being said, I think the easiest solution for this issue is to put the extract operation in a try/except and silently swallow the exception if the file already exists. The race condition would no longer throw an exception and everything would work as expected.
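The try/except idea could be sketched roughly as below. This is an illustration only, not the code in requests: `extract_member_tolerant` is a hypothetical helper, and it assumes the race surfaces as a `FileExistsError` raised while the directory tree is being created. It retries once instead of only swallowing the error, on the assumption that the competing worker may not have finished writing the file yet.

```python
import os
import zipfile


def extract_member_tolerant(archive_path, member, dest_dir):
    """Extract one archive member, tolerating a directory-creation race.

    Hypothetical helper for illustration; not part of requests.
    """
    with zipfile.ZipFile(archive_path) as archive:
        try:
            return archive.extract(member, path=dest_dir)
        except FileExistsError:
            # Assumption: another thread or process created the same
            # directories first. They exist now, so a second attempt
            # skips directory creation and just writes the file.
            return archive.extract(member, path=dest_dir)
```

The return value is the path of the extracted file, as reported by `ZipFile.extract`.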
#5707 works around the issue by keeping the current solution but without the race condition imposed by os.makedirs.
@sigmavirus24 alternatively, we can export this into a context-manager-handled temporary file on a per
#5707) Extract also creates the folder hierarchy, however we do not need that, the file itself being extracted to a temporary folder is good enough. Instead we read the content of the zip and then write it. The write is not locked but it's OK to update the same file multiple times given the update operation will not alter the content of the file. By not creating the folder hierarchy (default via extract) we no longer can run into the problem of two parallel extracts both trying to create the folder hierarchy without exists ok flag, and one must fail. Resolves #5223. Signed-off-by: Bernát Gábor <[email protected]>
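The read-then-write approach described in the commit message above might look something like the following sketch. It is not the actual patch: `copy_member_to_temp` and its `dest_dir` parameter are hypothetical names chosen for illustration. The point it shows is that only the file's base name is used, so no folder hierarchy is ever created and the directory-creation race cannot occur.

```python
import os
import tempfile
import zipfile


def copy_member_to_temp(archive_path, member, dest_dir=None):
    """Copy one archive member into a directory as a flat file.

    Hypothetical helper for illustration; not the actual requests code.
    """
    if dest_dir is None:
        dest_dir = tempfile.gettempdir()
    # Keep only the file name so no nested folders have to be created.
    target = os.path.join(dest_dir, member.split("/")[-1])
    if not os.path.exists(target):
        with zipfile.ZipFile(archive_path) as archive:
            data = archive.read(member)
        # Unlocked write, as the commit message describes: concurrent
        # writers would all write the same bytes.
        with open(target, "wb") as handle:
            handle.write(data)
    return target
```

Because the write is unlocked, this sketch relies on the same assumption the commit message states: every writer produces identical content, so repeated writes do not change the file.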
We ran into a really crazy case, and I understand this is an edge case, but it might be worth fixing.
We started to see this backtrace in our CI:
After a lot of confusion I think I understand this bug now. We distribute our Python dependencies (including requests) as a pyz (created with zipapp), and the consumer in this case calls requests inside a ThreadPool. requests then has logic to unpack cacerts.pem into the temp directory, but there is no race protection, so our parallel threads stepped on each other's toes when unpacking this file.
We solved this with a simple get call before starting the parallel invocation. But I think it might be worth fixing because it's very confusing.
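The workaround described above, one serial call before the parallel ones, could be sketched like this. `run_with_warm_up` is a hypothetical helper, not something requests provides. The idea is that any one-time side effect of the first call (here, unpacking the certificate bundle) happens on a single thread before the pool starts.

```python
from multiprocessing.pool import ThreadPool


def run_with_warm_up(task, items, workers=4):
    """Run task on the first item serially, then on the rest in parallel.

    Hypothetical helper for illustration only.
    """
    items = list(items)
    if not items:
        return []
    # Single-threaded first call: one-time setup work happens here.
    first = task(items[0])
    with ThreadPool(workers) as pool:
        rest = pool.map(task, items[1:])
    return [first] + rest
```

With requests, this might be called as `run_with_warm_up(requests.get, urls)`, where `urls` stands for whatever list of URLs the caller already has.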
Expected Result
Not having the cacert being overwritten :)
Actual Result
Exception above
Reproduction Steps
Create a pyz with zipapp containing requests and its dependencies (certifi).
Note that since it's a race, it may or may not trigger.
System Information
This command is only available on Requests v2.16.4 and greater. Otherwise,
please provide some basic information about your system (Python version,
operating system, &c).