Support previously opened filehandles as input to xopen.open() #150
OK, thanks for the feedback. I didn't rename as this isn't my code :-) But I'll keep an eye on PR #148 and revise mine afterwards!
Hi @geertvandeweyer, PR #148 is now merged and, as expected, there are some conflicts. Let us know if you need help fixing them.
I've addressed most of the comments, I think. I'd welcome some feedback on a few issues though:
Currently the That said, I agree with you. The Python stdlib
Oh, I forgot to mention that I am willing to help with the tests as soon as you get it into a state that you feel is ready. Please ping me when this is the case.
OK, thanks for helping with the tests! If the setup of allowing both file (default) and filename (with a warning) is fine, then the code is ready afaik. I've tested what I can and it seems OK :-) If only one of filename or file should be supported, I can change the code again. Just let me know what you prefer.
I think we should not support both. That is unnecessary complexity. I see builtins.open uses However the compression modules, which offer an interface that is closer to xopen use And then there is of course the backwards compatibility argument that puts me in the Really, either of these names is fine, but since
It’s a little bit annoying that it’s called (Another thought: We could also turn it into a positional-only parameter. I don’t have a strong opinion either way. I guess
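The positional-only idea mentioned above could look roughly like the sketch below. `open_sketch` is a hypothetical stand-in, not xopen's actual signature; it only illustrates that a `/` in the parameter list (Python 3.8+) takes the argument's name out of the public API, which would make the file/filename naming question moot for callers.

```python
def open_sketch(filename, /, mode="r"):
    """Hypothetical open()-style function; not xopen's real signature."""
    return (filename, mode)


# The first argument can only be passed by position.
result = open_sketch("reads.fastq.gz", mode="rb")

# Passing it by keyword raises TypeError, whatever the parameter is called.
try:
    open_sketch(filename="reads.fastq.gz")
except TypeError as error:
    keyword_error = str(error)
```

With this design the parameter could be renamed later without breaking callers, since nobody can refer to it by name.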
Removed the file/filename in favor of filename only, as discussed.
Fixing the tests was easy enough, but the type system has a lot of complaints.
I am building on top of this PR in: https://github.com/pycompression/xopen/tree/acceptfilehandles The major change I made is that internally everything is now a binary stream. The pipedcompressionprogram now also works with reading streams: I created an extra thread that simply reads from the file and writes it to the stdin of the program. This has numerous advantages:
There are still some bugs I need to iron out, but it is getting there. Slowly. I really need to work on other things today as well, so I will leave it be for now.
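The feeder-thread approach described in this comment can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the linked branch: `pipe_through` and `decompress_stream` are hypothetical helpers, and a child Python interpreter stands in for an external compression program so that the example runs without extra tools installed.

```python
import gzip
import io
import shutil
import subprocess
import sys
import threading


def pipe_through(program_args, source):
    """Start program_args and feed it the bytes of the open binary stream `source`.

    Hypothetical helper, not xopen's implementation.
    Returns the running process and the feeder thread.
    """
    process = subprocess.Popen(
        program_args, stdin=subprocess.PIPE, stdout=subprocess.PIPE
    )

    def feed():
        try:
            # Leaving the with-block closes stdin, which signals end-of-input.
            with process.stdin:
                shutil.copyfileobj(source, process.stdin)
        except BrokenPipeError:
            pass  # the program exited before reading all of its input

    feeder = threading.Thread(target=feed, daemon=True)
    feeder.start()
    return process, feeder


def decompress_stream(source):
    """Decompress gzip data read from `source` via a child process."""
    # Assumption: a child Python process plays the role of the external program.
    child_code = (
        "import gzip, shutil, sys; "
        "shutil.copyfileobj(gzip.open(sys.stdin.buffer), sys.stdout.buffer)"
    )
    process, feeder = pipe_through([sys.executable, "-c", child_code], source)
    data = process.stdout.read()  # read while the feeder thread keeps writing
    feeder.join()
    process.stdout.close()
    process.wait()
    return data
```

Writing to the program's stdin from a separate thread while the caller reads its stdout avoids the deadlock that can occur when both pipes fill up in a single thread.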
Thanks for picking up on this. I'll leave it to you to finalize this then. Feel free to close this PR.
Done. See #152
Minimal changes were applied to support passing open binary filehandles to xopen.xopen().
The tests were successful for all supported compression types, for both reading and writing, but I might have missed some of the library dependencies.
The use case (which I tested) would be:
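import xopen
from smart_open import open as smart_open  # assumption: smart_open here is smart_open.open
f_smart = smart_open('s3://mybucket/myfile.fastq.gz', 'wb', compression='disable')
f = xopen.xopen(f_smart, mode='wb')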
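For readers without S3 access, the same pattern of layering compression on top of an already-open binary handle can be tried with the standard library alone. The sketch below uses `gzip.open`, which accepts a file object in place of a filename, and an `io.BytesIO` buffer as a stand-in for the remote handle. It illustrates the idea only and does not show how xopen handles it.

```python
import gzip
import io

# A BytesIO object stands in for any already-open binary file handle,
# for example a remote object opened by another library.
raw = io.BytesIO()

# gzip.open() accepts an existing file object in place of a filename.
with gzip.open(raw, mode="wb") as compressed:
    compressed.write(b"@read1\nACGT\n+\nIIII\n")

# Closing the gzip layer does not close the handle it was given,
# so the caller keeps ownership of `raw`.
still_open = not raw.closed

# Rewind and read the data back through a new gzip layer.
raw.seek(0)
with gzip.open(raw, mode="rb") as decompressed:
    text = decompressed.read()
```

Keeping the underlying handle open after the compression layer is closed matters for this use case, because the caller that opened the remote object is also the one responsible for closing it.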
I've noticed some mypy issues in the tox check. These are related to potentially conflicting object types. This could be fixed by explicitly renaming some variables to split the flows depending on the input type (afaik). Let me know whether you'd want this, or whether these issues can be ignored/whitelisted.
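One way the "split the flows" idea might look is sketched below. `open_binary_for_reading` and `FileOrPath` are hypothetical names, not part of xopen. The point is only that an `isinstance` check plus a separately named variable per branch gives a type checker one unambiguous type to work with in each flow.

```python
import io
import os
from typing import BinaryIO, Union

# Hypothetical alias for "a path or an already-open binary handle".
FileOrPath = Union[str, bytes, "os.PathLike[str]", BinaryIO]


def open_binary_for_reading(source: FileOrPath) -> BinaryIO:
    """Return a binary stream for `source` (hypothetical helper, not xopen code)."""
    if isinstance(source, (str, bytes, os.PathLike)):
        # Path flow: a dedicated variable holds the path, never a handle.
        path = os.fspath(source)
        return open(path, "rb")
    # File-handle flow: only the BinaryIO case remains at this point.
    handle: BinaryIO = source
    return handle
```

Reusing a single variable for both a path and a handle is a common source of "incompatible types in assignment" complaints, which is why the two branches use different names here.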