-
Notifications
You must be signed in to change notification settings - Fork 480
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MRG] add support for file_bytes argument with managed_file_context() #270
base: master
Are you sure you want to change the base?
Conversation
@@ -24,19 +27,33 @@ class PDFHandler(object): | |||
|
|||
Parameters | |||
---------- | |||
filepath : str | |||
Filepath or URL of the PDF file. | |||
filepath : str | pathlib.Path, optional (default: None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I elected to keep the arguments separate instead of combining them like pandas.read_csv
(or any of the others) do. Mostly to preserve the existing API kwargs
i.e. I did not want to rename this argument file_path_or_bytes
@@ -49,6 +66,28 @@ def __init__(self, filepath, pages="1", password=None): | |||
self.password = self.password.encode("ascii") | |||
self.pages = self._get_pages(pages) | |||
|
|||
@contextmanager |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the meat of it. Variably opens a file handle or passes the bytes through, depending on the case.
def download_url(url): | ||
"""Download file from specified URL. | ||
def get_url_bytes(url): | ||
"""Get a stream of bytes for url | ||
|
||
Parameters | ||
---------- | ||
url : str or unicode | ||
|
||
Returns | ||
------- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is only moderately involved with my feature, but this is an anti-pattern in my opinion.
with tempfile.NamedTemporaryFile("wb", delete=False) as f:
...
filepath = os.path.join(os.path.dirname(f.name), filename)
shutil.move(f.name, filepath)
Trying to maintain this file outside of the context manager provided by tempfile
is at best not the intention of the module. Using BytesIO is going to incur somewhat more memory strain in these cases, but I could easily see the existing implementation causing bugs in some cases (either now or in the future).
@@ -107,7 +146,7 @@ def _save_page(self, filepath, page, temp): | |||
Tmp directory. | |||
|
|||
""" | |||
with open(filepath, "rb") as fileobj: | |||
with self.managed_file_context() as fileobj: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a bug
I needed this functionality and since the original author hasn't yet merged it I cloned the repo and merged locally. Seems to be working as intended, thanks for contributing. |
Is this still stalled? Would love this functionality if still possible |
Unfortunately this repo looks to be somewhat abandoned, @vinayak-mehta has not merged any code since July 2021. It's a really useful tool, so it would be a shame to let it decay. Does anybody have interest in being a maintainer? |
Consider starting a discussion on gitter? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this cocntribution!!
Quick code review, without functionally testing it.
Would love to see if the tests are green.
@foarsitter Can you trigger the tests? :Pray: |
Any update? |
@cscanlin, by any change, do you have time to rebase this and run black/isort? |
Hey! As camelot is dead, we try to build a maintained fork at Do you want to open the PR against that branch so that we can merge your improvement? |
@MartinThoma Hi. Has anyone merged, or plan to do so, this PR on your fork? I could use this feature. I guess I could clone it and open the PR in your project. |
@Johnmaras Please go ahead and open a PR there. 🙂 |
Hi @bosd. Let me know. |
I'm a contributor / maintainer there. I can have a look to resolve the conflicts. ( May take some time, kinda busy lately) |
Understood. I'll proceed with the PR as soon as possible. |
Fixes #170 and #245