Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[cli] Uploads: add progress bar #2078

Merged
merged 3 commits into from
Dec 6, 2019
Merged

[cli] Uploads: add progress bar #2078

merged 3 commits into from
Dec 6, 2019

Conversation

julien-c
Copy link
Member

@julien-c julien-c commented Dec 6, 2019

see #2044 (comment) for context

There might be a more pythonic way (to do a "simple" method overriding) but I couldn't find it.

@codecov-io
Copy link

codecov-io commented Dec 6, 2019

Codecov Report

Merging #2078 into master will increase coverage by 0.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2078      +/-   ##
==========================================
+ Coverage   83.16%   83.18%   +0.01%     
==========================================
  Files         109      109              
  Lines       15858    15874      +16     
==========================================
+ Hits        13188    13204      +16     
  Misses       2670     2670
Impacted Files Coverage Δ
transformers/hf_api.py 97.5% <100%> (+0.62%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 35ff345...5543617. Read the comment docs.

@@ -148,6 +152,27 @@ def list_objs(self, token):



class ProgressFileReader:
Copy link
Contributor

@rlouf rlouf Dec 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely non-standard, but definitely clever :)

I don't think many people would understand this behavior, as you have to know values are passed by reference in Python, and be familiar with io.BufferedReader 's API. I would document the ProgressFileReader class to explain why we are doing this and why it works.

A more pythonic API would be to promote ProgressFileReader to a context manager and open the file in a context so the uploading code would look like:

with stream_with_progressbar(filepath, "rb") as f:
       r = requests.put(urls.write, data=f, headers={
              "content-type": urls.type,
       })
       r.raise_for_status()

Also has the merit of being self-documenting and hiding the complex logic.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could do a context manager, but I can't encapsulate f completely like in your code sample, (passing f to requests.put) as I don't really know which methods of the io.BufferedReader requests (and its underlying libraries) is going to call.

The cleanest way would be to subclass io.BufferedReader and instantiate an instance of the child class with the parent instance, but I don't know how to do that in Python.

Copy link
Member

@thomwolf thomwolf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it, I think it's fine.

Might even be reused in other locations of the library and examples later so maybe put it in file_utils.py and add it to __init__?

@@ -129,10 +130,13 @@ def presign_and_upload(self, token, filename, filepath):
# Even though we presign with the correct content-type,
# the client still has to specify it when uploading the file.
with open(filepath, "rb") as f:
pf = ProgressFileReader(f)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe call it TqdmProgressFileReader if it's included in file_utils?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed it, let's move it to file_utils if/when we need it somewhere else.

@julien-c julien-c merged commit e4679cd into master Dec 6, 2019
@julien-c julien-c deleted the cli_upload_progress branch December 6, 2019 16:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants