[Documentation] Display the download size for each dataset. #120
Comments
I can work on this issue. Can you please assign it to me? Also, please tell me what to do. Should I add a new column to the md file?
@dynamicwebpaige I also want to work on this issue.
@ParthS007 let's collaborate.
Hi,
Some pointers on this:
I would also like to contribute to this issue by warning users about the size of the dataset they want to download. Let's collaborate!
@anupam-tripathi Please note that the download size is already displayed when downloading a dataset: https://github.com/tensorflow/datasets/blob/master/tensorflow_datasets/core/dataset_builder.py#L387 Without downloading, the user should already be able to get the info with:
This issue is mostly to expose this information in the webpage doc.
I have generated new dataset documentation here. For some datasets I get a size of 0, and I cannot work out why. When I run `builder = tfds.builder('imagenet2012')` followed by `print(builder.info.size_in_bytes)` in Colab, it prints 0. Should I open a new pull request so that you can see the changes I made in
Oh, sorry about this. Yes, ImageNet was a bad example because it is not automatically downloaded (due to the ImageNet licence, it has to be manually downloaded by the user). Otherwise, it is possible that the most recent datasets do not have size information yet. We are pre-computing the
Yes, please generate a pull request with your changes. Also note that there is
@Conchylicultor I have generated a pull request here. Please check.
@Conchylicultor, I have also added the sizes in table form at the start of the docs. Please review my PR here. Thanks :)
Currently, the list of datasets available in TFDS gives no way to know the size of each dataset before downloading it. Users may be working with constrained disk space and should be told the size of a dataset before they request it.
This feature enhancement would detail the download size of each dataset on the markdown file referenced above.
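As a rough illustration of what such a size column could look like, here is a minimal sketch that turns a mapping of dataset names to byte counts into a markdown table. The helper names `format_size` and `render_size_table` and the sample entries are hypothetical, not part of TFDS; in practice the byte counts would presumably come from `builder.info.size_in_bytes`, as mentioned in the comments. A size of 0 is rendered as "unknown", since the thread notes that some datasets report 0.

```python
def format_size(num_bytes):
    """Render a byte count as a short human-readable string.

    A count of 0 (or None) is shown as "unknown", because some datasets
    report 0 when no size information has been computed.
    """
    if not num_bytes:
        return "unknown"
    units = ("B", "KiB", "MiB", "GiB", "TiB")
    size = float(num_bytes)
    for unit in units:
        # Stop at the first unit that fits, or at the largest unit.
        if size < 1024 or unit == units[-1]:
            break
        size /= 1024
    return f"{size:.0f} B" if unit == "B" else f"{size:.2f} {unit}"


def render_size_table(sizes):
    """Build a markdown table from a {dataset_name: size_in_bytes} dict."""
    lines = ["| Dataset | Download size |", "| --- | --- |"]
    for name in sorted(sizes):
        lines.append(f"| `{name}` | {format_size(sizes[name])} |")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder names and sizes for illustration only, not real datasets.
    hypothetical_sizes = {"example_small": 1536, "example_manual": 0}
    print(render_size_table(hypothetical_sizes))
```

Sorting by name keeps the generated table stable between documentation builds, which makes diffs easier to review.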