add a way to read data files from local path #451
Conversation
Hi,
Thanks for the PR!
I have a few comments, and I think there is a simpler way of addressing this.
Here is what I propose: I don't think there is much need for the preprocess function either, and the loading of the labels / targets could happen at every dataset initialization. It should be very fast, as we are now using numpy's frombuffer (which was not the case when we first implemented this function); see #334 for details.
What do you think?
```python
if download:
    self.download()
elif self.from_local:
```
```diff
@@ -36,15 +36,20 @@ class MNIST(data.Dataset):
     training_file = 'training.pt'
     test_file = 'test.pt'

-    def __init__(self, root, train=True, transform=None, target_transform=None, download=False):
+    def __init__(self, root, train=True, transform=None, target_transform=None,
+                 download=False, from_local=False):
```
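Based on this diff, the constructor branch might behave roughly as sketched below. This is a hypothetical standalone sketch of the proposed control flow, not the merged torchvision code; the class name, `raw_files` list, and error message are invented for illustration.

```python
import os

class MNISTLike:
    # Illustrative subset of the raw MNIST file names expected under `root`.
    raw_files = ['train-images-idx3-ubyte', 'train-labels-idx1-ubyte']

    def __init__(self, root, download=False, from_local=False):
        self.root = root
        if download:
            self.download()
        elif from_local:
            # With from_local=True, skip the network entirely and just
            # verify the files the user placed under `root` are present.
            missing = [f for f in self.raw_files
                       if not os.path.exists(os.path.join(root, f))]
            if missing:
                raise RuntimeError('Missing local files: %s' % missing)

    def download(self):
        raise NotImplementedError  # network fetch elided in this sketch
```

A user who downloaded the files by hand (e.g. in a browser) would then construct the dataset with `from_local=True` and `root` pointing at the directory holding the raw files.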
Yesterday, on my Linux machine, I found that the example could not download the MNIST data, even though my network connection was fine. So I downloaded the MNIST data with Chrome and then modified mnist.py to train the example, which cost me some time.
So I want to provide a way to read the MNIST data from the path given by the root parameter, to help anyone who runs into the same problem.
By the way, this is my first pull request on GitHub; I hope there are no problems. I love using PyTorch, and I hope it keeps getting better and better.
God bless us!