
Adding GPU functionality, enabling data loaders/tuples as input #38

Merged: 38 commits into master, Oct 12, 2018

Conversation

jdunnmon (Contributor) commented Sep 2, 2018

  • Adds cuda functionality in Classifier and EndModel

  • Modifies Classifier and EndModel to give the user the option to use either a DataLoader or a tuple as input

  • Refactors to support the above, plus small bugfixes (import typos in contrib, etc.)

  • Adds an option for a training progress bar in EndModel

  • Updates the multitask class structure to leverage the above

  • Updates tests

jdunnmon changed the base branch from master to dev September 2, 2018 10:06
jdunnmon (Contributor Author) commented Sep 2, 2018

@ajratner @bhancock8 here's a first working cut at adding cuda/dev loader support (I've tested it preliminarily on one of my applications). Comments welcome; I tried to change the interface minimally (in particular, not to change the interface of predict and predict_proba). Chris suggested intelligently linking to embeddings, which I can either add here or do in the future. Re: tests, I'll see what breaks and update them for dev; otherwise let me know if there are any specific ones you'd want to add.

ajratner (Contributor) commented Sep 6, 2018

@jdunnmon Can we merge this into master instead? I think we should use dev for major changes to the system, whereas I think we should get smaller functionality additions like this one (even if they change the interface a bit) into master as quickly as possible so people can start using them!

@bhancock8 thoughts on this?

@jdunnmon once you've done this & tests are passing, tag me here so I can review then? Thanks!

jdunnmon (Contributor Author) commented Sep 6, 2018

Yeah, I can prep the PR for master. I'll need to change the tests, etc., but that's on the to-do list anyway; it's just a bit more important to make sure everything's locked down before moving it into master. I'll get that ready.

ajratner (Contributor) commented Sep 6, 2018

Cool! I also suggest that because it looks like this PR is already pulling changes from master into dev, so merging into master will probably be cleaner anyway!

bhancock8 (Contributor) left a comment


Are we going to make these same changes for the MultitaskClassifier? Any reason why we wouldn't?

I like the cuda changes. Thanks for adding those.

So the philosophy now is that they pass in either a tuple or a DataLoader; if it's a tuple, we make a DataLoader, so internally our train method always gets DataLoaders, right?

My main concern with this PR is in the comment on the evaluate method; not sure we need a separate method to essentially just do some pre-formatting and then predict. (Though either way, it makes sense to have a batch[method name] version, of course).


    else:
        raise ValueError(
            "Unrecognized input data structure, use tuple or DataLoader!"
        )
bhancock8 (Contributor):

Not sure we need an exclamation point. :)

jdunnmon (Contributor Author):

Lol, sounds good. Will update

jdunnmon (Contributor Author):

@bhancock8 a couple of thoughts:

(a) Can go back and add this in for MultitaskClassifier as well.
(b) Yes, they pass either a DataLoader or a tuple -- if a tuple, we create a DataLoader under the hood (see the sketch below).
(c) The evaluate method was created to maintain the API for predict while allowing us to handle batch evaluation, tiebreaking, etc. It seemed messy to have all of that structure for batch vs. non-batch evaluation within score as well as within the cross-validation portion of the training loop. This was also partially done to play nicely with higher-level classes, so that Classifier doesn't change very much.

@ajratner thoughts on (c)?
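
For illustration, here is a minimal sketch of the conversion described in (b), assuming PyTorch; the name _create_data_loader and its signature are placeholders, not the actual code in this PR:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

def _create_data_loader(data, **loader_kwargs):
    """Accept an (X, Y) tuple or an existing DataLoader; always return a DataLoader."""
    if isinstance(data, DataLoader):
        return data  # user-supplied loaders are used as-is
    elif isinstance(data, tuple):
        X, Y = data
        dataset = TensorDataset(torch.as_tensor(X), torch.as_tensor(Y))
        return DataLoader(dataset, **loader_kwargs)
    else:
        raise ValueError("Unrecognized input data structure, use tuple or DataLoader")
```

With this shape, train/score/predict can assume a DataLoader internally regardless of what the caller passed in.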

bhancock8 (Contributor):

The more I look at this, the more convinced I am that we don't need a new pair of methods here which duplicate a lot of the functionality of existing methods. I think we just need to add a batch_size kwarg to Classifier.score(), which it will pass down to predict(). And then inside predict() we can either call predict_proba() once or in batches. So any models overwriting predict_proba() don't need to worry about batches, and score just passes it down to predict. What do you think about that?
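
Roughly, that proposal would look like the following (a sketch only; _compute_metrics and _break_ties stand in for whatever the repo uses for metric computation and tie-breaking, and the methods are shown unattached for brevity):

```python
import numpy as np

def score(self, X, Y, batch_size=None, **kwargs):
    # score stays batch-agnostic: it just forwards batch_size to predict
    Y_p = self.predict(X, batch_size=batch_size)
    return self._compute_metrics(Y, Y_p, **kwargs)

def predict(self, X, batch_size=None):
    if batch_size is None:
        Y_ps = self.predict_proba(X)  # one call over the full input
    else:
        # batch here, so models overriding predict_proba never see batching
        Y_ps = np.vstack([
            self.predict_proba(X[i : i + batch_size])
            for i in range(0, len(X), batch_size)
        ])
    return self._break_ties(Y_ps)
```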

bhancock8 (Contributor) commented Sep 15, 2018

As I've looked at it a little more, I see the issue: if they give us a dataloader with X and Y in score(), then we need to somehow split it before passing to predict(), since it expects only Xs. Here's what I suggest then:

--score() accepts either (X, Y) or a DataLoader. If it's (X, Y), then first thing we make it into a DataLoader. For each batch, it calls predict, stores the results, and then once we have all of Y_p and Y, we calculate metrics once.
--predict() accepts either X or a DataLoader. And again, first thing we convert into a DataLoader, then for each batch, call predict_proba() and do tie-breaking.
--So internally, we're always using DataLoaders (consistent with the train method); we just offer the tuple signature for convenience. And people can do various things with their DataLoaders if they so choose--shuffling, parallel loading, batching, etc.--and we don't care. No need for a batch_size kwarg; if they need batching, they'll submit a DataLoader. Let's have more handled by other people's classes rather than our own. :)
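
A minimal sketch of that flow (assuming PyTorch; predict_proba, _break_ties, and _compute_metrics are assumed to exist on the class, and none of this is the merged implementation verbatim):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

class Classifier:
    def _to_loader(self, data, **kwargs):
        # (X, Y) tuples become DataLoaders; DataLoaders pass through untouched
        if isinstance(data, DataLoader):
            return data
        X, Y = data
        return DataLoader(
            TensorDataset(torch.as_tensor(X), torch.as_tensor(Y)), **kwargs
        )

    def score(self, data, **metric_kwargs):
        loader = self._to_loader(data)
        Y, Y_p = [], []
        for X_batch, Y_batch in loader:
            Y.append(Y_batch)
            Y_p.append(self.predict(X_batch))  # predict batch by batch...
        # ...then calculate metrics once over all of Y and Y_p
        return self._compute_metrics(torch.cat(Y), torch.cat(Y_p), **metric_kwargs)

    def predict(self, data):
        # accept a bare X (treated as a single batch) or a DataLoader of X batches
        if isinstance(data, DataLoader):
            batches = (b[0] if isinstance(b, (list, tuple)) else b for b in data)
        else:
            batches = [data]
        # predict_proba per batch, then tie-breaking on the probabilities
        return torch.cat([self._break_ties(self.predict_proba(b)) for b in batches])
```

Batching, shuffling, and parallel loading all stay in the user's DataLoader, exactly as described above.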

ajratner (Contributor):

@bhancock8 agreed in full!! @jdunnmon ping me when the adjustments are made and I'll do a more thorough look then!

jdunnmon (Contributor Author):

@bhancock8 @ajratner thanks for the thoughts! I think that will work; I'll put in these changes (and update MultitaskClassifier) as soon as I can.

bhancock8 (Contributor) commented Sep 17, 2018 via email

jdunnmon changed the base branch from dev to master October 10, 2018 19:50
ajratner (Contributor) commented Oct 12, 2018

@jdunnmon @bhancock8 Done with my pass here! Take a look when you get a chance?

Only remaining things that would be nice to have:

  • A test subdirectory for simple GPU functionality testing (not to be run on Travis, but so that a user can check themselves)
  • Nicer way to handle putting our loss function on CUDA?
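
On the second point, one simple option (a sketch assuming a standard PyTorch criterion, not necessarily how this PR handles it): loss functions are nn.Modules, so they move like any other module, and a weighted loss is the case where this actually matters:

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# A weighted loss holds a tensor, so it genuinely needs to live on the device;
# .to(device) moves a loss module the same way it moves a model.
criterion = nn.CrossEntropyLoss(weight=torch.tensor([1.0, 2.0])).to(device)
```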

ajratner (Contributor):

Oh also, my tests keep failing due to make check, even though this passes locally (same as JD was seeing). @bhancock8 any ideas here?

ajratner (Contributor):

@bhancock8 @jdunnmon NVM, I'm pretty sure the formatting errors are coming from master, so this should be fine to merge. @bhancock8 when you're up in a few hours, let us know if you can quickly review and approve the changes?

jdunnmon (Contributor Author) commented Oct 12, 2018

@bhancock8, ready for your eyes. @ajratner and I went through today, made a bunch of changes to address the above, and added a test for successful GPU usage. Alex's _get_predictions function more elegantly reuses the code I was trying to make reusable in evaluate. Go ahead and give it a look -- I want to make any fixes tomorrow morning and get this in!

@ajratner I confirmed that the GPU test runs on my GPU machine and fails on my local Mac because no CUDA is available, so it seems like it's doing what it should. Give it a spin tomorrow. If I've used this decorator right, Travis should ignore this test...we'll know soon.
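
For reference, the usual shape of such a test (a sketch assuming unittest; not necessarily the exact contents of tests/gpu/test_gpu.py):

```python
import unittest
import torch

@unittest.skipIf(not torch.cuda.is_available(), "No CUDA device available")
class GPUTest(unittest.TestCase):
    def test_tensor_on_gpu(self):
        # trivial smoke test: allocate on the GPU and confirm placement
        x = torch.randn(4, 3).cuda()
        self.assertTrue(x.is_cuda)

if __name__ == "__main__":
    unittest.main()
```

On a CUDA-less machine (such as a Travis worker), skipIf marks the test as skipped rather than failed.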

bhancock8 (Contributor) left a comment


Ok, made a few more comments. I'm going to mark 'approve' so you're not stuck waiting for me on the merge, but do take a look at and address the few suggestions. Nice work!

When you do the merge, move the marker up to v0.2.0, baby!

metal/end_model/loss.py (outdated; resolved)
metal/multitask/mt_classifier.py (outdated; resolved)
metal/utils.py (outdated)
@@ -365,3 +365,13 @@ def slice_data(data, indices):
        return outputs[0]
    else:
        return outputs


def mt_to_cuda(data):
bhancock8 (Contributor):

I think we need a better name here. I didn't know what "mt" meant. I mean, I assumed multitask, but what does it mean to convert a 'multitask' to cuda?

Can we also add a docstring here saying what the expected type of "data" is?

bhancock8 (Contributor):

Also, this current function won't cover X that comes in as a list, right? Will clean up now
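
For illustration, one way the rename plus docstring could look (place_on_gpu is a placeholder name, and the recursion is one way to cover the list case raised above):

```python
import torch

def place_on_gpu(data):
    """Recursively move data onto the GPU.

    Args:
        data: a torch.Tensor, or an arbitrarily nested list/tuple of tensors
            (e.g., the per-task [X, Y] structures used in the multitask setting).
    Returns:
        The same structure with every tensor moved via .cuda().
    """
    if isinstance(data, (list, tuple)):
        return type(data)(place_on_gpu(d) for d in data)
    elif isinstance(data, torch.Tensor):
        return data.cuda()
    else:
        raise ValueError(f"Unrecognized data structure: {type(data)}")
```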

tests/gpu/test_gpu.py (resolved)
metal/classifier.py (outdated; resolved)
bhancock8 (Contributor):

One more thought, @ajratner: how can the formatting issues be the fault of master when it's passing the tests just fine? I think it's something in our additions. Is there a more verbose setting you can use to have it print out exactly what it's upset about or would be changing?
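
(If the check wraps black, which is an assumption since the Makefile isn't shown in this thread, running `black --check --diff .` prints exactly the reformatting it would apply.)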

ajratner (Contributor):

@bhancock8 I think it's because formatting is broken on master

ajratner (Contributor):

Passes GPU tests as well!

ajratner merged commit a2bb278 into master Oct 12, 2018
ajratner deleted the dev_cuda_loader branch October 12, 2018 19:47