
[Bug] HAP transform crashes when using a GPU #1047

Open
2 tasks done
burn2l opened this issue Feb 13, 2025 · 8 comments
Labels
bug Something isn't working

Comments

burn2l commented Feb 13, 2025

Search before asking

  • I searched the issues and found no similar issues.

Component

Transforms/Other

What happened + What you expected to happen

When the HAP transform is run on a machine with a GPU, it crashes with:

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument index in method wrapper_CUDA__index_select)
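
This message is PyTorch's generic complaint when a single op receives tensors from two devices. A minimal sketch, independent of the HAP code, that raises the same class of error (the embedding lookup below goes through the index_select named in the traceback; exact wording may differ by PyTorch version):

import torch

emb = torch.nn.Embedding(10, 4)            # weights live on the CPU by default
idx = torch.tensor([1, 2], device="cuda")  # indices live on the GPU
emb(idx)  # RuntimeError: Expected all tensors to be on the same device ...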

Reproduction script

Run pytest test/test_hap.py on a machine with a GPU

Anything else

Can be fixed by changing line 41 in dpk_hap/transform.py to:

self.model = AutoModelForSequenceClassification.from_pretrained(self.model_name_or_path).to(device)
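
For context, a self-contained sketch of the fixed pattern, assuming a public sequence-classification checkpoint as a stand-in for the HAP model (the model name and input text are illustrative; the surrounding class in dpk_hap/transform.py is omitted):

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name_or_path = "distilbert-base-uncased-finetuned-sst-2-english"  # stand-in checkpoint
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
# Without the trailing .to(device), the weights stay on the CPU while the
# tokenized inputs below move to the GPU, reproducing the crash above.
model = AutoModelForSequenceClassification.from_pretrained(model_name_or_path).to(device)

inputs = tokenizer("some text to score", return_tensors="pt").to(device)
with torch.no_grad():
    logits = model(**inputs).logits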

OS

Red Hat Enterprise Linux (RHEL)

Python

3.11.x

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!
burn2l added the bug label on Feb 13, 2025
shahrokhDaijavad (Member) commented

cc: @klwuibm

klwuibm commented Feb 14, 2025

I think we can add the following block at line 42:


if torch.cuda.is_available():
    device = torch.device("cuda")
    self.model.to(device)

cc: @shahrokhDaijavad

shahrokhDaijavad (Member) commented

Thank you, @klwuibm.

@ian-cho Can you please try what @klwuibm suggests? There is also issue #1048 and a suggested fix by @burn2l on that one, so if you feel comfortable, please submit a PR with these 2 changes. Thanks.
cc: @agoyal26 @touma-I

burn2l (Author) commented Feb 14, 2025

I tested the fix of always executing .to(device), and it worked for both CPU and GPU, matching what is done for inputs on line 54.

klwuibm commented Feb 17, 2025

@burn2l Thanks very much for raising this issue. I believe that one can always execute model.to(device), if device is defined properly beforehand. Namely,

if torch.cuda.is_available():
    device = torch.device("cuda")
else:
    device = torch.device("cpu")
self.model.to(device)

cc: @shahrokhDaijavad @touma-I
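
A general PyTorch point behind "one can always execute model.to(device)" (framework semantics, not specific to this repo): nn.Module.to() moves the module's parameters in place and returns the module itself, so self.model.to(device) needs no reassignment, whereas Tensor.to() returns a new tensor that must be reassigned:

import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

model = torch.nn.Linear(4, 2)
model.to(device)   # in place: the parameters move; reassignment is optional

x = torch.randn(1, 4)
x.to(device)       # no effect on x: the moved copy is discarded
x = x.to(device)   # correct for tensors: reassign the result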

burn2l (Author) commented Feb 17, 2025

The code already sets 'device' on line 24 and uses it on 'inputs' with .to(device) at line 54, so for consistency .to(device) should be added to self.model. Having an if statement for just the model would be slightly confusing, since it would suggest that one style is more appropriate than the other. A minor point, but I think it helps readability.

klwuibm commented Feb 17, 2025

Yes, @burn2l, you are absolutely right. The device is already set on line 24, so we can safely add model.to(device).

ian-cho (Collaborator) commented Feb 18, 2025

> @ian-cho Can you please try what @klwuibm suggests? There is also issue #1048 and a suggested fix by @burn2l on that one, so if you feel comfortable, please submit a PR with these 2 changes. Thanks.

Thanks. I submitted a PR that added .to(device) to the model.
