
scatter_logsumexp: NaNs on untouched indices #368

Closed
aabbas90 opened this issue Apr 13, 2023 · 5 comments · Fixed by #369

Comments


aabbas90 commented Apr 13, 2023

Hi,

I am trying to perform scatter_logsumexp on a strict subset of the indices of the out tensor, but I am getting NaNs at the indices where out should be left untouched. Example:

import torch
import torch_scatter
from torch_scatter import scatter_logsumexp

src = torch.Tensor([0.0, 1.0, 4.0])
index = torch.tensor([1, 1, 4])
out = torch.zeros((6, ), dtype=torch.float32)

scatter_logsumexp(src, index, out=out)
print(out)
# tensor([   nan, 1.5514,    nan,    nan, 4.0181,    nan])  <- only indices 1 and 4 should be changed
print(torch_scatter.__version__)
# '2.1.1+pt20cu118'
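For reference, the two finite values suggest that the initial zeros of out participate in the reduction when out is passed in:

import torch
# index 1 collects src[0]=0.0 and src[1]=1.0 plus the initial out[1]=0.0
print(torch.tensor([0.0, 1.0, 0.0]).logsumexp(dim=0))  # tensor(1.5514)
# index 4 collects src[2]=4.0 plus the initial out[4]=0.0
print(torch.tensor([4.0, 0.0]).logsumexp(dim=0))       # tensor(4.0181)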

A separate issue, even once the NaN problem is resolved, is efficiency: ideally we would only operate on the locations of out that are referenced in index. Otherwise, for a very large out, we do redundant computation.
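For concreteness, here is a sketch of the behavior I have in mind: reduce into a compact buffer and write only the touched slots back. This is not existing torch_scatter API, and the logaddexp merge assumes the passed-in out values should keep participating in the reduction as above:

import torch
from torch_scatter import scatter_logsumexp

src = torch.tensor([0.0, 1.0, 4.0])
index = torch.tensor([1, 1, 4])
out = torch.zeros(6)

# Reduce only over the slots that actually occur in `index`.
uniq, compact = torch.unique(index, return_inverse=True)        # uniq = tensor([1, 4])
compact_res = scatter_logsumexp(src, compact, dim_size=uniq.numel())

# Fold the previous values of `out` back in and write only the touched slots.
out[uniq] = torch.logaddexp(out[uniq], compact_res)
print(out)  # tensor([0.0000, 1.5514, 0.0000, 0.0000, 4.0181, 0.0000])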

Thanks,
Ahmed


rusty1s commented Apr 14, 2023

Thanks for reporting. I fixed this in #369.

rusty1s linked a pull request Apr 14, 2023 that will close this issue
aabbas90 commented:

Thanks for the quick fix. But there is an issue with backpropagation now:

import torch
from torch_scatter import scatter_logsumexp

src = torch.Tensor([0.0, 1.0, 4.0])
src.requires_grad = True
index = torch.tensor([1, 1, 4])
out = torch.zeros((6, ), dtype=torch.float32)

scatter_logsumexp(src, index, out=out)

loss = torch.square(out - torch.ones_like(out)).sum()
loss.backward()
# RuntimeError: one of the variables needed for gradient computation has been
# modified by an inplace operation: [torch.FloatTensor [6]], which is output 0 of
# ExpBackward0, is at version 7; expected version 2 instead. Hint: enable anomaly
# detection to find the operation that failed to compute its gradient, with
# torch.autograd.set_detect_anomaly(True).
print(src.grad)  # never reached, backward() raises above


rusty1s commented Apr 14, 2023

Yes, that's because we need to write in-place to out.

out = scatter_logsumexp(src, index)

should fix this.
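A minimal sketch of that workaround with gradients; restricting the loss to the touched indices is an addition here, so that whatever fill value the untouched entries receive never enters the graph:

import torch
from torch_scatter import scatter_logsumexp

src = torch.tensor([0.0, 1.0, 4.0], requires_grad=True)
index = torch.tensor([1, 1, 4])

res = scatter_logsumexp(src, index, dim_size=6)   # no out= argument, no in-place write
touched = torch.unique(index)                     # tensor([1, 4])
loss = torch.square(res[touched] - 1.0).sum()
loss.backward()
print(src.grad)                                   # finite gradients, no autograd error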


yzhangcs commented Jun 1, 2023

@rusty1s Hi, is there any progress on fixing the backpropagation problem?

I also wonder whether there are any plans to optimize scatter_logsumexp with a dedicated CUDA kernel? Computing exp, then sum, then log directly can cause many overflow/underflow issues.


rusty1s commented Jun 2, 2023

I think this issue is only present if you pass in out, and there is not much I can do about that. It should work without specifying out. AFAIK, there are no numerical issues in our implementation, since we compute exp after first subtracting the maximum element.
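For reference, the max-subtraction trick in isolation (plain torch, not the torch_scatter internals):

import torch

x = torch.tensor([1000.0, 1001.0, 1002.0])

naive = torch.log(torch.exp(x).sum())             # inf: exp(1000.) overflows in float32
m = x.max()
stable = m + torch.log(torch.exp(x - m).sum())    # tensor(1002.4076)

print(naive, stable, torch.logsumexp(x, dim=0))   # stable matches torch.logsumexp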
