RuntimeError: Trying to create tensor with negative dimension #225

Open
gaussiangit opened this issue Apr 28, 2020 · 6 comments

Comments

gaussiangit commented Apr 28, 2020

I tried torchvision 0.5 and 0.6, with torch 1.4 and 1.5.
Can you tell me what the problem is? It occurs in the COCO eval phase.
It also only happens with D7.

Owner

zylo117 commented Apr 28, 2020

Please provide more info.

Author

gaussiangit commented Apr 29, 2020

Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/home/miniconda3/envs/pytorch-nets/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
    output = module(*input, **kwargs)
  File "/home/miniconda3/envs/pytorch-nets/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "train.py", line 79, in forward
    imgs=imgs, obj_list=obj_list)
  File "/home/miniconda3/envs/pytorch-nets/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/Yet-Another-EfficientDet-Pytorch/efficientdet/loss.py", line 153, in forward
    0.5, 0.3)
  File "/home/Yet-Another-EfficientDet-Pytorch/utils/utils.py", line 107, in postprocess
    anchors_nms_idx = nms(transformed_anchors_per, scores_per[:, 0], iou_threshold=iou_threshold)
  File "/home/miniconda3/envs/pytorch-nets/lib/python3.6/site-packages/torchvision/ops/boxes.py", line 33, in nms
    return _C.nms(boxes, scores, iou_threshold)
RuntimeError: Trying to create tensor with negative dimension -1242957280: [-1242957280] (check_size_nonnegative at /opt/conda/conda-bld/pytorch_1565272279342/work/aten/src/ATen/native/TensorFactories.h:64)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x47 (0x7f865b0a8e37 in /home/miniconda3/envs/pytorch-nets/lib/python3.6/site-packages/torch/lib/libc10.so)

@zylo117 This happens when I set debug to True while training. Otherwise it trains normally.
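
The negative size in the traceback can be reproduced with plain arithmetic, which suggests a 32-bit integer overflow rather than corrupted data. The sketch below rests on assumptions that are not stated in this thread: that D7 runs at a 1536-pixel input with five feature levels at strides 8 to 128 and 9 anchors per cell, that every anchor reaches NMS, and that the CUDA NMS kernel allocates a mask of N * ceil(N / 64) elements using 32-bit arithmetic. The helper names are hypothetical. Under those assumptions the wrapped value matches the reported number.

```python
import math


def anchor_count(input_size, strides=(8, 16, 32, 64, 128), anchors_per_cell=9):
    """Total anchors for a square input (assumed layout, not taken from the repo)."""
    return anchors_per_cell * sum(math.ceil(input_size / s) ** 2 for s in strides)


def nms_mask_elements(num_boxes, bits_per_word=64):
    """Assumed size of the suppression bitmask: one 64-bit word per 64 boxes, per box."""
    return num_boxes * math.ceil(num_boxes / bits_per_word)


def wrap_int32(value):
    """Interpret an integer as a signed 32-bit value (two's complement wraparound)."""
    return (value + 2**31) % 2**32 - 2**31


n = anchor_count(1536)            # assumed D7 input size
elements = nms_mask_elements(n)   # exceeds the int32 range
print(n)                          # 441936
print(elements)                   # 3052010016
print(wrap_int32(elements))       # -1242957280
```

If this reading is right, anything that lowers the number of boxes passed to NMS below roughly 370,000 would avoid the overflow, although the mask can still be very large well before that point.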

Owner

zylo117 commented Apr 30, 2020

Yet-Another-EfficientDet-Pytorch/utils/utils.py, line 107

Can you debug this line?

Author

gaussiangit commented May 4, 2020

@zylo117 It happens only when debug is True on larger models such as D5, D6, and D7. D0 works fine with debug. The issue is also mentioned in pytorch/vision#1705.

I am using a conda env with torch 1.4 and torchvision 0.5.
Now I get an out-of-memory error on the same line. I tried the minimum batch size, and I am training on 4 GPUs.
Any recommendations?

Owner

zylo117 commented May 5, 2020

Maybe there is a bug in the nms function.
Try:

  1. Implement NMS yourself in PyTorch or NumPy by operating directly on the tensors/arrays.
  2. Set a higher threshold here: https://github.com/zylo117/Yet-Another-EfficientDet-Pytorch/blob/master/efficientdet/loss.py#L174
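
To illustrate why a higher threshold can help, here is a minimal sketch of filtering candidates before they reach NMS. It is not code from the repository: `prefilter_candidates` and its `max_candidates` cap are hypothetical, and it assumes boxes and scores are already NumPy arrays. Fewer candidates means a smaller suppression mask and less memory.

```python
import numpy as np


def prefilter_candidates(boxes, scores, score_threshold=0.5, max_candidates=5000):
    """Keep boxes scoring above the threshold, capped at the top max_candidates.

    Hypothetical helper: returns the filtered boxes, their scores, and the
    indices of the kept rows in the original arrays.
    """
    keep = np.flatnonzero(scores > score_threshold)
    if keep.size > max_candidates:
        # argpartition selects the highest scores without a full sort
        top = np.argpartition(scores[keep], -max_candidates)[-max_candidates:]
        keep = keep[top]
    return boxes[keep], scores[keep], keep
```

The indices returned by a later NMS step refer to the filtered arrays, so map them back through `keep` if positions in the original arrays are needed.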

tmyoda commented Feb 10, 2021

I added a batched_nms function to utils/utils.py, and it seems to work.
Also delete the existing import of batched_nms so that this version is used.

import numpy as np
import torch

# https://github.com/ponta256/fssd-resnext-voc-coco/blob/master/layers/box_utils.py#L245
def nms(boxes, scores, nms_thresh=0.5, top_k=200):
    # Greedy NMS on the CPU with NumPy; returns kept indices, best score first.
    boxes = boxes.detach().cpu().numpy()
    scores = scores.detach().cpu().numpy()
    keep = []
    if len(boxes) == 0:
        return np.array(keep, dtype=np.int64)
    x1 = boxes[:, 0]
    y1 = boxes[:, 1]
    x2 = boxes[:, 2]
    y2 = boxes[:, 3]
    area = (x2 - x1) * (y2 - y1)
    idx = np.argsort(scores, axis=0)  # sort in ascending order
    idx = idx[-top_k:]  # indices of the top_k highest scores

    while len(idx) > 0:
        last = len(idx) - 1
        i = idx[last]  # index of the highest remaining score
        keep.append(i)

        # Intersection of box i with every other remaining box
        xx1 = np.maximum(x1[i], x1[idx[:last]])
        yy1 = np.maximum(y1[i], y1[idx[:last]])
        xx2 = np.minimum(x2[i], x2[idx[:last]])
        yy2 = np.minimum(y2[i], y2[idx[:last]])

        w = np.maximum(0, xx2 - xx1)
        h = np.maximum(0, yy2 - yy1)

        inter = w * h
        iou = inter / (area[idx[:last]] + area[i] - inter)
        # Drop box i and every box that overlaps it by more than nms_thresh
        idx = np.delete(idx, np.concatenate(([last], np.where(iou > nms_thresh)[0])))

    return np.array(keep, dtype=np.int64)

# https://github.com/pytorch/vision/blob/master/torchvision/ops/boxes.py#L39
def batched_nms(boxes, scores, idxs, iou_threshold):
    if boxes.numel() == 0:
        return torch.empty((0,), dtype=torch.int64, device=boxes.device)
    # Shift each class's boxes by a class-specific offset so that boxes of
    # different classes never overlap, then run a single NMS over all of them.
    max_coordinate = boxes.max()
    offsets = idxs.to(boxes) * (max_coordinate + torch.tensor(1).to(boxes))
    boxes_for_nms = boxes + offsets[:, None]
    keep = nms(boxes_for_nms, scores, nms_thresh=iou_threshold)
    # Return a tensor in both branches (the NumPy nms above returns an array)
    return torch.from_numpy(keep).to(boxes.device)
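
The offset step in batched_nms is easier to see on a toy case. The sketch below is a NumPy illustration with made-up boxes, not code from the repository, and the `iou` helper is hypothetical. Two identical boxes with different class ids overlap completely, but after each box is shifted by its class id times (max coordinate + 1) they no longer intersect, so a single NMS pass cannot suppress one class's box with another's.

```python
import numpy as np


def iou(a, b):
    """Intersection over union of two boxes given as [x1, y1, x2, y2]."""
    w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = w * h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


# Two identical boxes that belong to different classes (illustrative values)
boxes = np.array([[10.0, 10.0, 50.0, 50.0],
                  [10.0, 10.0, 50.0, 50.0]])
class_ids = np.array([0, 1])

# Same idea as in batched_nms: offset = class id * (largest coordinate + 1)
offsets = class_ids * (boxes.max() + 1.0)
shifted = boxes + offsets[:, None]

print(iou(boxes[0], boxes[1]))      # 1.0
print(iou(shifted[0], shifted[1]))  # 0.0
```

Note that the top_k=200 default in the nms function above applies to all classes together in this scheme, not per class.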
