Illegal instruction (core dumped) with some pretrained models (but not all) #1782
Comments
Hi, can you run the code under gdb and paste the output?
# tst.py
import torchvision
torchvision.models.squeezenet1_0(pretrained=True)
Then start Python under gdb from the command line. Once gdb is up, run the script, and when it crashes, print the backtrace and paste the result here. It will help us identify the problem. |
Hi, I had the same problem, and I solved it by compiling torch and torchvision myself. |
Hello, thank you for your response. I compiled torch and torchvision with SSE4 support disabled (-DENABLE_SSE4=0), but the problem remains. The backtrace is below:
Thread 1 "python" received signal SIGILL, Illegal instruction.
[...] at ../csu/libc-start.c:310
#46 0x0000555555733b50 in _start () at ../sysdeps/x86_64/elf/start.S:103
(gdb) |
Hi @emericit, thanks for the backtrace. This seems to be a problem with PyTorch itself, not with torchvision. Indeed, if you run something like
import torch
torch.randn(10)
you might get the same crash. This is very close to being a duplicate of pytorch/pytorch#22338, so let's redirect the discussion there. |
Hello @fmassa ! |
The randn codepath for AVX might only be activated for large enough inputs, see https://github.com/pytorch/pytorch/blob/c2c835dd95f192d1397877b94e615d13258126d9/aten/src/TH/vector/AVX2.cpp#L79 and https://github.com/pytorch/pytorch/blob/64de93d8e7c6ba085997b18bcf85681b330d9afb/aten/src/TH/generic/THTensorRandom.cpp#L88-L90, so maybe try again with a much larger tensor.
I think this should crash this time. |
As for compiling without AVX2, I'm not sure. It might be best to ask in the PyTorch issue I linked. |
You're right, it crashes with size 1024. |
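For anyone trying to narrow this down on their own machine, here is a minimal sketch of the size experiment discussed above. It runs each attempt in a child interpreter so that an illegal instruction kills only the child, not the shell session. The helper names (`run_snippet`, `died_from_sigill`, `randn_snippet`, `first_crashing_size`) are hypothetical, the intermediate sizes 64 and 256 are arbitrary picks between the 10 and 1024 mentioned in this thread, and the signal check assumes a POSIX system.

```python
import signal
import subprocess
import sys


def run_snippet(code: str) -> int:
    """Run `code` in a fresh interpreter and return its exit status.

    On POSIX, a negative value -N means the child was killed by signal N.
    """
    result = subprocess.run([sys.executable, "-c", code])
    return result.returncode


def died_from_sigill(returncode: int) -> bool:
    """True if the exit status corresponds to SIGILL (illegal instruction)."""
    return returncode == -signal.SIGILL


def randn_snippet(size: int) -> str:
    # Assumption: torch is installed in the interpreter being probed.
    return f"import torch; torch.randn({size})"


def first_crashing_size(sizes=(10, 64, 256, 1024)):
    """Return the first size whose child process dies from SIGILL, else None."""
    for size in sizes:
        if died_from_sigill(run_snippet(randn_snippet(size))):
            return size
    return None
```

Calling `first_crashing_size()` on the affected machine should report the smallest tested size that triggers the crash, or `None` if none of them does.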
Hello,
I have a strange issue when loading some pretrained models. Some models abort on loading with an "Illegal instruction (core dumped)" message, while others load without any problem.
Running Python 3.7.5 on a CPU-only Ubuntu machine with the following versions:
PyTorch Version: 1.4.0+cpu
Torchvision Version: 0.5.0+cpu
The CPUs have the following properties:
I could not find help anywhere with this issue...
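Since the discussion in this thread revolves around SSE4 and AVX/AVX2 support, a quick way to see which of those the CPU advertises is to read the `flags` line of `/proc/cpuinfo`. The sketch below assumes a Linux x86 machine. The function names are hypothetical, and the list of flags to look for is my guess at what is relevant here, not something the PyTorch binaries document.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set:
    """Collect the feature flags listed in /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            flags.update(value.split())
    return flags


def missing_features(flags, wanted=("sse4_1", "sse4_2", "avx", "avx2")):
    """Return the wanted feature names that the CPU does not advertise."""
    return [name for name in wanted if name not in flags]


def report(path="/proc/cpuinfo"):
    """Read the given cpuinfo file and list the missing features."""
    return missing_features(cpu_flags(Path(path).read_text()))
```

If `report()` returns a non-empty list, a prebuilt binary that uses one of those instruction sets without a runtime check would be a plausible cause of the SIGILL.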