MXNet USE_SSE=1 build uses AVX instruction set #14664
Comments
I also tried upgrading to MXNet 1.4.0, but got the same "Illegal instruction" problem. |
Why did you install the GPU version on a machine without a GPU? It seems that the program crashes because the CUDA library files are not found. If you want to use the GPU version, could you please provide the stack trace? gdb python |
The GPU libraries are available and in my path. If I unload them, I get a different message from the linker. The reason is that I am submitting jobs to a cluster. I have one job that uses MXNet but just for data preparation so it doesn't need a GPU. It's a pain to have to maintain separate environments. As long as the libraries are found, it seems MXNet should be able to work smoothly with a CPU device, instead of a GPU one? Plus, this used to work fine. |
I will test it. I have the environment with CUDA 10 and without GPU. |
Thank you! |
Sorry, the CUDA version on my machine is 10.1.105. It seems that there isn't any related pre-build |
@mjpost just as a quick sanity check, does it work if you install from nightly build? (i.e. |
I get the same error with the nightly build:
$ module list
Currently Loaded Modules:
1) shared 2) StdEnv 3) dot 4) uge/8.6.4 5) default-environment 6) cuda10.0/toolkit/10.0.130 7) gcc/5.4.0 8) cudnn/7.5.0_cuda10.0
$ pip install -U --pre mxnet-cu100mkl
Collecting mxnet-cu100mkl
Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='files.pythonhosted.org', port=443): Read timed out. (read timeout=15)",)': /packages/c7/c2/06986f51da6052f4fa673ed1c5f55842c3f216b5e1b8d25bb59a588e8161/mxnet_cu100mkl-1.5.0b20190412-py2.py3-none-manylinux1_x86_64.whl
Downloading https://files.pythonhosted.org/packages/c7/c2/06986f51da6052f4fa673ed1c5f55842c3f216b5e1b8d25bb59a588e8161/mxnet_cu100mkl-1.5.0b20190412-py2.py3-none-manylinux1_x86_64.whl (554.8MB)
100% |████████████████████████████████| 554.8MB 17kB/s
Requirement already satisfied, skipping upgrade: graphviz<0.9.0,>=0.8.1 in ./.conda/envs/cu100/lib/python3.6/site-packages (from mxnet-cu100mkl) (0.8.4)
Requirement already satisfied, skipping upgrade: numpy<1.15.0,>=1.8.2 in ./.conda/envs/cu100/lib/python3.6/site-packages (from mxnet-cu100mkl) (1.14.6)
Requirement already satisfied, skipping upgrade: requests>=2.20.0 in ./.conda/envs/cu100/lib/python3.6/site-packages (from mxnet-cu100mkl) (2.21.0)
Requirement already satisfied, skipping upgrade: urllib3<1.25,>=1.21.1 in ./.conda/envs/cu100/lib/python3.6/site-packages (from requests>=2.20.0->mxnet-cu100mkl) (1.24.1)
Requirement already satisfied, skipping upgrade: certifi>=2017.4.17 in ./.conda/envs/cu100/lib/python3.6/site-packages (from requests>=2.20.0->mxnet-cu100mkl) (2019.3.9)
Requirement already satisfied, skipping upgrade: chardet<3.1.0,>=3.0.2 in ./.conda/envs/cu100/lib/python3.6/site-packages (from requests>=2.20.0->mxnet-cu100mkl) (3.0.4)
Requirement already satisfied, skipping upgrade: idna<2.9,>=2.5 in ./.conda/envs/cu100/lib/python3.6/site-packages (from requests>=2.20.0->mxnet-cu100mkl) (2.8)
Installing collected packages: mxnet-cu100mkl
Found existing installation: mxnet-cu100mkl 1.4.0.post0
Uninstalling mxnet-cu100mkl-1.4.0.post0:
Successfully uninstalled mxnet-cu100mkl-1.4.0.post0
Successfully installed mxnet-cu100mkl-1.5.0b20190412
$ python
Python 3.6.8 |Anaconda, Inc.| (default, Dec 30 2018, 01:22:34)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import mxnet
Illegal instruction
$ |
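Because the crash above kills the interpreter outright, it can help to run the import in a child process and inspect how that child exited. The following is a minimal sketch, not part of MXNet: `probe_import` is a hypothetical helper name, and it assumes a POSIX system, where a negative return code from `subprocess` means the child was terminated by that signal.

```python
import signal
import subprocess
import sys


def probe_import(module_name, timeout=120):
    """Import module_name in a child interpreter and report how the child exited.

    probe_import is a hypothetical helper for illustration only.
    """
    proc = subprocess.run(
        [sys.executable, "-c", "import " + module_name],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        timeout=timeout,
    )
    # On POSIX, a negative return code means the child was killed by that signal.
    # SIGILL is the signal behind the shell's "Illegal instruction" message.
    killed_by_sigill = proc.returncode == -signal.SIGILL
    return {
        "returncode": proc.returncode,
        "illegal_instruction": killed_by_sigill,
        "stderr_tail": proc.stderr.decode(errors="replace")[-500:],
    }
```

Calling `probe_import("mxnet")` from a working interpreter should then show whether the import ended with SIGILL or with an ordinary Python exception such as a missing shared library, without taking down the calling process.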
But note that I am trying to stick with MXNet 1.3.1, since I have a branch of research code using software (sockeye) that isn't upgraded to 1.4 yet. |
@mjpost the only difference between the mxnet-cu92* and mxnet-cu100* would be in the cuda/cudnn library. @DickJC123 do you know if the cuda libraries started to use any new instruction sets? |
Could it have to do with the AVX instruction set? I found a few issues where people mention this, but I don't perfectly understand them. Note that I haven't tried building from source yet. I hope to do that this weekend. |
@mjpost mxnet pre-built binaries require the AVX2 instruction set. However, this is true for all packages and not just cu100. So it must be something else. |
Same problem here. Any updates on this? |
I'm getting the same error |
@k128 please provide information about your CPU, i.e. the output of |
processor : 0 processor : 1 processor : 2 processor : 3 processor : 4 processor : 5 processor : 6 processor : 7 |
Thanks @k128. Your CPU doesn't support AVX instruction set, but the binary package you obtained via Also, please provide the output of You can find build from source instructions at https://mxnet.apache.org/get_started/ubuntu_setup |
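One way to see which SIMD extensions a CPU reports on x86 Linux is to read the `flags` line of `/proc/cpuinfo`. The sketch below is an illustration only: the function names `cpu_flags` and `report_simd_support` are hypothetical, and it assumes the x86 layout of `/proc/cpuinfo` (other architectures use different keys).

```python
def cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    # No 'flags' line found (for example on a non-x86 system).
    return set()


def report_simd_support(cpuinfo_text, wanted=("sse4_2", "avx", "avx2")):
    """Map each wanted flag name to whether the CPU advertises it."""
    flags = cpu_flags(cpuinfo_text)
    return {name: name in flags for name in wanted}


# Usage on a Linux machine:
#     with open("/proc/cpuinfo") as f:
#         print(report_simd_support(f.read()))
```

If `avx` or `avx2` comes back as `False`, a binary compiled with those instructions enabled can be expected to fail with "Illegal instruction" on that machine.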
|
Thank you. Please let me know if you face any issues with the source compiled version of mxnet (https://mxnet.apache.org/get_started/ubuntu_setup) |
I might be doing something completely wrong here, but I installed MKL via APT without error and used the code on this page in order: https://mxnet.apache.org/api/python/docs/tutorials/performance/backend/mkldnn/mkldnn_readme.html Exact code I ran:
|
@k128 please post the complete output of the |
I can reproduce this issue on AWS
Given the nature of this problem it's likely due to Further the use of |
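The issue title suggests that a build meant to target SSE still contains AVX instructions. A rough way to look for that is to disassemble the shared library (for example with `objdump -d`) and search the output for 256-bit `ymm` registers. The sketch below is a heuristic under stated assumptions, not a method used in this thread: `lines_using_ymm` is a hypothetical name, 128-bit AVX instructions that use `xmm` registers would be missed, and a match only shows that AVX code is present, not that it is executed on every CPU.

```python
import re

# AVX/AVX2 code that works on 256-bit vectors references ymm registers.
# Matches both AT&T syntax ("%ymm0") and Intel syntax ("ymm0").
YMM_PATTERN = re.compile(r"\bymm\d+\b")


def lines_using_ymm(disassembly_lines):
    """Return the disassembly lines that reference a ymm register."""
    return [line for line in disassembly_lines if YMM_PATTERN.search(line)]


# Usage, assuming the disassembly was saved to a text file first
# (the file name is a placeholder):
#     with open("libmxnet.disasm.txt") as f:
#         hits = lines_using_ymm(f)
#     print(len(hits), "lines reference ymm registers")
```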
@mjpost could you verify if |
@k128 @mjpost could you try installing https://lausen-public.s3.amazonaws.com/mxnet_cu100-1.6.0b20200127-py2.py3-none-manylinux1_x86_64.whl and report if it works? The library is built using #17448. Setting |
On my desktop (with no GPU), with CUDA 10 libraries loaded, when I attempt to import mxnet in Python, I get the following error:
But everything works fine in another conda environment with mxnet-92mkl and CUDA 9.2 libraries loaded:
Is there any advice how to fix this? I cannot find a similar issue filed.