Install failing from the master: Failed building wheel #2480

Open
jcwchen opened this issue Jul 16, 2020 · 21 comments

Comments

@jcwchen

jcwchen commented Jul 16, 2020

🐛 Bug

To Reproduce

Steps to reproduce the behavior:

1. pip install -q git+https://github.com/pytorch/vision.git

Expected behavior

Build successfully without any error message

Environment

  • PyTorch / torchvision Version (e.g., 1.0 / 0.4.0): master
  • OS (e.g., Linux): ubuntu16.04
  • How you installed PyTorch / torchvision (conda, pip, source): pip
  • Build command you used (if compiling from source): pip install -q git+https://github.com/pytorch/vision.git
  • Python version: 3.6
  • CUDA/cuDNN version:
  • GPU models and configuration:
  • Any other relevant information: clang7

Additional context

It started occurring a week ago and has kept failing since. The error message is as follows:

+ pip install -q git+https://github.com/pytorch/vision.git
  Failed building wheel for torchvision
Command "/tmp/venv/bin/python3.6 -u -c "import setuptools, tokenize;__file__='/tmp/pip-req-build-7wrzlist/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-bbtn3535/install-record.txt --single-version-externally-managed --compile --install-headers /tmp/venv/include/site/python3.6/torchvision" failed with error code 1 in /tmp/pip-req-build-7wrzlist/

I suppose it's a bug from the master branch? If so, when can it be fixed?
If not, how can I fix this?
Thanks.

@pmeier
Collaborator

pmeier commented Jul 17, 2020

Hi @jcwchen, you left out the traceback which contains the information you need:

Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-req-build-vx375of5/setup.py", line 14, in <module>
        import torch
    ModuleNotFoundError: No module named 'torch'

torchvision depends on torch, which has to be available before you run the torchvision setup. The master of torchvision requires the master of torch (see the compatibility table in the README). You can either install it from source (pip install git+https://github.com/pytorch/pytorch.git) or go with the pre-built nightly version instead.
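
The precondition described here (torch must be importable by the interpreter that runs the torchvision setup.py) can be checked up front. The following is a minimal sketch, not part of torchvision; the helper name is_importable is made up for illustration, and the script has to be run with the same interpreter or virtual environment that pip uses.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "torch" is the module that setup.py imports at the top of the file.
    if is_importable("torch"):
        print("torch found; setup.py should get past its 'import torch' line")
    else:
        print("torch is missing; install it before building torchvision")
```

If this reports that torch is missing even though it was installed, the install probably went into a different environment than the one running the build.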

@pmeier pmeier closed this as completed Jul 17, 2020
@jcwchen
Author

jcwchen commented Jul 17, 2020

Hi @pmeier, Thank you for the fast reply.

I did install torch first, but I still encountered this error. The same build commands worked a week ago.
Besides, if I pip install an older revision of torchvision (from 7/6), like:
pip install -q git+https://github.com/pytorch/vision.git@86b6c3e22e9d7d8b0fa25d08704e6a31a364973b
it works, too. That's why I suppose it might be a bug in the master branch.

BTW, where did you get that traceback information? (Knowing this could help debug this)
Thank you.

@pmeier
Collaborator

pmeier commented Jul 17, 2020

I did install torch first, but I still encountered this error.

Huh, that is weird. I did a fresh install with

pip install numpy
pip install --pre torch -f https://download.pytorch.org/whl/nightly/cu102/torch_nightly.html
pip install git+https://github.com/pytorch/vision.git

(remember to adapt the link to your CUDA version) and this is working fine. Could you try this and report your results?

Could you also run

wget https://raw.githubusercontent.com/pytorch/pytorch/master/torch/utils/collect_env.py
python collect_env.py

and post the output? From what you have provided so far, the only significant difference between our systems is that I used gcc and not clang. @fmassa Does anything come to mind that could have caused this?

BTW, where did you get that traceback information? (Knowing this could help debug this)

That is simply my output of pip install git+https://github.com/pytorch/vision.git when torch is not installed.

@jcwchen
Author

jcwchen commented Jul 17, 2020

The output from collect_env.py is here:

2020-07-17 19:16:54 (86.0 MB/s) - 'collect_env.py' saved [13234/13234]

+ python collect_env.py
Collecting environment information...
PyTorch version: 1.7.0a0+7eb71b4
Is debug build: No
CUDA used to build PyTorch: None

OS: Ubuntu 16.04.6 LTS
GCC version: Could not collect
CMake version: version 3.5.1

Python version: 3.6
Is CUDA available: No
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA

Versions of relevant libraries:
[pip3] numpy==1.19.0
[pip3] torch==1.7.0a0+7eb71b4
[conda] Could not collect

Originally, I installed PyTorch with pip from its master branch.
I tried your build commands, but I still encountered the same error.

I don't have CUDA on my machine, and I saw that some recent pytorch/vision PRs made CUDA-related changes. Could that be the issue?

@pmeier
Collaborator

pmeier commented Jul 17, 2020

Just to recap:

  • You've installed torch from source and it is working as expected
  • With this setup you can install 86b6c3e22e9d7d8b0fa25d08704e6a31a364973b without errors
  • Installing from the current master fails

Did I miss anything?


#2388 might be the offender (Cc @andfoy). It added the following to the README:

libpng and libjpeg must be available at compilation time in order to be available. Make sure that it is available on the standard library locations, otherwise, add the include and library paths in the environment variables TORCHVISION_INCLUDE and TORCHVISION_LIBRARY, respectively.
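
As an illustration of the README passage, the two environment variables could be prepared like this before launching the build. This is only a sketch: the /opt/libpng16 prefix is a hypothetical location, build_env is a made-up helper, and the exact way torchvision consumes the variables is not shown in this thread.

```python
import os


def build_env(include_dir: str, library_dir: str) -> dict:
    """Copy the current environment and add the two variables named in the README."""
    env = dict(os.environ)
    env["TORCHVISION_INCLUDE"] = include_dir
    env["TORCHVISION_LIBRARY"] = library_dir
    return env


# Hypothetical prefix; replace it with wherever libpng/libjpeg are installed.
env = build_env("/opt/libpng16/include", "/opt/libpng16/lib")
# The resulting dict could then be passed as env=... to subprocess.run()
# when invoking the pip install command from a script.
```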

@jcwchen Could you check if libpng and libjpeg are available?

apt list --installed | grep -E "lib(png|jpeg)"
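
On systems without apt, a rough equivalent of this check can be sketched with the Python standard library. Note the assumption: ctypes.util.find_library only locates shared libraries for the runtime linker, so a hit does not prove that the development headers needed at compile time are present.

```python
from ctypes.util import find_library


def locate_image_libraries() -> dict:
    """Map each library name to the shared library found for it, or None."""
    return {name: find_library(name) for name in ("png", "jpeg")}


if __name__ == "__main__":
    for name, path in locate_image_libraries().items():
        print(f"lib{name}: {path if path else 'not found'}")
```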

@andfoy
Contributor

andfoy commented Jul 17, 2020

Hi @jcwchen, could you please try to compile torchvision using pip install -vvv git+https://github.com/pytorch/vision.git@86b6c3e22e9d7d8b0fa25d08704e6a31a364973b or python setup.py develop and post the logs here?

@andfoy andfoy self-assigned this Jul 17, 2020
@jcwchen
Author

jcwchen commented Jul 17, 2020

The recap is correct. I will check whether libpng and libjpeg are available later. (The machine is a CI)

@jcwchen
Author

jcwchen commented Jul 17, 2020

Hi @jcwchen, could you please try to compile torchvision using pip install -vvv git+https://github.com/pytorch/vision.git@86b6c3e22e9d7d8b0fa25d08704e6a31a364973b or python setup.py develop and post the logs here?

pip install -vvv git+https://github.com/pytorch/vision.git@86b6c3e22e9d7d8b0fa25d08704e6a31a364973b works without any error message.
I am trying python setup.py develop now.

@jcwchen
Author

jcwchen commented Jul 17, 2020

@pmeier Here is the result of apt list --installed | grep -E "lib(png|jpeg)":

+ apt list --installed
+ grep -E 'lib(png|jpeg)'

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

libjpeg-dev/now 8c-2ubuntu8 amd64 [installed,local]
libjpeg-turbo8/now 1.4.2-0ubuntu3.1 amd64 [installed,local]
libjpeg-turbo8-dev/now 1.4.2-0ubuntu3.1 amd64 [installed,local]
libjpeg8/now 8c-2ubuntu8 amd64 [installed,local]
libjpeg8-dev/now 8c-2ubuntu8 amd64 [installed,local]
libpng12-0/now 1.2.54-1ubuntu1.1 amd64 [installed,local]
libpng12-dev/now 1.2.54-1ubuntu1.1 amd64 [installed,local]

Besides, python setup.py develop failed. The error log is here:
log.txt

@pmeier
Collaborator

pmeier commented Jul 18, 2020

Since the error is related to libpng, I think we are on the right track. The log states

PNG found: True
libpng version: 1.2.54
libpng installed version is less than 1.6.0, disabling PNG support

so you can try to upgrade it. 1.6 is the default for Ubuntu 18.04.
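
The comparison behind the "less than 1.6.0" message can be illustrated as follows. This is a sketch of the idea, not torchvision's actual check; it assumes purely numeric dotted versions such as 1.2.54, so a packaging suffix like "-2" would have to be stripped first.

```python
def parse_version(text: str) -> tuple:
    """Turn a dotted version string such as '1.2.54' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(installed: str, minimum: str = "1.6.0") -> bool:
    """Return True if the installed version is at least the minimum."""
    # Tuples compare element by element, so (1, 2, 54) sorts below (1, 6, 0).
    return parse_version(installed) >= parse_version(minimum)


# The version reported in the log above is too old; the 1.6 series passes.
old_ok = meets_minimum("1.2.54")
new_ok = meets_minimum("1.6.20")
```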

@jcwchen
Author

jcwchen commented Jul 19, 2020

I installed libpng 1.6 on my Ubuntu machine, but I still encountered the same error:

libpng16-16/now 1.6.20-2 amd64 [installed,local]
+ pip install -q git+https://github.com/pytorch/vision.git
  Failed building wheel for torchvision
Command "/tmp/venv/bin/python3.6 -u -c "import setuptools, tokenize;__file__='/tmp/pip-req-build-be0irl6s/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-38ejzqcr/install-record.txt --single-version-externally-managed --compile --install-headers /tmp/venv/include/site/python3.6/torchvision" failed with error code 1 in /tmp/pip-req-build-be0irl6s/

@pmeier
Collaborator

pmeier commented Jul 20, 2020

@jcwchen this error message is not helpful. This minimal output happens because you use the -q (quiet) flag. For the purpose of debugging, please replace it with -vvv (extra extra verbose).

@jcwchen
Author

jcwchen commented Jul 20, 2020

Error message is here:

 [2/3] c++ -MMD -MF /tmp/pip-req-build-x6pynt8j/build/temp.linux-x86_64-3.6/tmp/pip-req-build-x6pynt8j/torchvision/csrc/cpu/image/readpng_cpu.o.d -pthread -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DPNG_FOUND=1 -DJPEG_FOUND=1 -I/tmp/pip-req-build-x6pynt8j/torchvision/csrc -I/tmp/pip-req-build-x6pynt8j/torchvision/csrc -I/tmp/pytorch/torch/include -I/tmp/pytorch/torch/include/torch/csrc/api/include -I/tmp/pytorch/torch/include/TH -I/tmp/pytorch/torch/include/THC -I/tmp/pip-req-build-x6pynt8j/torchvision/csrc/cpu/image -I/tmp/pytorch/torch/include -I/tmp/pytorch/torch/include/torch/csrc/api/include -I/tmp/pytorch/torch/include/TH -I/tmp/pytorch/torch/include/THC -I/usr/include/python3.6m -I/tmp/venv/include/python3.6m -c -c /tmp/pip-req-build-x6pynt8j/torchvision/csrc/cpu/image/readpng_cpu.cpp -o /tmp/pip-req-build-x6pynt8j/build/temp.linux-x86_64-3.6/tmp/pip-req-build-x6pynt8j/torchvision/csrc/cpu/image/readpng_cpu.o -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=image -D_GLIBCXX_USE_CXX11_ABI=1 -std=c++14
    FAILED: /tmp/pip-req-build-x6pynt8j/build/temp.linux-x86_64-3.6/tmp/pip-req-build-x6pynt8j/torchvision/csrc/cpu/image/readpng_cpu.o
    c++ -MMD -MF /tmp/pip-req-build-x6pynt8j/build/temp.linux-x86_64-3.6/tmp/pip-req-build-x6pynt8j/torchvision/csrc/cpu/image/readpng_cpu.o.d -pthread -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DPNG_FOUND=1 -DJPEG_FOUND=1 -I/tmp/pip-req-build-x6pynt8j/torchvision/csrc -I/tmp/pip-req-build-x6pynt8j/torchvision/csrc -I/tmp/pytorch/torch/include -I/tmp/pytorch/torch/include/torch/csrc/api/include -I/tmp/pytorch/torch/include/TH -I/tmp/pytorch/torch/include/THC -I/tmp/pip-req-build-x6pynt8j/torchvision/csrc/cpu/image -I/tmp/pytorch/torch/include -I/tmp/pytorch/torch/include/torch/csrc/api/include -I/tmp/pytorch/torch/include/TH -I/tmp/pytorch/torch/include/THC -I/usr/include/python3.6m -I/tmp/venv/include/python3.6m -c -c /tmp/pip-req-build-x6pynt8j/torchvision/csrc/cpu/image/readpng_cpu.cpp -o /tmp/pip-req-build-x6pynt8j/build/temp.linux-x86_64-3.6/tmp/pip-req-build-x6pynt8j/torchvision/csrc/cpu/image/readpng_cpu.o -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=image -D_GLIBCXX_USE_CXX11_ABI=1 -std=c++14
    In file included from /tmp/pip-req-build-x6pynt8j/torchvision/csrc/cpu/image/readpng_cpu.cpp:13:
    In file included from /usr/include/png.h:321:
    /usr/include/pngconf.h:383:12: error: unknown type name '__pngconf'
               __pngconf.h__ in libpng already includes setjmp.h;
               ^
    /usr/include/pngconf.h:383:21: error: cannot use dot operator on a type
               __pngconf.h__ in libpng already includes setjmp.h;
                        ^
    /usr/include/pngconf.h:384:12: error: unknown type name '__dont__'
               __dont__ include it again.;
               ^
    /usr/include/pngconf.h:384:28: error: expected ';' after top level declarator
               __dont__ include it again.;
                               ^
    /tmp/pip-req-build-x6pynt8j/torchvision/csrc/cpu/image/readpng_cpu.cpp:36:5: error: unknown type name 'png_const_bytep'; did you mean 'png_const_charp'?
        png_const_bytep ptr;
        ^~~~~~~~~~~~~~~
        png_const_charp
    /usr/include/pngconf.h:1333:31: note: 'png_const_charp' declared here
    typedef PNG_CONST char  FAR * png_const_charp;
                                  ^
    /tmp/pip-req-build-x6pynt8j/torchvision/csrc/cpu/image/readpng_cpu.cpp:38:16: error: use of undeclared identifier 'png_const_bytep'
      reader.ptr = png_const_bytep(datap) + 8;
                   ^
    6 errors generated.
    [3/3] c++ -MMD -MF /tmp/pip-req-build-x6pynt8j/build/temp.linux-x86_64-3.6/tmp/pip-req-build-x6pynt8j/torchvision/csrc/cpu/image/image.o.d -pthread -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DPNG_FOUND=1 -DJPEG_FOUND=1 -I/tmp/pip-req-build-x6pynt8j/torchvision/csrc -I/tmp/pip-req-build-x6pynt8j/torchvision/csrc -I/tmp/pytorch/torch/include -I/tmp/pytorch/torch/include/torch/csrc/api/include -I/tmp/pytorch/torch/include/TH -I/tmp/pytorch/torch/include/THC -I/tmp/pip-req-build-x6pynt8j/torchvision/csrc/cpu/image -I/tmp/pytorch/torch/include -I/tmp/pytorch/torch/include/torch/csrc/api/include -I/tmp/pytorch/torch/include/TH -I/tmp/pytorch/torch/include/THC -I/usr/include/python3.6m -I/tmp/venv/include/python3.6m -c -c /tmp/pip-req-build-x6pynt8j/torchvision/csrc/cpu/image/image.cpp -o /tmp/pip-req-build-x6pynt8j/build/temp.linux-x86_64-3.6/tmp/pip-req-build-x6pynt8j/torchvision/csrc/cpu/image/image.o -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=image -D_GLIBCXX_USE_CXX11_ABI=1 -std=c++14
    ninja: build stopped: subcommand failed.
    Traceback (most recent call last):
      File "/tmp/pytorch/torch/utils/cpp_extension.py", line 1519, in _run_ninja_build
        env=env)
      File "/usr/lib/python3.6/subprocess.py", line 438, in run
        output=stdout, stderr=stderr)
    subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1

The build log is here:
build_5310_step_106_container_0.txt

Hope it helps. Thank you.
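
Verbose logs like the one above are long, so it can help to pull out just the compiler diagnostics. The sketch below assumes the clang/gcc "file:line:col: error: message" format seen in this log; extract_errors is a made-up helper. In this case every diagnostic outside readpng_cpu.cpp points at /usr/include/pngconf.h, which may mean the compiler was still picking up older system libpng headers.

```python
import re

# Matches diagnostics such as
# "/usr/include/pngconf.h:383:12: error: unknown type name '__pngconf'"
ERROR_LINE = re.compile(
    r"^\s*(?P<file>\S+):(?P<line>\d+):(?P<col>\d+): error: (?P<msg>.*)$"
)


def extract_errors(log_text: str) -> list:
    """Return (file, line, message) tuples for each compiler error line."""
    errors = []
    for raw in log_text.splitlines():
        match = ERROR_LINE.match(raw)
        if match:
            errors.append(
                (match.group("file"), int(match.group("line")), match.group("msg"))
            )
    return errors


# A few lines copied from the log above.
sample = """\
    In file included from /usr/include/png.h:321:
    /usr/include/pngconf.h:383:12: error: unknown type name '__pngconf'
               __pngconf.h__ in libpng already includes setjmp.h;
    /usr/include/pngconf.h:384:12: error: unknown type name '__dont__'
"""
errors = extract_errors(sample)
```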

@andfoy
Contributor

andfoy commented Jul 20, 2020

@jcwchen, what happens if you install the latest master version?

@jcwchen
Author

jcwchen commented Jul 20, 2020

I just built one from the latest master version, but it still encountered the same error as my previous comment.

@jcwchen
Author

jcwchen commented Jul 21, 2020

Hi @pmeier and @andfoy,
After running apt install libturbojpeg, pip install git+https://github.com/pytorch/vision.git builds successfully.
Thank you for your assistance.

@pmeier
Collaborator

pmeier commented Jul 22, 2020

@jcwchen glad it worked out for you, but I think we should still debug the cause of this. The README states that libjpeg-turbo can be used as a replacement of libjpeg, but this shouldn't be required.

@andfoy Could it be that we need a minimum requirement for libjpeg as well? It seems the previously installed versions

libjpeg-dev/now 8c-2ubuntu8 amd64 [installed,local]
libjpeg-turbo8/now 1.4.2-0ubuntu3.1 amd64 [installed,local]
libjpeg-turbo8-dev/now 1.4.2-0ubuntu3.1 amd64 [installed,local]
libjpeg8/now 8c-2ubuntu8 amd64 [installed,local]
libjpeg8-dev/now 8c-2ubuntu8 amd64 [installed,local]

were not enough and they had to install libjpeg-turbo>2.

@pmeier pmeier reopened this Jul 22, 2020
@jainrahul1

Any update on this? I am also facing the same issue. Even after installing libjpeg-turbo-2.0.5 from source, the error is still present.

@andfoy
Contributor

andfoy commented Sep 16, 2020

@jainrahul1 Is your error related to libpng?

@jainrahul1

Yes, the error was related to libpng. However, re-installing libpng16-dev resolved it.

@fmassa
Member

fmassa commented Sep 22, 2020

@andfoy do you think we might be missing some extra checks during compilation to validate that the user has the right versions of libjpeg / libpng?
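
One possible shape for such a check is to read the version from the header the compiler will actually include, rather than from the installed runtime package. The sketch below is not an existing torchvision function; it assumes png.h defines the PNG_LIBPNG_VER_STRING macro, as libpng releases generally do, and it works on header text that the caller has already read from disk.

```python
import re
from typing import Optional

# Looks for a line such as: #define PNG_LIBPNG_VER_STRING "1.2.54"
VERSION_MACRO = re.compile(r'#\s*define\s+PNG_LIBPNG_VER_STRING\s+"([^"]+)"')


def header_version(header_text: str) -> Optional[str]:
    """Return the libpng version declared in the header text, or None."""
    match = VERSION_MACRO.search(header_text)
    return match.group(1) if match else None


# Illustrative header fragment; a real check would read e.g. /usr/include/png.h.
sample_header = '#define PNG_LIBPNG_VER_STRING "1.2.54"\n'
found = header_version(sample_header)
```

A build script could compare the returned string against its minimum and disable PNG support (or stop with a clear message) before any C++ file is compiled.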
