oneDNN 2 missing headerfiles #19690
Comments
Related code on MXNet side is at https://github.com/apache/incubator-mxnet/blob/91503f71e17fca9151779503fc9f5edefe26f2ef/tools/pip/setup.py#L148-L150 |
I've hit this error too. @bartekkuncer Is it possible to add the header to the pip wheel? Also pinging @szha |
In terms of the wheel, I think the last wheel that works is 20201214. |
@leezu I believe that changing L149 to: |
@bartekkuncer I hit this error when trying to install Horovod (adjust the CUDA version as needed):
python3 -m pip install -U --pre "mxnet-cu102==2.0.0b20201217" -f https://dist.mxnet.io/python
HOROVOD_GPU_OPERATIONS=NCCL HOROVOD_WITHOUT_GLOO=1 HOROVOD_WITH_MPI=1 HOROVOD_WITH_MXNET=1 HOROVOD_WITHOUT_PYTORCH=1 HOROVOD_WITHOUT_TENSORFLOW=1 python3 -m pip install --no-cache-dir horovod |
@bartekkuncer I recommend the following for verification:
|
Thanks @bartekkuncer! I opened #19694, as the weekend has already started in your timezone |
Horovod now fails with
|
Yes, I saw that, working on the fix. |
Thanks @bartekkuncer! |
@bartekkuncer looks like the |
The files included in are
I think we may need to update https://github.com/apache/incubator-mxnet/blob/3c5beb3596b6bc01f77bc7ddd14ed90221c31950/cd/mxnet_lib/static/Jenkins_pipeline.groovy#L36 to ensure that the config files are stashed correctly on the CD |
We might also consider making setup.py more robust by asserting that these header files exist, instead of only including them when they happen to be available. |
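To make the suggestion above concrete, here is a minimal sketch of such a check. It is not the actual MXNet setup.py code: the function name `assert_headers_exist` and the `REQUIRED_HEADERS` tuple are hypothetical, and the only header listed is the one named in the error reported in this issue. The idea is that packaging fails loudly when a generated header is missing, rather than silently producing an incomplete wheel.

```python
from pathlib import Path

# Assumption: the headers a downstream build (e.g. Horovod) needs.
# Only dnnl_config.h comes from the error message in this issue;
# extend the tuple with whatever else the build generates.
REQUIRED_HEADERS = ("oneapi/dnnl/dnnl_config.h",)


def assert_headers_exist(include_dir, required=REQUIRED_HEADERS):
    """Return the full header paths, or raise if any of them is missing.

    `include_dir` is the directory the headers are copied from
    (hypothetical here; the real location depends on the build layout).
    """
    include_dir = Path(include_dir)
    missing = [h for h in required if not (include_dir / h).is_file()]
    if missing:
        raise FileNotFoundError(
            "missing generated oneDNN headers under {}: {}".format(
                include_dir, ", ".join(missing)
            )
        )
    return [include_dir / h for h in required]
```

Calling something like this before the headers are copied into the package would turn a missing config file into a build-time failure on CD instead of a compile error for users of the wheel.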
Thanks @leezu. I changed it in CI but must have overlooked it in CD. |
It looks like there are still more issues with the CD. Horovod still fails with
|
Yes, I saw that. #19726 should fix the issue. |
Thank you @bartekkuncer! |
#19667 breaks Horovod (cf. horovod/horovod#2530) because some header files are missing from the pip wheel:
/usr/local/lib/python3.6/dist-packages/mxnet/include/mkldnn/oneapi/dnnl/dnnl.hpp:23:10: fatal error: oneapi/dnnl/dnnl_config.h: No such file or directory
cc @bartekkuncer