nvidia-caffe and nvidia-digits docker support for cuda8.0? #209
Comments
We will provide new CUDA 8.0 images eventually. In the meantime, see this comment
@3XX0 Thanks! I missed that thread when searching. Once I've built nvidia/caffe with CUDA 8.0, how should I tweak nvidia/digits?
Once you have the caffe image, the only thing you need to do is rebuild the digits one. You can change the FROM directive to point to your local caffe image. If you already tagged it with the same name (i.e.
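As a rough sketch of the step described above: assuming the digits Dockerfile has a single FROM line, and assuming the locally built caffe image was tagged `local/caffe:cuda8.0` (a hypothetical tag, not one given in this thread), a small shell helper could rewrite the base image before the rebuild. The function name `retarget_from` is also made up for illustration.

```shell
#!/bin/sh
# Sketch only: point an existing Dockerfile at a different base image.
# Assumes a single-stage Dockerfile, i.e. exactly one FROM line.

retarget_from() {
    # $1 = path to the Dockerfile
    # $2 = new base image (e.g. a locally built caffe image; hypothetical tag)
    # Every line starting with "FROM " is replaced; all other lines are untouched.
    sed "s|^FROM .*|FROM $2|" "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Possible usage (paths and tags are assumptions, adjust to your checkout):
#   retarget_from ubuntu-14.04/digits/4.0/Dockerfile local/caffe:cuda8.0
#   docker build -t local/digits:cuda8.0 ubuntu-14.04/digits/4.0
```

If the local caffe image was tagged with the same name the Dockerfile already references, the FROM line does not need to change and only the rebuild is required.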
@3XX0 I'm stuck on an error while running nvidia-docker/ubuntu-14.04/digits/4.0/Dockerfile:
Step 6 : RUN apt-get update && apt-get install -y --no-install-recommends --force-yes torch7-nv=0.9.99-1+cuda8.0 graphviz gcc libhdf5-dev digits=$DIGITS_PKG_VERSION && rm -rf /var/lib/apt/lists/*
After a couple of lines:
......Fetched 22.2 MB in 29s (756 kB/s)
I've tried:
I also tried the original parameter "torch7-nv=0.9.99-1+cuda7.5", but nothing changed.
@kertansul You need to add this line: https://github.com/NVIDIA/nvidia-docker/blob/master/ubuntu-14.04/cuda/7.5/runtime/cudnn5/Dockerfile#L4
But be careful that if you install
@flx42 Hi, I added the line before
Error 1: NO_PUBKEY F60F4B3D7FA2AF80
Error 2:
I'm guessing this is happening because I'm mixing up cuda8.0 and cuda7.5 ...
Test with the new images; they support CUDA 8.0 now. However, we don't have DIGITS 5.0 yet.
Hi, I'm using a GTX 1080 with nvidia-docker/digits and getting an error message while running AlexNet:
relu2 needs backward computation.
conv2 needs backward computation.
pool1 needs backward computation.
norm1 needs backward computation.
relu1 needs backward computation.
conv1 needs backward computation.
label_val-data_1_split does not need backward computation.
val-data does not need backward computation.
This network produces output accuracy
This network produces output loss
Network initialization done.
Solver scaffolding done.
Starting Optimization
Solving
Learning Rate Policy: step
Iteration 0, Testing net (#0)
Ignoring source layer train-data
Test net output #0: accuracy = 0.0999041
Test net output #1: loss = 2.30515 (* 1 = 2.30515 loss)
Check failed: status == CURAND_STATUS_SUCCESS (201 vs. 0) CURAND_STATUS_LAUNCH_FAILURE
I checked the NVIDIA/DIGITS GitHub repository, and the error seems to be related to CUDA 7.5:
NVIDIA/DIGITS#925
However, I want to use containers for deep learning frameworks.
Will NVIDIA update the Docker images for CUDA 8.0?
Or how can I build the nvidia-caffe and nvidia-digits Dockerfiles for CUDA 8.0?