(For the administrator instructions, please visit here.)
(Important) Before making any bookings, please contact your supervisor for approval. If you require space in the lab, you have the option to also book one of the hot desks through the same spreadsheet.
Note: You must ensure that the computers are booked for the duration you expect your training to be conducted. Any docker containers running outside of a booking are subject to being terminated if they interfere with future bookings.
The robotics lab houses these workstations. You can utilise these machines on-site if you have access to the lab.
Name | IP | OS | GPU Driver | CUDA | GPU | GPU Mem |
---|---|---|---|---|---|---|
P6000-1 (UOA370131) | 130.216.238.99 | Ubuntu 22.04 | 535.54.03 | 12.2 | Quadro P6000 | 24576MiB |
P6000-2 (UOA370132) | 130.216.239.69 | Ubuntu 22.04 | 525.125.06 | 12.0 | Quadro P6000 | 24576MiB |
P6000-3 (UOA370133) | 130.216.239.173 | Ubuntu 22.04 | 510.108.03 | 11.6 | Quadro P6000 | 24576MiB |
P6000-4 (UOA370142) | 130.216.238.182 | Ubuntu 22.04 | 525.125.06 | 12.0 | Quadro P6000 | 24576MiB |
Please refer to here
To connect to the workstations remotely, you will need an SSH client on your local machine, such as Terminal on Linux/macOS or Command Prompt on Windows. Your local machine must be connected to the UoA network via Wi-Fi or Ethernet cable. If you are off-campus, you will need to connect to the UoA VPN. If your Windows OS does not have OpenSSH installed, you can find instructions on how to add it here. Alternatively, you can use another SSH client, such as PuTTY.
To obtain an account for the workstations, please reach out to your supervisor. Once you have an active booking for the workstation, open your terminal and enter the following command.
# You should execute the following command on your local machine.
ssh username@IP-address-of-workstation
To connect to SSH without a password, you will need to set up SSH key authentication between the client machine (where you are connecting from) and the server machine (where you are connecting to). Here are the general steps to follow:
- Generate an SSH key pair on the client machine with the following command. This creates a public key file (usually named id_rsa.pub, or id_ed25519.pub with newer OpenSSH releases) and a matching private key file (id_rsa or id_ed25519) in the ~/.ssh/ directory.
# You should execute the following command on your local machine.
ssh-keygen
- Copy the public key to the server machine with the ssh-copy-id command. It prompts you for the server password and then adds your public key to the server's ~/.ssh/authorized_keys file.
# You should execute the following command on your local machine.
ssh-copy-id username@IP-address-of-workstation
- Test the connection. The following command should now log you into the server without prompting for a password.
# You should execute the following command on your local machine.
ssh username@IP-address-of-workstation
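Once key authentication works, a host alias can save retyping the address on every connection. The sketch below appends an entry to the OpenSSH client configuration file. The alias p6000-1 and the user name username are placeholders chosen for illustration, and the IP address is that of P6000-1 from the table above; adapt all three to your own booking.

```shell
# Run on your local machine. The alias "p6000-1" and the user "username"
# are placeholders for illustration; substitute your own values.
SSH_CONFIG="$HOME/.ssh/config"
mkdir -p "$HOME/.ssh"

# Append the entry only if the alias is not already defined.
if ! grep -q '^Host p6000-1$' "$SSH_CONFIG" 2>/dev/null; then
  cat >> "$SSH_CONFIG" <<'EOF'
Host p6000-1
    HostName 130.216.238.99
    User username
EOF
fi

# OpenSSH rejects a config file that is writable by other users.
chmod 600 "$SSH_CONFIG"
```

With this entry in place, `ssh p6000-1` is equivalent to `ssh username@130.216.238.99`.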
Using FileZilla Client:
- Download and install FileZilla on your local machine from the official website.
- Open FileZilla and click on the "Site Manager" button.
- In the Site Manager, click on the "New Site" button and enter the following details:
  - Host: IP address or hostname of the remote machine
  - Protocol: SFTP (SSH File Transfer Protocol)
  - Logon Type: Normal
  - User: Your username on the remote machine
  - Password: Your password on the remote machine
- Click on the "Connect" button to connect to the remote machine.
- Once connected, you will see two panels in FileZilla. The left panel shows the files on your local machine, while the right panel shows the files on the remote machine.
- To transfer files, simply drag and drop them from one panel to the other.
- You can monitor the progress of file transfers in the "Queued files" tab.
That's it! You can use FileZilla to transfer files between your local machine and the remote machine easily and securely.
Using SCP
- Open your terminal or command prompt.
- Type the following command to transfer a file from your local machine to the remote machine:
# You should execute the following command on your local machine.
scp /path/to/local/file username@remote:/path/to/remote/directory
Replace /path/to/local/file with the path to the local file you want to transfer, username with your username on the remote machine, remote with the IP address or hostname of the remote machine, and /path/to/remote/directory with the path to the remote directory where you want to transfer the file.
- Enter your password when prompted, and the file will be transferred.
That's it! You can use similar commands to transfer files from the remote machine to your local machine or to transfer directories and their contents.
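The "similar commands" mentioned above can be sketched as follows. The helper names, user name, and paths are hypothetical, and the IP address is that of P6000-1 from the table above. The helpers only print the scp command lines so you can review them before running anything; copy a printed line into your terminal to execute it.

```shell
# Run on your local machine.
# Placeholders for illustration; substitute your own account and workstation.
REMOTE_USER="username"
REMOTE_HOST="130.216.238.99"

# Print the command that downloads one remote file into a local directory.
download_file_cmd() {
  printf 'scp %s@%s:%s %s\n' "$REMOTE_USER" "$REMOTE_HOST" "$1" "$2"
}

# Print the command that uploads a local directory and its contents (-r).
upload_dir_cmd() {
  printf 'scp -r %s %s@%s:%s\n' "$1" "$REMOTE_USER" "$REMOTE_HOST" "$2"
}

download_file_cmd /home/username/data/results.csv .
upload_dir_cmd ./dataset /home/username/data
```

The -r option makes scp copy a directory recursively; swapping the order of the local and remote arguments reverses the direction of the transfer.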
- Locate the Container ID or Name:
  - Before copying anything, you need to know the ID or name of the Docker container you want to copy files to. You can list all running containers with the command:
    docker ps
  - This command displays all active containers along with their IDs and names.
- Use the docker cp Command:
  - The general syntax for the docker cp command is:
    docker cp <source-path> <container-id>:<destination-path>
  - Here, <source-path> is the path to the file or folder on the Ubuntu host machine, <container-id> is the ID or name of your Docker container, and <destination-path> is the path inside the container where you want to copy the file or folder.
- Example:
  - Suppose you have a folder named myfolder in your home directory (/home/username/myfolder) and you want to copy it to a container with the ID 12345abcde into the directory /app inside the container. The command would be:
    docker cp /home/username/myfolder 12345abcde:/app
  - This command copies myfolder into the /app directory of the container.
- Verify the Copy:
  - To ensure that the file or folder has been copied successfully, you can execute a command inside the container. For example:
    docker exec -it 12345abcde ls /app
  - This command lists the contents of the /app directory inside the container, where you should see your copied folder or file.
- Handling Permissions:
  - You might face permission issues depending on how the Docker container is set up. Ensure that the destination directory inside the container has the appropriate permissions for the operation.
- Copying Files from Container to Host:
  - If you need to copy files in the opposite direction (from the container to your host system), reverse the source and destination in the docker cp command.

Note that docker cp works with both running and stopped containers. The docker exec verification step, however, requires a running container; if yours is stopped, start it first using docker start <container-id>.
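Copying in the opposite direction, from a container to the host, can be sketched as below. The function name copy_from_container, the container ID, and the paths are all hypothetical; the only Docker call used is docker cp with the source and destination swapped.

```shell
# Run on the workstation.
# Usage: copy_from_container <container-id-or-name> <path-in-container> <host-dir>
copy_from_container() {
  # Create the host directory first so the copy lands inside it.
  mkdir -p "$3" && docker cp "$1:$2" "$3"
}

# Example (hypothetical container ID and paths):
#   copy_from_container 12345abcde /app/myfolder "$HOME/results"
```

Creating the destination directory beforehand keeps the result predictable: the copied file or folder ends up inside that directory rather than being renamed to it.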
The '/home/$USER/data' directory on the workstation is intended for sharing data between the workstation and containers. This directory is already mapped to a Docker volume, so its contents are accessible from Docker containers. You can use it to share training datasets between the workstation and containers. To use this volume, add the following option when starting a container:
--mount source=datastore,target=/data
To upload your training dataset to the workstation, use FileZilla or scp to transfer the files to the '/home/$USER/data' directory. Once the data is uploaded, you can access it from the '/data' directory within the container.
For instance, you can start a container by following the command below.
# You should execute the following command on the workstation.
docker run --rm -it --runtime=nvidia -v /dev/shm:/dev/shm --mount source=datastore,target=/data nvidia/cuda:11.6.2-devel-ubuntu20.04 /bin/bash
Upload any data to the '/home/$USER/data' directory, and you should be able to see those files in the '/data' folder within the container.
# You should execute the following command within the container.
ls /data
Please visit the official docs to see all the docker commands and their options.
To incorporate GPU resources within the container, include the following options. For more options, please refer to here.
--privileged --runtime=nvidia -v /dev/shm:/dev/shm -e NVIDIA_VISIBLE_DEVICES=all -e NVIDIA_DRIVER_CAPABILITIES=all
For instance, you can start a container by following the command below.
# You should execute the following command on the workstation.
docker run --rm -it --privileged --runtime=nvidia \
-v /dev/shm:/dev/shm \
-e NVIDIA_VISIBLE_DEVICES=all \
-e NVIDIA_DRIVER_CAPABILITIES=all \
nvidia/cuda:11.6.2-devel-ubuntu20.04 \
/bin/bash
To confirm whether your container has access to the GPU resources on the host machine, you can execute the command below:
# You should execute the following command within the container.
nvidia-smi
The expected output/messages should be as follows.
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.108.03 Driver Version: 510.108.03 CUDA Version: 11.6 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Quadro P6000 Off | 00000000:03:00.0 Off | Off |
| 26% 28C P8 9W / 250W | 200MiB / 24576MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
You can use any Docker image that supports GPUs. The following examples show how to run some common images.
PyTorch
docker run --rm -it --privileged --runtime=nvidia \
-v /dev/shm:/dev/shm \
-e NVIDIA_VISIBLE_DEVICES=all \
-e NVIDIA_DRIVER_CAPABILITIES=all pytorch/pytorch /bin/bash
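Beyond nvidia-smi, it can be useful to confirm that the framework itself sees the GPU. The sketch below wraps a one-off container run in a hypothetical helper named check_torch_gpu; it assumes the pytorch/pytorch image tag you pull ships a CUDA-enabled build, which is worth verifying for the specific tag you use.

```shell
# Run on the workstation.
# Prints True when PyTorch inside the container can see a CUDA device.
check_torch_gpu() {
  docker run --rm --runtime=nvidia \
    -e NVIDIA_VISIBLE_DEVICES=all \
    -e NVIDIA_DRIVER_CAPABILITIES=all \
    pytorch/pytorch \
    python -c 'import torch; print(torch.cuda.is_available())'
}

# Example:
#   check_torch_gpu
```

If the helper prints False while nvidia-smi works in the same container setup, the image's CUDA build and the workstation's driver version (see the table above) are the first things to compare.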
TensorFlow
docker run --rm -it --privileged --runtime=nvidia \
-v /dev/shm:/dev/shm \
-e NVIDIA_VISIBLE_DEVICES=all \
-e NVIDIA_DRIVER_CAPABILITIES=all tensorflow/tensorflow /bin/bash
TensorFlow-Jupyter
docker run --rm -it --privileged --runtime=nvidia \
--network host \
-v /dev/shm:/dev/shm \
-e NVIDIA_VISIBLE_DEVICES=all \
-e NVIDIA_DRIVER_CAPABILITIES=all tensorflow/tensorflow:latest-gpu-jupyter
Miniconda
docker run --rm -it --privileged --runtime=nvidia \
-v /dev/shm:/dev/shm \
-e NVIDIA_VISIBLE_DEVICES=all \
-e NVIDIA_DRIVER_CAPABILITIES=all continuumio/miniconda3 /bin/bash
To list the running containers, execute the docker ps command:
# You should execute the following command on the workstation.
docker ps
To include all the containers present on your Docker host, including stopped ones, append the -a option:
# You should execute the following command on the workstation.
docker ps -a
To stop one or more running Docker containers, use the docker stop command:
# You should execute the following command on the workstation.
docker stop container-name
To start stopped containers:
# You should execute the following command on the workstation.
docker start container-name
To kill containers that do not respond to docker stop:
# You should execute the following command on the workstation.
docker kill container-name
If you want to reconnect to a container, start it (if it is not running) and execute an interactive shell.
# You should execute the following command on the workstation.
docker exec -it container-name /bin/bash
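The reconnect steps above can be combined into one helper. This is a minimal sketch with a hypothetical function name, reconnect; it uses docker inspect with a format string to read the container's running state, starts the container only when needed, and then opens a shell.

```shell
# Run on the workstation.
# Usage: reconnect <container-name>
reconnect() {
  # docker inspect prints "true" when the container is running.
  running="$(docker inspect -f '{{.State.Running}}' "$1" 2>/dev/null)"
  if [ "$running" != "true" ]; then
    docker start "$1" || return 1
  fi
  docker exec -it "$1" /bin/bash
}

# Example:
#   reconnect container-name
```

The helper assumes the image provides /bin/bash, as the examples in this document do.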
Creating a Docker volume for persistent data makes it possible to share data between the workstation and containers.
In the following example, we create a volume mapped to the /home/$USER/mydata path. Once this volume is mounted in a container, data can be shared between the workstation and every container that mounts it.
# You should execute the following command on the workstation.
mkdir /home/$USER/mydata
docker volume create --name mydatastore --opt type=none --opt device=/home/$USER/mydata --opt o=bind
To mount the volume in your container:
# You should execute the following command on the workstation.
docker run -it --runtime=nvidia -v /dev/shm:/dev/shm --mount source=mydatastore,target=/mydata nvidia/cuda:11.6.2-devel-ubuntu20.04 /bin/bash
The volume (mydatastore) is mounted on /mydata in the container.
# You should execute the following command within the container.
ls /mydata
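Running the volume setup twice fails because the directory and the volume already exist. A repeatable variant can be sketched as below; the function name ensure_bind_volume is hypothetical, and it relies on docker volume inspect returning a non-zero exit status for a missing volume.

```shell
# Run on the workstation.
# Usage: ensure_bind_volume <volume-name> <host-dir>
ensure_bind_volume() {
  # -p makes mkdir succeed even when the directory already exists.
  mkdir -p "$2" || return 1
  # Create the volume only when docker does not know it yet.
  if ! docker volume inspect "$1" >/dev/null 2>&1; then
    docker volume create --name "$1" \
      --opt type=none --opt device="$2" --opt o=bind
  fi
}

# Example:
#   ensure_bind_volume mydatastore "/home/$USER/mydata"
```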
With Visual Studio Code, you can develop code inside a remote container just as easily as on your local machine, and also transfer files back and forth. Follow the instructions in this guide to get started using Visual Studio Code for remote container development.
Simply restart your container.
# You should execute the following command on the workstation.
docker restart container-name
docker exec -it container-name /bin/bash
Then run 'nvidia-smi' within your container.
# You should execute the following command within the container.
nvidia-smi
Verify whether the IP address has been altered. If it has, kindly modify the readme file accordingly and create a pull request (PR).
If the network manager is not accessible on the workstation, please get in touch with your supervisor to execute the following command with sudo permission:
sudo systemctl restart NetworkManager.service
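To check whether a workstation's IP address still matches the table at the top of this document, something like the following sketch can be run on the workstation itself. The helper ip_listed is hypothetical, the expected address shown is that of P6000-1, and hostname -I is assumed to be available, as it is on Ubuntu.

```shell
# Run on the workstation.
# Succeeds when address $1 appears in the space-separated list $2.
ip_listed() {
  case " $2 " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

EXPECTED_IP="130.216.238.99"   # documented address of P6000-1
# hostname -I prints every address currently assigned to this host.
CURRENT_IPS="$(hostname -I 2>/dev/null || true)"

if ip_listed "$EXPECTED_IP" "$CURRENT_IPS"; then
  echo "IP unchanged: $EXPECTED_IP"
else
  echo "IP differs from the README; current addresses: $CURRENT_IPS"
fi
```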