General Server Guide

shuyijia edited this page Oct 30, 2023 · 20 revisions

Table of Contents

  1. Moving .conda to Project Space
  2. Creating Command-line Shortcuts
  3. Remote Development
  4. Mounting the Remote Server Locally
  5. Installing Graphviz without sudo

Moving .conda to Project Space

On a computing cluster, every account is assigned a home directory. On NERSC, the home directory is capped at 40GB while it is 10GB on PACE. This becomes a problem when you have a handful of conda environments and they take up more space than your home directory has.

What is .conda

.conda is a hidden folder in your home directory (not the filesystem root). You can view it by running ls -l -a ~/

On some computing systems, the installation location of conda is not writable by the current user. This happens most often on centralized computing clusters (such as NERSC and PACE) shared by many users. However, when creating a new environment, conda needs a user-writable location to store the package cache and the files for that environment. When the installation location is not writable, conda writes to the ~/.conda folder instead.
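
To see how much space .conda currently takes before moving anything, a quick check along these lines can help. This is a minimal sketch: the helper name dir_size is hypothetical, and it relies only on the standard du utility.

```shell
# Hypothetical helper (not part of conda): print the total size of a directory,
# or a short note if the directory does not exist.
dir_size() {
  local dir="$1"
  if [ -d "$dir" ]; then
    du -sh "$dir"      # -s: one summary line, -h: human-readable units
  else
    echo "$dir does not exist"
  fi
}

dir_size "$HOME/.conda"
```

Comparing this number against the home directory quota shows whether the move described in the next section is worth doing.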

Create Symbolic Link for .conda

Since the project directory usually has a lot more free space, we can move the .conda folder to the project space and create a symbolic link to it in our home directory. Conda keeps using the ~/.conda path, but the environments stored there now take up space in the project directory instead of the home directory.

Example on NERSC

On NERSC, our project directory is /global/cfs/projectdirs/m3641/. If you have not done so yet, create a folder in it using your name.

  1. If a .conda directory exists in your home directory, move it to your folder in the project directory by running mv ~/.conda /global/cfs/projectdirs/m3641/YOURNAME/ (the trailing slash matters: the target is your own folder, not its parent). If there is no .conda directory, create one in the project directory by running mkdir /global/cfs/projectdirs/m3641/YOURNAME/.conda
  2. Create a symbolic link to the .conda directory in the project space by running ln -s /global/cfs/projectdirs/m3641/YOURNAME/.conda ~/.conda

If you do ls -l -a ~/, you will see the following:

lrwxrwxrwx    1 username username     43 Oct 11 15:26 .conda -> /global/cfs/projectdirs/m3641/YOURNAME/.conda/

The arrow -> shows that our .conda is actually pointing to the location in the project space.
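
The two steps above can be wrapped in one function that refuses to run twice. This is a sketch under stated assumptions: relocate_dir is a hypothetical helper name, and YOURNAME remains a placeholder for your own folder.

```shell
# Hypothetical helper: move a directory under a new parent and leave a symlink behind.
# Usage: relocate_dir <source_dir> <target_parent>
relocate_dir() {
  local src="$1" parent="$2"
  local dest="$parent/$(basename "$src")"

  if [ -L "$src" ]; then
    echo "$src is already a symbolic link; nothing to do"
    return 0
  fi
  if [ -d "$src" ] && [ -e "$dest" ]; then
    echo "both $src and $dest exist; resolve this manually" >&2
    return 1
  fi

  mkdir -p "$parent"
  if [ -d "$src" ]; then
    mv "$src" "$dest"        # existing directory: move it
  else
    mkdir -p "$dest"         # no directory yet: create it at the target
  fi
  ln -s "$dest" "$src"       # link from the old location to the new one
}

# Example call (YOURNAME is a placeholder, as in the steps above):
# relocate_dir "$HOME/.conda" /global/cfs/projectdirs/m3641/YOURNAME
```

The early return on an existing link makes the function safe to call again, for example from a setup script shared within the group.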

Creating Command-line Shortcuts

Functions

We can make life a lot easier with command-line shortcuts. For instance, to gain an interactive computing session on NERSC, we would need to type in the following:

salloc --nodes 1 --qos interactive --time 04:00:00 --constraint gpu --gpus 4 --account=m3641_g

We can save this as a bash function to ~/.bashrc.

Note that both NERSC and PACE use the bash shell, so the file to modify is ~/.bashrc. It is best to first check which shell you are using by running echo $SHELL.

To edit ~/.bashrc, we can open it by

  • vim ~/.bashrc on NERSC (check out this vim cheatsheet),
  • nano ~/.bashrc on PACE (check out this nano cheatsheet),
  • Remote development using VS Code for both NERSC and PACE.

At the bottom of ~/.bashrc, we can add

four_gpus(){
salloc --nodes 1 --qos interactive --time 04:00:00 --constraint gpu --gpus 4 --account=m3641_g
}

Here, four_gpus is the name of the bash function we created. After saving and closing ~/.bashrc, we need to reload it by

source ~/.bashrc

This is needed because .bashrc is read only when a new shell starts, so changes do not reach the shell that is already running. You will not need to run this command after your next login.

Now, if you type four_gpus and press Enter, you will see that you are requesting an interactive session.
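
Because a bash function accepts arguments, the same idea can cover more than one request size. The variant below is a sketch, not a NERSC-provided command: gpu_session is a hypothetical name, and the defaults simply mirror the salloc line used above.

```shell
# Hypothetical variant of four_gpus: GPU count and time limit are optional arguments.
# Usage: gpu_session [num_gpus] [hh:mm:ss]
gpu_session() {
  local gpus="${1:-4}"           # default: 4 GPUs, as in four_gpus
  local limit="${2:-04:00:00}"   # default: 4 hours, as in four_gpus
  salloc --nodes 1 --qos interactive --time "$limit" \
         --constraint gpu --gpus "$gpus" --account=m3641_g
}
```

With this in ~/.bashrc, gpu_session 2 01:00:00 would request two GPUs for one hour, while gpu_session with no arguments behaves like four_gpus.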

Aliases

Aliases are another useful way to shorten commands. An alias can be created by adding the following line to your ~/.bashrc:

alias l='ls -l -a'

This means that we are creating a new command l which is equivalent to the command ls -l -a.

Since we often need to access certain paths in our project space, we can add the following to ~/.bashrc:

alias datasets="cd /path/to/dataset/folder/"
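
An alias is a fixed text substitution, so it cannot place an argument in the middle of a path. When several folders under the same project space are visited often, a small function may be more convenient. This is a sketch: proj and PROJECT_DIR are hypothetical names, and the path is a placeholder.

```shell
# Hypothetical helper: jump to a sub-folder of the project space.
# PROJECT_DIR is a placeholder; point it at your own folder.
PROJECT_DIR="${PROJECT_DIR:-/path/to/project/folder}"

proj() {
  cd "$PROJECT_DIR/$1" || return 1   # return an error if the folder is missing
}
```

After reloading ~/.bashrc, proj datasets changes into the datasets folder under PROJECT_DIR, and proj with no argument goes to PROJECT_DIR itself.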

Remote Development

There are many options to do remote development using an IDE. Here we include instructions for setting up a couple of them. If you use a different one than listed, please update this page with the steps needed.

PyCharm (JetBrains Gateway)

PyCharm (Professional version) is available for free using your Georgia Tech email.

Pre-requisite:

  1. Install PyCharm
  2. Open PyCharm and ensure that the Remote Development Gateway Plugin is enabled

Set up Remote Development:

  1. From the PyCharm welcome screen (or from the File menu if another project is already open), select Remote Development
  2. Under Collections, select SSH then New Connection
  3. Configure the remote server connection parameters: Username, Host, and specify the private key generated by sshproxy.sh script: ~/.ssh/nersc
  4. Click Check Connection and Continue
  5. Specify the GitHub project directory within your folder in the group project space.
  6. Click Start IDE and Connect. PyCharm will download the IDE backend and launch JetBrains Client. By default the backend is downloaded to .cache/ in your home directory; if there is not enough space there, you can customize the installation path in the previous step.

After setting up the connection, the configuration should look something like this:

[Screenshot: remote connection configuration in PyCharm]

For more info, see here

VS Code

You can do remote development through the VS Code Remote Development Extension Pack.

Pre-requisite

  1. Install Visual Studio Code
  2. Install the VS Code Remote Development Extension Pack
  3. Set up ~/.ssh/config

To set up ~/.ssh/config, simply open the file using any editor and add a remote server to the bottom of the file according to the following format:

Host <server-name>
  User <username>
  Hostname <url or ip address>
  IdentityFile <path to ssh key for passwordless connection>

The following is an example based on NERSC Perlmutter:

Host pmutter
  User <your-username>
  Hostname perlmutter-p1.nersc.gov
  IdentityFile ~/.ssh/nersc
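
If the same entry has to be added on several machines, appending it from the shell avoids duplicate Host blocks. The function below is a sketch under stated assumptions: add_ssh_host is a hypothetical helper, and it writes only the four keywords shown in the format above.

```shell
# Hypothetical helper: append a Host block to an SSH config file
# unless a block with the same alias is already there.
# Usage: add_ssh_host <config_file> <alias> <user> <hostname> <identity_file>
add_ssh_host() {
  local cfg="$1" name="$2" user="$3" host="$4" key="$5"
  mkdir -p "$(dirname "$cfg")"
  touch "$cfg"
  if grep -qxF "Host $name" "$cfg"; then
    echo "Host $name already present in $cfg"
    return 0
  fi
  # The leading blank line separates the new block from earlier entries.
  printf '\nHost %s\n  User %s\n  Hostname %s\n  IdentityFile %s\n' \
    "$name" "$user" "$host" "$key" >> "$cfg"
}

# Example call (replace <your-username> with your own):
# add_ssh_host "$HOME/.ssh/config" pmutter '<your-username>' perlmutter-p1.nersc.gov '~/.ssh/nersc'
```

The identity file is passed in single quotes so that the literal ~ ends up in the config file, matching the example above.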

Using Remote Development

Once the Remote Development extension pack is installed, open the VS Code command palette by pressing

  • Shift + Command + P on Mac
  • Shift + Ctrl + P on Windows/Linux

In the command palette, search for Remote-SSH: Connect to Host.... Press Enter and you will see a list of available servers as indicated in your ~/.ssh/config file.

Choose the server you wish to connect to. If password-less connection is not enabled, VS Code will prompt you to enter your password. You can then click on the File Explorer in the sidebar to open a directory or project (the password is requested one more time).

Mounting the Remote Server Locally

One option for mounting the file system locally is to use rclone with sftp/ssh.

Setup:

  1. Install rclone. M1 Mac users: see the installation steps below.

  2. Run:

    rclone config
    
  3. Follow steps to set up the config for the remote server using sftp. Afterwards, config should be similar to this:

    Options:
    - type: sftp
    - host: perlmutter-p1.nersc.gov
    - user: eisenach
    - key_file: /Users/seisenach3/.ssh/nersc
    - idle_timeout: 20m0s
    

    Note: the path to key_file must be absolute. Also, set idle_timeout in the advanced config settings if you'd like a longer timeout (the default is 1 minute).

  4. Check that the connection works:

    rclone lsd <server_name>:
    
  5. Create aliases for mounting and unmounting the server:

    alias mount_remote="rclone mount <server_name>:/global/cfs/projectdirs/m3641/ <path_to_local_files> --volname nersc_server --daemon"
    alias unmount_remote="umount <path_to_local_files>"
    

    The --daemon flag runs the process in the background. On Windows, only foreground mode is available, so this flag will be ignored.
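
rclone mount expects the local mount point to exist already, so a function that creates it first can save a failed attempt. This is a sketch that reuses the flags from the aliases above; the function arguments stand in for <server_name> and <path_to_local_files>.

```shell
# Hypothetical function versions of the aliases above.
# Usage: mount_remote <server_name> <path_to_local_files>
mount_remote() {
  local server="$1" mount_point="$2"
  mkdir -p "$mount_point"    # make sure the local mount point exists
  rclone mount "$server:/global/cfs/projectdirs/m3641/" "$mount_point" \
    --volname nersc_server --daemon
}

# Usage: unmount_remote <path_to_local_files>
unmount_remote() {
  umount "$1"
}
```

Here <server_name> is whatever name was chosen during rclone config, the same one used with rclone lsd in step 4.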

M1 Mac Installation:

  1. Set up FUSE
    brew install --cask macfuse
    
  2. Enable support for third-party kernel extensions
  3. Install rclone using the fuse-dependent rclone formula:
    brew install gromgit/fuse/rclone-mac
    
  4. Point rclone to rclone-mac by adding it in the ~/.zshrc:
    PATH="/opt/homebrew/opt/rclone-mac/libexec/rclone:$PATH"
    
  5. Follow steps above

Installing Graphviz without sudo

Graphviz is open source graph visualization software. Among many other uses, it is needed to generate the computational graph in PyTorch. This guide outlines the steps for installing Graphviz on servers without root/sudo access.

Download and Install

We need to download the source packages from this link. Typically, you can simply right-click on a stable release link with a '.tar.gz' extension and select 'Copy Link Address'. Here, I am using graphviz-9.0.0.tar.gz as an example.

  1. Download
    wget https://gitlab.com/api/v4/projects/4207231/packages/generic/graphviz-releases/9.0.0/graphviz-9.0.0.tar.gz
    
  2. Extract
    tar -xf graphviz-9.0.0.tar.gz
    
  3. Make a folder for Graphviz installation
    mkdir ~/.graphviz
    
  4. Go to the extracted Graphviz folder
    cd graphviz-9.0.0
    
  5. Configure the installation
    ./configure --prefix="$HOME/.graphviz"
    
  6. Install
    make
    make install
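
The six steps can also be chained so that a failure in one step stops the rest. The sketch below makes two assumptions that are labeled in the comments: both function names are hypothetical, and the URL pattern is copied from the 9.0.0 link above and only assumed to hold for other versions.

```shell
# Hypothetical helper: build the download URL for a release.
# The pattern comes from the 9.0.0 link above; other versions are assumed to follow it.
graphviz_url() {
  local v="$1"
  echo "https://gitlab.com/api/v4/projects/4207231/packages/generic/graphviz-releases/$v/graphviz-$v.tar.gz"
}

# Hypothetical helper: run the download, extract, configure, and install steps in order.
# Usage: install_graphviz [version] [prefix]
install_graphviz() {
  local v="${1:-9.0.0}" prefix="${2:-$HOME/.graphviz}"
  wget "$(graphviz_url "$v")" &&
  tar -xf "graphviz-$v.tar.gz" &&
  mkdir -p "$prefix" &&
  ( cd "graphviz-$v" &&          # subshell: the caller's directory is unchanged
    ./configure --prefix="$prefix" &&
    make &&
    make install )
}
```

The prefix is built from $HOME rather than ~ because configure expects an absolute path, and bash does not expand a tilde that follows --prefix=.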
    

Add to PATH

Open ~/.bashrc in any editor of your choice (e.g. nano ~/.bashrc) and add the following two lines:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/.graphviz/lib
export PATH="$HOME/.graphviz/bin:$PATH"

Remember to do source ~/.bashrc to reload the file.

Verify Installation

We can check the installed Graphviz version by

dot -V

To run a simple test, do

echo 'digraph { a -> b }' | dot -Tsvg > output.svg

A file named output.svg should be generated. If so, you are all set!
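
The same check can be scripted so that it reports a missing dot binary instead of silently writing an empty file. This is a minimal sketch; check_graphviz is a hypothetical helper name, and the graph is the same two-node example used above.

```shell
# Hypothetical helper: render a two-node graph and confirm a non-empty SVG was written.
# Usage: check_graphviz [output_file]
check_graphviz() {
  local out="${1:-output.svg}"
  if ! command -v dot >/dev/null 2>&1; then
    echo "dot not found on PATH" >&2
    return 1
  fi
  echo 'digraph { a -> b }' | dot -Tsvg > "$out" || return 1
  [ -s "$out" ]    # succeeds only if the file exists and is not empty
}
```

Running check_graphviz && echo "Graphviz works" prints the message only when the render succeeded.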