HOWTO: Alces HPC Puppet Modules

Steve Norledge edited this page May 14, 2015 · 5 revisions

Build a simple HPC cluster with GridEngine using the Alces Symphony deployment suite and the Alces HPC Puppet module.

Assumptions

  • 1 headnode
  • Network consists of:
      • 1 management / build network, holding all BMC IPs and infrastructure IPs, and used as the primary management interface for each machine (eth1, 10.78.0.0/16 on all VMs in this example)
      • 1 private cluster LAN for cluster storage / MPI traffic (eth0, 10.10.0.0/16 in this example)
  • 1..* login nodes
  • 1..* compute nodes
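
The two example networks above can be told apart by their first two octets. The following is a minimal sketch of a check that an address sits on the expected /16; the helper name in_slash16 and the sample addresses are assumptions made for illustration, not part of the Alces tooling.

```shell
# Hypothetical helper: succeed if IPv4 address $1 starts with the two-octet
# prefix $2, i.e. lies inside that /16 network.
in_slash16() {
  case "$1" in
    "$2".*) return 0 ;;
    *)      return 1 ;;
  esac
}

# Prefixes taken from the example layout; the host addresses are made up.
in_slash16 10.78.0.15 10.78 && echo "10.78.0.15 is on the management / build network"
in_slash16 10.10.0.15 10.10 && echo "10.10.0.15 is on the private cluster LAN"
```

On a real machine the address to test could come from `ip -4 -o addr show dev eth1`.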

Pre-requisites

  • Prepare the VMs. In this example we use standard VM containers with 2 virtual interfaces (eth0, eth1):
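# Tail appended to the VM and disk image names, passed as the first argument
name=$1
if [ -z "$name" ]; then
  echo "Please specify tail" >&2
  exit 1
fi

# Refuse to overwrite an existing disk image
if [ -f "/opt/vm/headnode1.$name.qcow2" ]; then
  echo "$name already exists" >&2
  exit 1
fi

qemu-img create -f qcow2 "/opt/vm/headnode1.$name.qcow2" 32G

# Make the libvirt storage pool "local" pick up the new image
virsh pool-refresh local
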
virt-install --connect qemu:///system \
    --virt-type kvm \
    --pxe \
    --network bridge=build.sfn2 \
    --network bridge=build.sfn \
    --os-variant rhel6 \
    --name headnode1.$name \
    --disk vol=local/headnode1.$name.qcow2,format=qcow2 \
    --ram 4096 \
    --vcpus=2 \
    --check-cpu \
    --hvm \
    --graphics=vnc,keymap=en-gb \
    --accelerate
  • Add DNS management for your private cluster LAN in Cobbler: add the domain tail to the relevant manage_forward_zones and manage_reverse_zones entries in Cobbler's /etc/cobbler/settings file, then generate the zone templates and place them in /etc/cobbler/zone_templates. You can base them on the original files in that folder or on /etc/cobbler/zone.template.
  • Add DNS management for any supplementary LANs (e.g. ib.mycluster.local) in /etc/cobbler/settings and place the matching zone file in zone_templates
  • Prepare the alces-hpc puppet environment
mkdir -p /etc/puppet/environments/alceshpc/; cd /etc/puppet/environments/alceshpc/
curl https://raw.githubusercontent.com/ste78/puppet-alcesbase/symphony/autoclone > autoclone
/bin/bash autoclone
  • Import the additional Alces HPC repos and modify the Symphony hiera configuration to enable those repos at installation for all intended machines
#On symphony repo
/var/lib/symphony/repo/bin/import_pulp.sh el6alceshpc
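
Before moving on, it can help to confirm that the DNS zones from the Cobbler step above really are listed in /etc/cobbler/settings. The following is a minimal sketch under stated assumptions: zone_listed is a hypothetical helper, mycluster.local stands in for your own domain tail, and the check is a text-level match on the manage_forward_zones / manage_reverse_zones lists (inline or block YAML layout), not a full YAML parse.

```shell
# Hypothetical helper: succeed if zone $2 appears in the list stored under
# top-level key $1 in Cobbler settings file $3 (default /etc/cobbler/settings).
zone_listed() {
  local key=$1 zone=$2 file=${3:-/etc/cobbler/settings}
  # Turn brackets, commas and quotes into spaces so that inline and block
  # lists both reduce to whitespace-separated words.
  sed "s/[][,\"']/ /g" "$file" | awk -v key="$key" -v zone="$zone" '
    /^[A-Za-z_]/ {                        # a top-level key starts here
      inlist = (index($0, key ":") == 1)
      if (inlist) $0 = substr($0, length(key) + 2)
    }
    inlist {
      sub(/^[ \t]*-[ \t]*/, "")           # drop a block-list dash
      for (i = 1; i <= NF; i++)
        if (($i "") == (zone "")) found = 1   # force string comparison
    }
    END { exit !found }
  '
}

# Only meaningful on the Cobbler server; the zone name is a placeholder.
if [ -r /etc/cobbler/settings ]; then
  zone_listed manage_forward_zones mycluster.local \
    || echo "mycluster.local is not in manage_forward_zones" >&2
fi
```

After editing the settings and zone templates, `cobbler sync` regenerates the DNS configuration.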

Headnode

  • Add a headnode to cobbler:

  • Create a <hostname>.yaml hiera override file on symphony-director:

cat << EOF > /etc/puppet/environments/symphony/hieradata/headnode1.yaml
symphonyrepo::yum::enable_symphonyrepos:
 - epel
 - symphony
 - alceshpc-base
 - alceshpc-extras
EOF
  • Copy symphony keys and config to the new headnode
scp /root/.ssh/config headnode1:/root/.ssh/.
scp /root/.ssh/id_symphony* headnode1:/root/.ssh/.
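
Since the repos are meant to be enabled for all intended machines, the same override will probably be wanted for login and compute nodes too. The following is a sketch of a small wrapper around the heredoc above; write_repo_override and the node names in the example are hypothetical, and it assumes every node needs the same four repos as headnode1.

```shell
# Hypothetical helper: write the repo-enabling hiera override for host $1
# into directory $2 (default: the symphony hieradata directory used above).
write_repo_override() {
  local host=$1
  local dir=${2:-/etc/puppet/environments/symphony/hieradata}
  cat << EOF > "$dir/$host.yaml"
symphonyrepo::yum::enable_symphonyrepos:
 - epel
 - symphony
 - alceshpc-base
 - alceshpc-extras
EOF
}

# Example, run on symphony-director (node names are placeholders):
# for host in login1 node01 node02; do write_repo_override "$host"; done
```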

Configure Headnode

Perform all of the following commands on the newly installed headnode.

  • Create alces_stack file
cat << EOF > /etc/alces_stack.yaml
---
:machine: headnodemachine
:role: slave
:hostname: `hostname -s`
EOF
  • Run puppet
rm -f /etc/yum.repos.d/symphony.repo
puppet agent -t --environment symphony
puppet agent -t --environment alceshpc
  • Install the symphony client tools
curl -L "https://raw.githubusercontent.com/alces-software/clusterware/master/scripts/bootstrap" | /bin/bash
bash -l
alces facility install stack https://github.com/alces-software/clusterware-stack
alces facility install packager https://github.com/alces-software/clusterware-packager
  • Run SGE init
/var/lib/alces/bin/init-gridscheduler.sh
  • Reboot the headnode

  • Re-run the gridscheduler install script

service qmaster.alces-gridscheduler start
/var/lib/alces/bin/init-gridscheduler.sh
  • Configure a /etc/genders file
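
This page does not say what the genders file should contain. As a rough illustration of the general genders format only (one node per line, followed by comma-separated attributes), the sketch below prints a candidate file. The helper name, the node names and the attribute names (headnode, login, compute, cluster) are all placeholders; check what the Alces modules expect before using them.

```shell
# Hypothetical generator: print genders lines for one headnode, a list of
# login nodes and a list of compute nodes. Attribute names are placeholders,
# not taken from the Alces modules.
print_genders() {
  local head=$1 logins=$2 computes=$3 node
  echo "$head headnode,cluster"
  for node in $logins;   do echo "$node login,cluster";   done
  for node in $computes; do echo "$node compute,cluster"; done
}

# Prints to stdout; redirect to /etc/genders once names and attributes
# match your site.
print_genders headnode1 "login1" "node01 node02"
```

If the genders tools are installed, `nodeattr -n compute` should then list the compute nodes.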

Compute node

  • Create alces_stack file
cat << EOF > /etc/alces_stack.yaml
---
:machine: computemachine
:role: slave
:hostname: `hostname -s`
EOF
  • Run puppet
rm -f /etc/yum.repos.d/symphony.repo
puppet agent -t --environment symphony
puppet agent -t --environment alceshpc
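
When scripting these runs across many compute nodes, note that `puppet agent -t` implies `--detailed-exitcodes`, so a successful run that applied changes exits with 2 rather than 0. The following is a sketch of a wrapper that treats 0 and 2 as success and stops before the second environment if the first run fails; both function names are hypothetical.

```shell
# Hypothetical helper: succeed if puppet exit code $1 means the run worked.
puppet_run_ok() {
  case "$1" in
    0|2) return 0 ;;   # 0 = no changes, 2 = changes applied
    *)   return 1 ;;   # 1, 4, 6 = run failed or some resources failed
  esac
}

# Hypothetical wrapper: run both environments in order, stopping on failure.
run_both_environments() {
  local env rc
  for env in symphony alceshpc; do
    puppet agent -t --environment "$env"
    rc=$?
    puppet_run_ok "$rc" || { echo "puppet failed in $env (exit $rc)" >&2; return 1; }
  done
}
```

Usage on a node would be `rm -f /etc/yum.repos.d/symphony.repo && run_both_environments`.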

Login node

  • Create alces_stack file
cat << EOF > /etc/alces_stack.yaml
---
:machine: loginmachine
:role: slave
:hostname: `hostname -s`
EOF
  • Run puppet
rm -f /etc/yum.repos.d/symphony.repo
puppet agent -t --environment symphony
puppet agent -t --environment alceshpc
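
The three alces_stack files on this page differ only in the :machine: value. The following is a minimal sketch of one helper that covers all three; alces_stack_yaml is a hypothetical name, and the optional second argument (a hostname override) is an addition for testing rather than something the Alces tooling requires.

```shell
# Hypothetical helper: print an alces_stack file for machine type $1
# (headnodemachine, computemachine or loginmachine, as used on this page).
# $2 optionally overrides the short hostname.
alces_stack_yaml() {
  local machine=$1
  local host=${2:-$(hostname -s)}
  case "$machine" in
    headnodemachine|computemachine|loginmachine) ;;
    *) echo "unknown machine type: $machine" >&2; return 1 ;;
  esac
  cat << EOF
---
:machine: $machine
:role: slave
:hostname: $host
EOF
}

# Example, run on a compute node:
# alces_stack_yaml computemachine > /etc/alces_stack.yaml
```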