- This repository contains the Ansible playbooks, roles, etc. for the European Galaxy server. It is used to deploy the infrastructure for the European Galaxy server through Jenkins. All configuration changes related to Galaxy EU are made through this repository.
- The Galaxy EU compute infrastructure runs on the BW OpenStack cloud. At the time of writing (21/02/2023), our cloud comprises 8488 vCPUs, 44.6 TB RAM, and 162.6 TB storage. Additionally, a few petabytes of storage are mounted (NFS) in the cloud.
- The compute infrastructure (cloud cluster; Galaxy worker nodes) is configured through the VGCN infrastructure repo, where we define which cloud images are used, the size of the cloud cluster, the number of VMs, the cloud network, the cloud security groups, etc.
- The cloud (VMs for group members and nodes that are not Galaxy workers) is configured through this infrastructure repo using Terraform. The underlying cloud hardware, storage, network, etc. are managed by the compute center of the University of Freiburg. For DNS records we use Amazon's Route53.
- Some documentation related to services and IT operations is available in this operations repo.
- For Galaxy Admin training, you can refer here.
- For monitoring of the Galaxy EU infrastructure we use Grafana. The dashboards are available here.
- Ansible is an open-source software provisioning, configuration management, and application-deployment tool enabling infrastructure as code.
- The basic components of Ansible can be found here. Understanding them is essential for working with this repo.
- files: contains files/configs that are used by the playbooks/roles
- group_vars and host_vars: contain variables used by the playbooks that are specific to a host or group defined in the inventory file. For every playbook we have an associated group_vars/host_vars file where we define the variables used by the playbook and by the roles it includes or imports.
- roles: contains roles that are not maintained elsewhere, which is not typical for Ansible. All other roles are installed during deployment from the requirements.yaml file.
- secret_group_vars: This is our vault. It contains the passwords and other sensitive information that is used by the playbooks/roles. The files are encrypted using Ansible Vault.
- templates: contains the templates that are used by the playbooks/roles. The templates are used to generate the final configuration files that are used by the services. The templates are written in Jinja2 syntax.
- ansible.cfg: contains the configuration of Ansible. This is used to define the location of the inventory file, the vault password file, etc.
- hosts: contains the inventory of the hosts that are managed by Ansible.
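The way these pieces fit together can be sketched with a minimal playbook. This is an illustration only, not a file from this repository: the playbook name `example.yml`, the group `example`, the role `local-example-role`, and the vault file path are all hypothetical. It assumes the usual Ansible behaviour, in which `group_vars/` and `host_vars/` next to the inventory or playbook are loaded automatically for matching groups and hosts, whereas a directory such as `secret_group_vars` has to be referenced explicitly.

```yaml
# example.yml: hypothetical playbook in the repo root (not an actual file)
- name: Configure an example service
  hosts: example                 # hypothetical group defined in the `hosts` inventory
  become: true
  # Variables in group_vars/example.yml (hypothetical) are picked up
  # automatically by Ansible for hosts in the `example` group.
  vars_files:
    # Assumption: vault-encrypted variables are loaded explicitly, because
    # only group_vars/ and host_vars/ are loaded automatically.
    - secret_group_vars/example.yml
  roles:
    - role: local-example-role      # hypothetical role kept under roles/
    - role: galaxyproject.gxadmin   # a role installed from requirements.yaml
```

Files under `templates` and `files` would then be referenced from the roles or tasks that such a playbook runs.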
The playbooks are located in the root directory of the repo.
The playbooks are:
- apollo.yml: Apollo is a web-based genome annotation editor. It can be accessed through Galaxy to view, edit, and annotate genomes. It uses Tomcat and therefore runs on a separate server. Additional information can be found here.
- beacon.yml: Beacon is a service for querying the presence of specific variants in a given dataset. We provide this service as part of our Galaxy EU instance. It runs on a VM in the cloud.
- build.yml: This playbook is used to set up our Jenkins server.
- celery.yml: This playbook is used to set up the Celery node(s) that Galaxy uses for running various jobs. For more information, refer to this doc and our training material.
- cvmfs.yml: This playbook is used to set up the CernVM-FS server that Galaxy uses to serve the reference data. Refer to these training materials for more information: Reference Data with CVMFS, and Reference Data with CVMFS without Ansible.
- galaxy-test.yml: This playbook is used to set up the Galaxy test instance, which is used to test the Galaxy codebase before deploying it to the main Galaxy instance.
- grafana.yml: This playbook is used to set up the Grafana server that monitors our Galaxy instance. Refer to this doc and to the training material Galaxy Monitoring with Telegraf and Grafana.
- incoming.yml: This playbook is used to set up our incoming FTP server, through which users can upload their data to Galaxy. Although this service is not retired, we now use tus.io for uploading data to Galaxy. See our training materials on TUS and FTP.
- influxdb.yml: This playbook is used to set up the InfluxDB server that stores the metrics collected by Telegraf. Refer to this and this doc. Here is the training material.
- mq.yml: This playbook is used to set up the RabbitMQ server that is used by Galaxy. Refer here and here for details.
- plausible.yml: This playbook is used to set up the Plausible server that collects the analytics for our Galaxy instance.
- sn05.yml: This playbook is used to set up the Galaxy PostgreSQL database server and the HTCondor cluster manager.
- sn06.yml: This playbook configures the Galaxy server. This is the main Galaxy server that is used by the users. We refer to it as `headnode 1`. Refer to this training material to set up Galaxy.
- sn07.yml: This playbook also configures the `galaxy server`, but it is not in production (as of 22/02/2023). We refer to it as `headnode 2`. Refer to this training material to set up Galaxy.
- syn-to-nfs.yml: This playbook is used to sync the Galaxy codebase on `headnode 1 (sn06)` to an NFS server. It is then synced to all nodes that need the up-to-date Galaxy codebase and configuration files.
- telescope.yml: This playbook is used to set up the Galactic Radio Telescope.
- upload.yml: This playbook sets up the TUS server that is used to upload data to Galaxy. Refer to this training material to set up TUS.
Our locally maintained Ansible roles are located in the roles directory. We also maintain several other roles, each in its own GitHub repository; they can be found in our organization. Most of these roles are published on Ansible Galaxy. In addition to our own roles, we also use roles from the galaxyproject. All the non-local roles we use are listed in our requirements.yaml file. These roles can be installed by running the following command:
ansible-galaxy install -r requirements.yaml
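For orientation, a role requirements file read by this command is a YAML list of role sources. The entries below are a sketch of the general format only, not the contents of this repository's requirements.yaml: the version pin, the Git URL, and the `example-org.example-role` name are hypothetical.

```yaml
# requirements.yaml: illustrative entries only, not the repository's actual file
- src: galaxyproject.gxadmin       # role resolved by name from Ansible Galaxy
  version: "0.0.8"                 # hypothetical version pin
- src: https://github.com/example-org/ansible-example-role   # hypothetical Git URL
  name: example-org.example-role   # name the role is installed under
  version: main                    # branch, tag, or commit to check out
```

Pinning a `version` for each entry keeps deployments reproducible, since the same role revision is installed every time.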
- Separate repo: Whether the role has its own repo or is a local role available only in the infrastructure_playbook repo
- Still being used: Whether the role is included/imported in any of the above listed playbooks
Roles | Separate repo | Still being used | Description |
---|---|---|---|
devops.tomcat7 | | ✔️ | Installs Tomcat 7 on RedHat/CentOS Linux servers |
dj-wasabi.telegraf | | ✔️ | Installs and configures Telegraf |
docker | | | Installs and configures Docker; sets up SSL certificates |
galaxyprojectdotorg.proftpd | | ✔️ | Installs, configures, and manages the proftpd (FTP) server |
geerlingguy.haproxy | | | Installs HAProxy |
geerlingguy.nginx | | | Installs and configures Nginx |
hostname | | ✔️ | Sets the system's hostname |
htcondor | | | Installs and configures HTCondor |
hxr.admin-tools | | ✔️ | Installs some admin packages via the package manager and stops the firewalld service if it is installed |
hxr.api-check | | | Installs a bash script to check the HTTP status |
hxr.apollo | | ✔️ | Installs and configures a web-based genome annotation editor |
hxr.autofs | | ✔️ | Installs autofs and adds autofs configuration to mount needed NFS shares (NOTE: This should be merged/replaced with the usegalaxy-eu.autofs role at some point) |
hxr.autofs-format-n-mount | | | Copies a script to format a certain disk and mount it |
hxr.aws-cli | | ✔️ | Creates the AWS directory (`~/.aws`) and deploys (copies) AWS credentials to a given system user account |
hxr.dns | | | Sets DNS entries using route53 and refreshes the certbot certificates if domains have changed |
hxr.docker-ssl | | | Configures Docker to use SSL certificates |
hxr.docker-ssl-client | | | Adds SSL certs to the user home directory |
hxr.exclude-repo | | | Excludes a given list of YUM repositories |
hxr.galaxy-cron | | | Adds cron jobs for cleaning up Docker (via `prune`) and for cleaning up Condor held jobs |
hxr.galaxy-echo-tool | | | Adds a custom Nagios "echo" tool to the Galaxy tool directory |
hxr.galaxy-log-dir | | | Creates the Galaxy log directory if it does not exist |
hxr.galaxy-nonreproducible-tools | | ✔️ | Clones the temporary tools repo to Galaxy's custom tools directory |
hxr.grafana-gitter-bridge | | ✔️ | Configures a bridge between Grafana and Gitter |
hxr.gx-cookie-proxy | | | Sets up and configures the translation of a Galaxy session cookie into a remote user identity |
hxr.haproxy-error-pages | | | Downloads Galaxy error pages through a bash script (NOTE: Bash script is not found in the role, so can't explain what these error pages are) |
hxr.install-to-venv | | ✔️ | Installs Python dependencies into any requested virtual environment |
hxr.monitor-cluster | | ✔️ | Adds Condor cluster monitoring scripts and configures the Telegraf user to run these scripts |
hxr.monitor-cvmfs | | ✔️ | Adds a Telegraf task that monitors the CernVM-FS repos |
hxr.monitor-email | | ✔️ | Adds a `/var/spool/mail` counter script and a Telegraf task to monitor it |
hxr.monitor-galaxy | | | NOTE: Tasks file is empty |
hxr.monitor-galaxy-journalctl | | ✔️ | Adds a script and a Telegraf task to monitor Galaxy's journalctl logs |
hxr.monitor-galaxy-queue | | | Adds a Telegraf task to run gxadmin queries for monitoring the Galaxy queue and workflow-invocation-status |
hxr.monitor-squid | | ✔️ | Adds a squid proxy data parser script and a Telegraf task to monitor it |
hxr.monitor-ssl | | ✔️ | Adds an SSL check script and a Telegraf task to monitor SSL certificate expiry |
hxr.postgres-connection | | ✔️ | Adds Galaxy database credentials and required ENVs to the user's bashrc file and creates a `~/.pgpass` file as well |
hxr.remap-user | | ✔️ | Remaps the system user's UID and GID to the tomcat user/group |
hxr.replace-galaxy-user | | ✔️ | Creates a system user and group with 999 as UID and GID |
hxr.sentry | | | Installs and configures the Sentry service |
hxr.simple-nagios | | | Installs a few "simple" scripts that perform various tasks related to Galaxy, Nagios, and SSL certificate checks |
hxr.zfs-monit | | | NOTE: Task file does not exist |
jasonroyle.rabbitmq | | | Installs and configures RabbitMQ |
linuxhq.yum_cron | | ✔️ | Installs yum-cron and adds required configuration |
matterircd | | | Sets up a minimal IRC server using Docker that can integrate with Mattermost, Slack, and Mastodon |
multinic | | | Adds network config files and configures multiple NICs |
multinic-old | | | Same as multinic but without routing config etc. |
pgs | | ✔️ | Sets up Probes Public Galaxy Servers (pgs) instances |
sentry | | | Sets up Sentry, a real-time event logging and aggregation platform, using Docker |
ssh-host-resign | | ✔️ | Copies the server CA and signs the host SSH keys |
ssh-host-sign | | ✔️ | Signs the server host key to prevent TOFU for SSH |
usegalaxy-eu.bashrc | | ✔️ | Adds ENVs, aliases, etc. to the bashrc file for any given user |
usegalaxy-eu.create-user | | ✔️ | Creates a galaxy system user and group |
usegalaxy-eu.error-pages | | ✔️ | Copies Nginx's error (404, 502, 503, and 504) pages |
usegalaxy-eu.fix-ancient-ftp-data | | ✔️ | Removes old FTP data and adds a cron job to create FTP users |
usegalaxy-eu.fix-failing-to-fail-jobs | | ✔️ | Adds a cron job to fix failing to fail jobs |
usegalaxy-eu.fix-galaxy-server-dir | | ✔️ | Creates a GDPR compliance log file if it does not exist and creates a symlink of all tools present in `/usr/local/tools` to `<galaxy_server_dir>/dependencies` |
usegalaxy-eu.fix-missing-api-keys | | ✔️ | Adds a cron job that generates and adds missing API keys for IE (Interactive Environments) users |
usegalaxy-eu.fix-oidc | | ✔️ | Adds a cron job that finds all OIDC-authenticated users that do not have any roles associated with them and fixes them |
usegalaxy-eu.fix-stop-ITs | | ✔️ | Adds a cron job that finds the Galaxy ITs running longer than 24 hrs and terminates them |
usegalaxy-eu.fix-stuck-handlers | | ✔️ | Adds several cron jobs (sync-to-nfs, restart galaxy handlers systemd service, restart gunicorn systemd service, restart galaxy workflow schedulers systemd service) |
usegalaxy-eu.fix-unscheduled-jobs | | ✔️ | Adds a cron job that finds the Galaxy jobs that failed to run (unscheduled) and sets their state to error in the Galaxy database |
usegalaxy-eu.fix-unscheduled-workflows | | ✔️ | Adds a cron job (the Ansible task is commented, so it does not create a cron job at the moment) that fixes unscheduled workflows |
usegalaxy-eu.fix-user-quotas | | ✔️ | Adds cron jobs that recalculate user quotas and set the ELIXIR quota for ELIXIR users |
usegalaxy-eu.galactic-radio-telescope | | ✔️ | Installs and configures Galactic Radio Telescope |
usegalaxy-eu.galaxy-cleanup | | ✔️ | Adds a Telegraf task that performs a cleanup of histories/hdas/etc that are older than 60 days |
usegalaxy-eu.galaxy-procstat | | ✔️ | Adds Telegraf procstat tasks that collect metrics from processes (Gunicorn, Galaxy handlers, Galaxy workflow schedulers) |
usegalaxy-eu.galaxy-slurp | | ✔️ | Adds cron jobs for pulling Galaxy stats (like, how many users were registered as of date X, current user count, current dataset size/distribution/etc.) into InfluxDB using `gxadmin`'s slurp commands |
usegalaxy-eu.gapars-galaxy | | ✔️ | Sets up and installs the GAPARS Galaxy webhook |
usegalaxy-eu.gie-deployer | | | Creates the directories required by GIE (Galaxy Interactive Environments), adds config, etc. to deploy GIE |
usegalaxy-eu.gie-node-proxy | | | Clones the GIE NodeJS proxy configurations, installs Node dependencies, and sets up the GIE proxy |
usegalaxy-eu.google-verification | | | Adds the Google site verification HTML file and the required Nginx configuration |
usegalaxy-eu.grt-client | | | Adds cron jobs that can export and upload data to GRT |
usegalaxy-eu.grt-export | | | Adds a cron job that exports data to GRT |
usegalaxy-eu.htcondor_release | | ✔️ | Adds a cron job that releases Condor jobs that are in hold state (also removes jobs in hold state that are resubmitted more than two times) |
usegalaxy-eu.jenkins-ssh-key | | ✔️ | Creates the SSH directory and adds a key to the Jenkins user |
usegalaxy-eu.log-cleaner | | ✔️ | Adds a cron job to clean up old journalctl logs of the gunicorn and galaxy handlers services |
usegalaxy-eu.logrotate | | ✔️ | Adds logrotate configuration for galaxy and atop logs |
usegalaxy-eu.monitoring | | ✔️ | Adds Telegraf tasks for monitoring NFS share access times and for collecting NFS stats |
usegalaxy-eu.plausible | | ✔️ | Clones the Plausible Analytics setup, adds the configuration, and starts the service using Docker |
usegalaxy-eu.remap-user | | | Remaps the system user and group with UID and GID 999 to a different UID and GID so the Galaxy user can be created with UID and GID 999 |
usegalaxy-eu.rsync-to-nfs | | ✔️ | Adds and executes a script that rsyncs the Galaxy root directory to the NFS location |
usegalaxy-eu.subdomain-themes | | ✔️ | Adds custom subdomain themes (HTML and CSS files) |
usegalaxy-eu.tours | | ✔️ | Clones the Galaxy tours repo |
usegalaxy-eu.webhooks | | ✔️ | Clones the webhooks repo |
usegalaxy-eu.vgcn-monitoring | | ✔️ | Adds the VGCN monitoring Python script and a Telegraf configuration file |
dev-sec.os-hardening | ✔️ | ✔️ | Now part of the devsec.hardening collection. This role provides numerous security-related configurations, providing all-round base protection to the system |
dev-sec.ssh-hardening | ✔️ | ✔️ | Now part of the devsec.hardening collection. This role provides secure ssh-client and ssh-server configurations |
devops.tomcat7 | ✔️ | ✔️ | Installs Tomcat 7 on RedHat/CentOS Linux servers |
dj-wasabi.telegraf | ✔️ | ✔️ | Installs and configures Telegraf |
galaxyproject.galaxy | ✔️ | ✔️ | Installs and configures Galaxy |
galaxyproject.cvmfs | ✔️ | ✔️ | Installs and configures CernVM-FS (CVMFS), particularly for Galaxy servers |
galaxyproject.proftpd | ✔️ | ✔️ | Installs, configures, and manages the proftpd (FTP) server |
usegalaxy_eu.ansible_nginx_upload_module | ✔️ | ✔️ | Role for building the Nginx upload module |
usegalaxy-eu.nginx | ✔️ | ✔️ | Role for installing and managing Nginx servers |
galaxyproject.nginx | ✔️ | ✔️ | Role for installing and managing Nginx servers |
galaxyproject.postgresql | ✔️ | | Role for installing and managing PostgreSQL servers |
usegalaxy-eu.ansible-postgresql | ✔️ | ✔️ | Role for installing and managing PostgreSQL servers |
geerlingguy.docker | ✔️ | ✔️ | Role for installing Docker |
geerlingguy.java | ✔️ | ✔️ | Role for installing Java |
geerlingguy.jenkins | ✔️ | ✔️ | Role for installing Jenkins CI |
geerlingguy.repo-epel | ✔️ | ✔️ | Installs the EPEL repository |
influxdata.chrony | ✔️ | ✔️ | Manages the Chrony services on Linux |
linuxhq.yum_cron | ✔️ | ✔️ | Installs yum-cron and adds required configuration |
galaxyproject.gxadmin | ✔️ | ✔️ | Installs and configures gxadmin |
usegalaxy-eu.certbot | ✔️ | | Installs and configures Certbot (for Let's Encrypt) |
usegalaxy_eu.galaxy_systemd | ✔️ | ✔️ | Copies systemd service files and starts processes for Gunicorn handlers, Galaxy (workflow) handlers, and Celery. Important for configuring those processes |
usegalaxy-eu.dynmotd | ✔️ | ✔️ | Sets up a dynamic message-of-the-day login prompt |
cloudalchemy.grafana | ✔️ | ✔️ | Role for provisioning and managing the Grafana platform for analytics and monitoring |
galaxyproject.tiaas2 | ✔️ | ✔️ | Installs and configures TIaaS (Training Infrastructure as a Service) |
usegalaxy-eu.autoupdates | ✔️ | ✔️ | Sets up automatic system updates using dnf-automatic |
usegalaxy_eu.htcondor | ✔️ | ✔️ | Role for installing and configuring HTCondor |
usegalaxy-eu.update-hosts | ✔️ | ✔️ | Adds a cron job to update the computing nodes list in an HTCondor-managed cluster |
usegalaxy_eu.gie_proxy | ✔️ | ✔️ | Installs and configures the proxy server used by Galaxy for IE (Interactive Environments) / IT (Interactive Tools) |
usegalaxy-eu.autofs | ✔️ | ✔️ | Installs autofs and configures mount points for auto mounting |
usegalaxy_eu.fs_maintenance | ✔️ | ✔️ | Role for deploying and configuring some common Galaxy file system maintenance routines; also adds cron jobs |
galaxyproject.tusd | ✔️ | ✔️ | Installs and configures the tusd server |
usegalaxy_eu.rabbitmqserver | ✔️ | ✔️ | Role to deploy and configure a RabbitMQ server using a Docker container |
usegalaxy_eu.influxdbserver | ✔️ | ✔️ | Role to deploy and configure an InfluxDB server using a Docker container |
usegalaxy_eu.flower | ✔️ | ✔️ | Role for installing Flower, Celery's web UI |
paprikant.beacon | ✔️ | ✔️ | Role that sets up a running instance of beacon-python, with an accompanying PostgreSQL database |
paprikant.beacon-importer | ✔️ | ✔️ | Sets up the Beacon importer and adds a cron job for the import task |
galaxyproject.miniconda | ✔️ | ✔️ | Role for installing and managing a Miniconda installation and Conda environments |
usegalaxy_eu.tpv_auto_lint | ✔️ | ✔️ | Adds a TPV (Total Perspective Vortex) lint script that automatically lints all TPV YAML files |