This document aims at setting up a practical development environment for Python projects, allowing the integration of binary extension modules based on C++ or Fortran. Developing on a local machine, a desktop or a laptop, is often somewhat more practical than developing on the cluster. Typically, I start developing on my own machine until things are working well, and then I port the code to the cluster for further testing. I switch back and forth between both environments several times.
There are important differences in managing your environment on your local machine and on the cluster. They are described in detail in :ref:`tutorial-6`.
Warning
Micc was designed for supporting HPC developers and, consequently, with Linux systems in mind. We provide support for Linux (Ubuntu 19.10, CentOS 7.7) and macOS. Due to lack of human resources, it has not been tested on Windows, and no support is provided for it. However, WSL-2 may do the trick on Windows. Any feedback is welcome.
If you want to experiment with micc without having to set up the environment, you can download an Ubuntu 20.10 virtual machine for VirtualBox with everything pre-installed at https://calcua.uantwerpen.be/courses/parallel-programming/ubuntu-20.10.ova. It has a userid ``user`` with password ``calcua@ua``.
For Python development on your local machine, we highly recommend setting up your development environment as described in My Python Development Environment by Jacob Kaplan-Moss. We will assume that this is indeed the case for all tutorials here. In particular:
- We use pyenv to manage different Python versions on our system.
- Pipx is used to install Python applications system-wide. If your projects depend on different Python versions it is a good idea to ``pipx install`` Micc, which we use for project management and building binary extension modules.
- Poetry is used to set up virtual environments for the projects we are working on, for managing their dependencies and for publishing them.
- For building binary extension modules from C++, CMake must be available.
- For Micc projects with binary extensions, the necessary compilers (C++, Fortran) must be installed on the system.
- As an IDE for Python/Fortran/C++ development we recommend:
  - Eclipse IDE for Scientific Computing with the PyDev plugin. This is an old-time favorite of mine, although the learning curve is a bit steep and the documentation could be better. Today, PyDev is beginning to lag behind for Python, but Eclipse is still very good for Fortran and C++.
  - PyCharm Community Edition. I only tried this one recently and was very soon convinced of it for Python development. (I have not gone back to Eclipse since.) I currently have insufficient experience with it for Fortran and C++ to make recommendations.
Note
The steps below are only suitable on your local laptop or desktop. For working on the VSC clusters, a separate tutorial is provided (:ref:`tutorial-6`).
Install pyenv: see Managing Multiple Python Versions With pyenv for common install instructions on macOS and Linux.
Note
Since Ubuntu 20.10, the dependencies for pyenv can best be installed as shown in asdf-vm/asdf#570. The realpython page above is not up to date.
If you're on Windows, consider using the fork pyenv-win. (Pyenv does not work on Windows outside the Windows Subsystem for Linux.)
Install your favourite Python versions. E.g.:
> pyenv install 3.8.0
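After installing and selecting a Python version, it can be useful to confirm which interpreter is actually picked up. The following is a minimal sketch, not part of Micc; the helper name ``is_mentioned_version`` is hypothetical, and the version set only reflects the versions this guide mentions explicitly (3.7 and 3.8). Other versions may well work.

```python
import sys

# Versions this guide explicitly mentions as working well with Micc.
# This is an assumption for illustration: other versions may work too.
MENTIONED_VERSIONS = {(3, 7), (3, 8)}


def is_mentioned_version(version_info=sys.version_info):
    """Return True if the (major, minor) pair is one the guide mentions."""
    return tuple(version_info[:2]) in MENTIONED_VERSIONS


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("version:", ".".join(str(part) for part in sys.version_info[:3]))
    print("mentioned in this guide:", is_mentioned_version())
```

Run it with the ``python`` that pyenv selects; if the interpreter path does not point into your pyenv directory, the shell is probably not yet configured for pyenv.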
Install poetry. The recommended way for this is:
> curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python
This approach will give you one single system-wide Poetry installation, which will automatically pick up the current Python version in your environment. Note that as of Poetry 1.0.0, Poetry will also detect conda virtual environments.
Configure your poetry installation:
> poetry config virtualenvs.in-project true
This ensures that running ``poetry install`` in a project directory will create a project's virtual environment in its own root directory, rather than somewhere in the Poetry_ configuration directories, where it is less accessible. If you have several Poetry_ installations, they all use the same configuration.
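With this setting, the project's virtual environment lives in a ``.venv`` directory in the project root. As a rough sketch (the helper names ``expected_venv`` and ``running_in_project_venv`` are hypothetical, not part of Poetry or Micc), you could check from Python whether the running interpreter belongs to that environment:

```python
import sys
from pathlib import Path


def expected_venv(project_root):
    """Location of the in-project virtual environment: <project_root>/.venv."""
    return Path(project_root) / ".venv"


def running_in_project_venv(project_root, prefix=None):
    """Return True if the interpreter prefix is the project's .venv directory.

    `prefix` defaults to sys.prefix, which points at the active virtual
    environment when one is in use.
    """
    prefix = Path(sys.prefix if prefix is None else prefix)
    venv = expected_venv(project_root)
    return venv.is_dir() and prefix.resolve() == venv.resolve()


if __name__ == "__main__":
    # Assumes the script is started from the project root directory.
    print("in project venv:", running_in_project_venv(Path.cwd()))
```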
Install pipx:
> python -m pip install --user pipx
> python -m pipx ensurepath
Note
This will use the Python version returned by ``pyenv version``. Micc is certainly comfortable with Python 3.7 and 3.8.
Install micc with pipx:
> pipx install et-micc
installed package et-micc 0.10.8, Python 3.8.0
These apps are now globally available
- micc
done!
To upgrade micc to the newest version, use the ``upgrade`` command of pipx:
> pipx upgrade et-micc
et-micc is already at latest version 0.10.8 (location: /Users/etijskens/.local/pipx/venvs/et-micc)
If you want to develop binary extensions in Fortran or C++, you will need a Fortran compiler or a C++ compiler, respectively. For C++ binary extensions, also CMake and make must be on your system PATH. You can download CMake directly from cmake.org.
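A quick way to see whether these tools are on your PATH is sketched below. This is only an illustration: the executable names ``g++`` and ``gfortran`` are assumptions (your system may use other compilers), and ``find_tools`` is a hypothetical helper, not part of Micc.

```python
import shutil


def find_tools(names):
    """Map each tool name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # Assumed executable names; adjust them to the compilers you actually use.
    tools = ["cmake", "make", "g++", "gfortran"]
    for name, path in find_tools(tools).items():
        print(f"{name:10s} {path or 'NOT FOUND'}")
```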
If you are on one of the VSC clusters, check "Tutorial 7 - Using micc projects on the VSC clusters".
Install an IDE. For many years I have been using Eclipse IDE for Scientific Computing with the PyDev plugin, but recently I became addicted to PyCharm Community Edition. Both are available for macOS, Linux and Windows.
Create a GitHub account at https://github.com/join/. Also create a personal access token. At point 7, check at least these boxes:
- repo
- read:org
At point 9, copy the token to the clipboard and paste it in :file:`~/.pat.txt`:
> echo shift+ctrl+V > ~/.pat.txt
Micc uses this file to automatically create a GitHub repo for your micc projects.
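If you prefer to write the token file from Python, a minimal sketch could look as follows. The helpers ``save_token`` and ``load_token`` are hypothetical and not part of Micc. Restricting the file permissions to the owner is a suggestion of this sketch, not a Micc requirement.

```python
import os
import stat
from pathlib import Path


def save_token(token, path="~/.pat.txt"):
    """Write the token to `path` and make the file readable by its owner only."""
    path = Path(path).expanduser()
    path.write_text(token.strip() + "\n")
    # Owner read/write only; on Windows this has only a limited effect.
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    return path


def load_token(path="~/.pat.txt"):
    """Read the token back, without surrounding whitespace."""
    return Path(path).expanduser().read_text().strip()
```

For example, ``save_token("<paste your token here>")`` writes the default file :file:`~/.pat.txt`.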
Install ``git`` (https://git-scm.com/book/en/v2/Getting-Started-Installing-Git) and the GitHub CLI ``gh`` (https://github.com/cli/cli#installation).
Create your first micc project. The very first time, you will be asked to set some default values that identify you as a micc user. Replace the preset values by your own preferences:
> micc -p my-first-micc-project create
your full name [Engelbert Tijskens]: carl morck
your e-mail address [[email protected]]: [email protected]
your github username (leave empty if you do not have) [etijskens]: cmorck
the initial version number of a new project [0.0.0]:
default git branch [master]:
The last two entries are generally ok. If you later want to change the entries, you can simply edit the file :file:`~/.et_micc/micc.json`.
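Before editing the file, you may want to inspect what it currently contains. The sketch below only assumes that the file is valid JSON, as its extension suggests; the helper ``load_preferences`` is hypothetical and the layout of the entries is not described here, so the script simply prints whatever it finds.

```python
import json
from pathlib import Path


def load_preferences(path="~/.et_micc/micc.json"):
    """Load the preferences file and return its parsed JSON content."""
    with Path(path).expanduser().open() as f:
        return json.load(f)


if __name__ == "__main__":
    try:
        prefs = load_preferences()
    except OSError:
        print("preferences file not found or not readable")
    except json.JSONDecodeError:
        print("preferences file is not valid JSON")
    else:
        print(json.dumps(prefs, indent=2))
```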
You should be good to go now.
For details see :ref:`Tutorial-6`
On the cluster you must manually select the software packages you want to use by loading modules with the module system. The module system provides access to the many pre-installed software packages, including Python versions, that are built especially for HPC purposes and optimal performance. They generally perform much better than if you had built them yourself. It is, therefore, discouraged to install pipx to your own Python versions.
Install poetry. The recommended way for this is:
> curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | /usr/bin/python
(Make sure to use the system Python ``/usr/bin/python`` for this. Otherwise you will run into trouble selecting a Python version for your project.) This approach will give you one single system-wide Poetry installation, which will automatically pick up the current Python version in your environment.
Configure your poetry installation:
> poetry config virtualenvs.in-project true
This ensures that running ``poetry install`` in a project directory will create a project's virtual environment in its own root directory, rather than somewhere in the Poetry_ configuration directories, where it is less accessible.
For micc projects that are cloned from a git repository, we recommend installing micc as a development dependency of your project:
> cd path/to/myproject
> poetry add --dev et-micc
If you want to create a new project with micc, you must install it first of course:
> module load Python # load your favourite Python module
> pip install --user et-micc
Without the ``--user`` flag, pip would try to install in the cluster module, where you do not have access. The flag instructs pip to install in your home directory.
If you want to develop binary extensions in Fortran or C++, you will need a Fortran compiler and/or a C++ compiler, respectively. In general, loading a Python module on the cluster automatically also makes the compilers available that were used to compile the Python version.
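To see where a ``pip install --user`` ends up for the Python module you loaded, you can ask the standard library ``site`` module. This is just a small sketch; ``user_install_locations`` is a hypothetical helper name.

```python
import site


def user_install_locations():
    """Return the per-user base and site-packages directories of this Python."""
    return {
        "user_base": site.getuserbase(),
        "user_site": site.getusersitepackages(),
        # False inside most virtual environments, where user site is disabled.
        "enabled": bool(site.ENABLE_USER_SITE),
    }


if __name__ == "__main__":
    for key, value in user_install_locations().items():
        print(f"{key:10s} {value}")
```

Scripts installed with ``--user`` are placed in a ``bin`` directory below the user base on Linux, so that directory must be on your PATH for the ``micc`` command to be found.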
For C++ binary extensions, also CMake must be on your system PATH:
> module load CMake
If you need a full IDE, you must use one of the graphical environments on the cluster (see https://vlaams-supercomputing-centrum-vscdocumentation.readthedocs-hosted.com/en/latest/access/access_and_data_transfer.html#gui-applications-on-the-clusters). Unfortunately, there are different GUI environments for the different VSC clusters. If you only want a graphical editor, you can use Eclipse Remote System Explorer as a remote editor.
Get an account at GitHub, install git if it is not pre-installed on your system, and configure it:
> module load git # for a more recent git version
> git config --global user.email "[email protected]"
> git config --global user.name "Your Name"
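To verify that the configuration was stored, you can read the values back. The sketch below calls ``git config --global --get`` through ``subprocess``; the helper ``git_global`` is hypothetical and returns ``None`` when git is not on the PATH or the key is not set.

```python
import subprocess


def git_global(key):
    """Return the global git config value for `key`, or None if unavailable."""
    try:
        result = subprocess.run(
            ["git", "config", "--global", "--get", key],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        # git itself is not installed or not on PATH.
        return None
    value = result.stdout.strip()
    return value if result.returncode == 0 and value else None


if __name__ == "__main__":
    for key in ("user.name", "user.email"):
        print(f"{key}: {git_global(key) or 'NOT SET'}")
```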
Create your first micc project. The very first time, you will be asked to set some default values that identify you as a micc user. Replace the preset values by your own preferences:
> micc -p my-first-micc-project create
your full name [Engelbert Tijskens]: carl morck
your e-mail address [[email protected]]: [email protected]
your github username (leave empty if you do not have) [etijskens]: cmorck
the initial version number of a new project [0.0.0]:
default git branch [master]:
The last two entries are generally ok. If you later want to change the entries, you can simply edit the file :file:`~/.et_micc/micc.json`.
You should be good to go now.
Create a bash script to set the environment for your project consistently over time, e.g.:
#!/usr/bin/bash
# source this script (do not execute it) so that the settings persist in your shell
module load git
module load CMake
# load my favourite python:
module load Python
cd path/to/myproject
# activate myproject's virtual environment:
source .venv/bin/activate