{compiler}[AOMP/11.12-0] Add AOMP compiler module #12018

nordmoen · 2021-01-18T13:05:12Z

This change adds an AOMP compiler module to EasyBuild.

The setup has been tested and is working on both AMD and Nvidia GPUs.

The patch is necessary to remove some hard coded behavior on RHEL8 and to get the build process to respect EasyBuild --parallel option. The reason for the older GCC version is to conform with what AOMP expects when building, it could probably work with a newer version, but I have not tried. The same goes for gcccuda, AOMP recommends CUDA version 10.

migueldiascosta · 2021-01-26T05:53:06Z

@nordmoen thank you for your contribution!

quick note while starting to test, this requires libelf if elfutils-libelf-devel or equivalent is not installed - you can point to the one in #12048 once that's merged or add the easyconfig to this PR

nordmoen · 2021-01-26T06:23:39Z

I'll add the dependency once it is merged =]

migueldiascosta · 2021-01-26T07:34:11Z

@boegelbot please test @ generoso

boegelbot · 2021-01-26T07:35:23Z

@migueldiascosta: Request for testing this PR well received on generoso

PR test command 'EB_PR=12018 EB_ARGS= /apps/slurm/default/bin/sbatch --job-name test_PR_12018 --ntasks=4 ~/boegelbot/eb_from_pr_upload_generoso.sh' executed!

exit code: 0
output:

Submitted batch job 13705

Test results coming soon (I hope)...

- notification for comment with ID 767360274 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

boegelbot · 2021-01-26T11:18:17Z

Test report by @boegelbot
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
generoso-x-5 - Linux centos linux 8.2.2004, x86_64, Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz (haswell), Python 3.6.8
See https://gist.github.com/2adfa2a67ca5e8f11b71644c929c1337 for a full test report.

migueldiascosta · 2021-01-26T14:25:57Z

Test report by @migueldiascosta
FAILED
Build succeeded for 0 out of 2 (2 easyconfigs in total)
sms - Linux centos linux 7.6.1810, x86_64, AMD EPYC 7601 32-Core Processor (zen), Python 2.7.5
See https://gist.github.com/29ae97e128926e9b446fe1983d3687fa for a full test report.

migueldiascosta · 2021-01-26T14:38:48Z

@nordmoen w.r.t. to my failed test above, it seems that at least libomptarget is including the system header files instead of the ones in GCCcore, and I suppose this isn't a problem for you or in generoso because those have more recent system headers (e.g., centos 8 vs centos 7)

In file included from /dev/shm/build/AOMP/11.12-0/GCCcore-9.3.0/aomp11/amd-llvm-project/openmp/libomptarget/src/device.cpp:13:
/dev/shm/build/AOMP/11.12-0/GCCcore-9.3.0/aomp11/amd-llvm-project/openmp/libomptarget/src/device.h:95:43: error: too few template arguments for class template 'less'
typedef std::set<HostDataToTargetTy, std::less<>> HostDataToTargetListTy;
                                          ^
/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../include/c++/4.8.5/bits/stl_function.h:231:12: note: template is declared here
    struct less : public binary_function<_Tp, _Tp, bool>

nordmoen · 2021-01-27T08:21:39Z

@nordmoen w.r.t. to my failed test above, it seems that at least libomptarget is including the system header files instead of the ones in GCCcore, and I suppose this isn't a problem for you or in generoso because those have more recent system headers (e.g., centos 8 vs centos 7)
In file included from /dev/shm/build/AOMP/11.12-0/GCCcore-9.3.0/aomp11/amd-llvm-project/openmp/libomptarget/src/device.cpp:13:
/dev/shm/build/AOMP/11.12-0/GCCcore-9.3.0/aomp11/amd-llvm-project/openmp/libomptarget/src/device.h:95:43: error: too few template arguments for class template 'less'
typedef std::set<HostDataToTargetTy, std::less<>> HostDataToTargetListTy;
                                          ^
/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../include/c++/4.8.5/bits/stl_function.h:231:12: note: template is declared here
    struct less : public binary_function<_Tp, _Tp, bool>

I'm looking around in the build setup, but I can't really find anything where it should be hardcoded and not set correctly by loading GCCcore.

nordmoen · 2021-01-27T12:36:51Z

@migueldiascosta I discovered that additional build tools were needed, with one of them being pkg-config which could account for the wrong link (?). Could you test the new changes with your setup?

migueldiascosta · 2021-01-28T02:15:11Z

@nordmoen libomptarget is being built with the supplied clang, so the problem is likely this: #11868 (comment)...

Since we are not building clang (et al), I suppose we need a way of always passing --gcc-toolchain=$EBROOTGCCCORE to clang (et al) (@boegel, @SebastianAchilles ?)

SebastianAchilles · 2021-01-28T12:13:52Z

Yes, as far as I understood you have to use --gcc-toolchain=$EBROOTGCCCORE in order to use the correct gcccore from the easybuild installation. Otherwise the gcc from the system will be used (if found).

nordmoen · 2021-02-02T11:07:38Z

Looking through the build sources of AOMP it seems that several places need to change to have this take effect. We could of course do this in a patch, but maybe there is a better way?

migueldiascosta · 2021-02-02T22:15:33Z

maybe a compiler wrapper... @boegel ?

boegel · 2021-03-24T15:35:28Z

@nordmoen My apologies for never replying to your hail for help/input, this got totally snowed under...

We can definitely put lightweight wrappers in place to make sure the right GCC installation is picked up, if there's no better alternative, but then we'll need to implement a small easyblock for this I guess.
It's a bit frustrating that Clang doesn't just pick up the first gcc it finds in $PATH...

That said, is there any particular reason why we're not building AOMP from source, and configure the build with -DGCC_INSTALL_PREFIX (cfr. @SebastianAchilles' comment in #11868 (comment)).

edit: At 2nd look, maybe this is a from-source build, but then why aren't we using the procedure outlined at https://github.com/ROCm-Developer-Tools/aomp/blob/master/docs/RELEASESOURCEINSTALL.md?

Maybe it's better to discuss this interactively in EasyBuild Slack?

Flamefire · 2021-03-24T16:08:55Z

easybuild/easyconfigs/a/AOMP/AOMP-11.12-0-gcccuda-2019b.eb

+# support, e.g. '50,75'.
+preinstallopts += ' AOMP_BUILD_CUDA=1 '
+preinstallopts += 'CUDA="$CUDA_ROOT" '
+preinstallopts += 'NVPTXGPUS="70"'


I think this should be configurable. Does it support 7.0 or sm_70 format? Then you can use the templates cuda_compute_capabilities etc.

Based on the pre-configured targets, I assumed one had to use whole numbers only.

Hm, then I'd say an EasyBlock is better here. I'm afraid short of patching the EC (e.g. via hooks) this would limit the GPU usability unusually, especially after we introduced the --cuda-compute-capabilities option
This has the additional benefit that those hard-coded, repeated preinstallopts can be unified and handled better (e.g. detecting presence of CUDA)

nordmoen · 2021-04-06T06:31:52Z

It seems that AOMP is deprecated in favor of built in support in ROCm (as per the official documentation).

Because of this I don't see the value in adding this package, especially when it seems like there is quite a lot work to get it set up properly.

nordmoen · 2021-04-21T08:58:33Z

Just had a meeting with AMD where they confirmed that AOMP is the official way to use OpenMP directives with AMD GPUs.

This is an updated version of the PR easybuilders#12018. This PR updates the version to `13.0-2`. In addition this PR adds an improved patch which should force the built modules to pick up the EasyBuild installed version of `gcc`. This module requires the `aomp.py` EasyBlock to build.

nordmoen · 2021-05-26T06:51:32Z

Closing in favor of #12909

Jørgen Nordmoen added 3 commits December 22, 2020 12:32

Added working AOMP compiler tested on CUDA

e17e95b

Tested AMD only version of AOMP

86a6865

Fixed thread setting in patch for ARM

f2b0cab

nordmoen changed the title ~~Add AOMP compiler module~~ {compiler}[AOMP/11.12-0] Add AOMP compiler module Jan 18, 2021

Renamed EB files

3a3be90

migueldiascosta added the new label Jan 26, 2021

migueldiascosta added this to the 4.x milestone Jan 26, 2021

Added missing build dependencies

92ae1c0

Flamefire reviewed Mar 24, 2021

View reviewed changes

nordmoen closed this Apr 6, 2021

nordmoen reopened this Apr 21, 2021

nordmoen mentioned this pull request May 20, 2021

{compiler}[GCCcore/10.2.0,gcccuda/2020b] AOMP v13.0.2 #12909

Merged

nordmoen closed this May 26, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

{compiler}[AOMP/11.12-0] Add AOMP compiler module #12018

{compiler}[AOMP/11.12-0] Add AOMP compiler module #12018

nordmoen commented Jan 18, 2021

migueldiascosta commented Jan 26, 2021

nordmoen commented Jan 26, 2021

migueldiascosta commented Jan 26, 2021

boegelbot commented Jan 26, 2021

boegelbot commented Jan 26, 2021

migueldiascosta commented Jan 26, 2021

migueldiascosta commented Jan 26, 2021

nordmoen commented Jan 27, 2021

nordmoen commented Jan 27, 2021

migueldiascosta commented Jan 28, 2021 •

edited

Loading

SebastianAchilles commented Jan 28, 2021

nordmoen commented Feb 2, 2021

migueldiascosta commented Feb 2, 2021

boegel commented Mar 24, 2021 •

edited

Loading

Flamefire Mar 24, 2021

nordmoen Mar 25, 2021

Flamefire Mar 25, 2021

nordmoen commented Apr 6, 2021

nordmoen commented Apr 21, 2021

nordmoen commented May 26, 2021

{compiler}[AOMP/11.12-0] Add AOMP compiler module #12018

{compiler}[AOMP/11.12-0] Add AOMP compiler module #12018

Conversation

nordmoen commented Jan 18, 2021

migueldiascosta commented Jan 26, 2021

nordmoen commented Jan 26, 2021

migueldiascosta commented Jan 26, 2021

boegelbot commented Jan 26, 2021

boegelbot commented Jan 26, 2021

migueldiascosta commented Jan 26, 2021

migueldiascosta commented Jan 26, 2021

nordmoen commented Jan 27, 2021

nordmoen commented Jan 27, 2021

migueldiascosta commented Jan 28, 2021 • edited Loading

SebastianAchilles commented Jan 28, 2021

nordmoen commented Feb 2, 2021

migueldiascosta commented Feb 2, 2021

boegel commented Mar 24, 2021 • edited Loading

Flamefire Mar 24, 2021

Choose a reason for hiding this comment

nordmoen Mar 25, 2021

Choose a reason for hiding this comment

Flamefire Mar 25, 2021

Choose a reason for hiding this comment

nordmoen commented Apr 6, 2021

nordmoen commented Apr 21, 2021

nordmoen commented May 26, 2021

migueldiascosta commented Jan 28, 2021 •

edited

Loading

boegel commented Mar 24, 2021 •

edited

Loading