GeNN while using along with Brian2genn fails on Power9 #362

Closed
k18shruti opened this issue Sep 26, 2020 · 9 comments · Fixed by #363

Comments

@k18shruti

Hi,
I am trying to compile GeNN for Brian2GeNN in order to run on the Summit supercomputer, which has Power9 CPUs.
I saw that the utility library expects the PATH_MAX macro to be defined:

genn/include/genn/third_party/path.h:80:19: error: 'PATH_MAX' was not declared in this scope
char temp[PATH_MAX];

I added a #define PATH_MAX 4096.

But now I see the following while building the genn executable:
mkdir -p /gpfs/alpine/csc382/proj-shared/shruti/framework/examples/snn-simulation-evaluation/GeNNworkspace
/sw/summit/ibm-wml-ce/anaconda-base/envs/ibm-wml-ce-1.7.0-3/bin/powerpc64le-conda_cos7-linux-gnu-c++ -std=c++11 -fvisibility-inlines-hidden -std=c++17 -fmessage-length=0 -mcpu=power8 -mtune=power8 -mpower8-fusion -mpower8-vector -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O3 -pipe -Wall -Wpedantic -Wextra -MMD -MP -I/gpfs/alpine/csc382/proj-shared/shruti/genn/include/genn/genn -I/gpfs/alpine/csc382/proj-shared/shruti/genn/include/genn/third_party -I/gpfs/alpine/csc382/proj-shared/shruti/framework/examples/snn-simulation-evaluation -I/gpfs/alpine/csc382/proj-shared/shruti/framework/examples/snn-simulation-evaluation/GeNNworkspace -I/gpfs/alpine/csc382/proj-shared/shruti/framework/examples/snn-simulation-evaluation/GeNNworkspace/brianlib/randomkit -I/gpfs/alpine/csc382/proj-shared/shruti/genn/include/genn/backends/cuda -DMODEL="/gpfs/alpine/csc382/proj-shared/shruti/framework/examples/snn-simulation-evaluation/GeNNworkspace/magicnetwork_model.cpp" -DBACKEND_NAMESPACE=CUDA -I"/sw/summit/cuda/10.1.243/include" generator.cc -o /gpfs/alpine/csc382/proj-shared/shruti/framework/examples/snn-simulation-evaluation/GeNNworkspace/generator -Wl,-O2 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now -L/gpfs/alpine/csc382/proj-shared/shruti/genn/lib -lgenn_cuda_backend -lgenn -L"/sw/summit/cuda/10.1.243/lib64" -lcuda -lcudart
make: Leaving directory '/gpfs/alpine/csc382/proj-shared/shruti/genn/src/genn/generator'
genn-buildmodel.sh:86: error 50: command failure

/autofs/nccs-svm1_sw/summit/ibm-wml-ce/anaconda-base/envs/ibm-wml-ce-1.7.0-3/bin/../lib/gcc/powerpc64le-conda_cos7-linux-gnu/7.3.0/../../../../powerpc64le-conda_cos7-linux-gnu/bin/ld: cannot find -lcuda
collect2: error: ld returned 1 exit status
make: *** [MakefileCommon:41: /gpfs/alpine/csc382/proj-shared/shruti/framework/examples/snn-simulation-evaluation/GeNNworkspace/generator] Error 1

Could you please let me know if there is a fix for this?

The OS is RHEL 7.6, with gcc version 4.8.5. Do let me know if you need more details on this.
Thanks.

@tnowotny
Member

Hmm, it looks like the linker can't locate the CUDA library. Can you establish where the correct CUDA library is located on your system? The current command line instructs the linker to look in /sw/summit/cuda/10.1.243/lib64.
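
One way to answer that is to check a few candidate directories for libcuda.so. The sketch below uses a hypothetical helper, find_libcuda, which is not part of GeNN, and the directory list is an assumption. In many CUDA installations libcuda.so ships with the driver rather than the toolkit, and the toolkit provides a link-time stub under lib64/stubs; whether that layout holds on Summit is not confirmed in this thread.

```shell
# Hypothetical helper: print each directory from the argument list that
# contains libcuda.so or a versioned variant such as libcuda.so.1.
find_libcuda() {
    for dir in "$@"; do
        for lib in "$dir"/libcuda.so*; do
            # If the glob matched nothing, $lib is the literal pattern
            # and the -e test fails.
            if [ -e "$lib" ]; then
                printf '%s\n' "$dir"
                break
            fi
        done
    done
}

# Candidate locations are assumptions; adjust them for your system.
cuda_root="${CUDA_PATH:-/sw/summit/cuda/10.1.243}"
find_libcuda "$cuda_root/lib64" "$cuda_root/lib64/stubs" /usr/lib64 /usr/lib
```

If the library only turns up in a directory that is not passed to the linker with -L, that would be consistent with the "cannot find -lcuda" error.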

@neworderofjamie
Contributor

The PATH_MAX issue is very odd - we will investigate! Further to what @tnowotny said, I think the issue is that you're using the version of GCC bundled with Anaconda rather than your system's version. From our past experience this does not work with CUDA. However, GeNN requires a minimum of GCC 4.9.2, so you may need to load a newer version via modules or whatever the equivalent is on Summit.
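
To check a compiler against that minimum before building, a version comparison along these lines may help. This is only a sketch: version_at_least is a hypothetical helper, g++ as the compiler name is an assumption, and the comparison relies on sort -V from GNU coreutils.

```shell
# Hypothetical helper: succeed if version $1 is at least version $2.
# Relies on sort -V (version sort) from GNU coreutils.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# 4.9.2 is the minimum quoted in this thread; g++ is an assumed compiler name.
min_version=4.9.2
if command -v g++ >/dev/null 2>&1; then
    found=$(g++ -dumpversion)
    if version_at_least "$found" "$min_version"; then
        echo "g++ $found meets the $min_version minimum"
    else
        echo "g++ $found is older than $min_version"
    fi
fi
```

Run it after loading the compiler module you intend to build with, so that the g++ found on PATH is the one the build will actually use.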

@neworderofjamie
Contributor

To help diagnose the PATH_MAX issue, could you post the output of running:

touch test.h && cpp -dM test.h | grep linux && rm test.h
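
The command above preprocesses test.h with the compiler's default dialect. The failing build compiles C++ with -std=c++11 (visible in the log earlier in this thread), so it may also be worth listing the macros under that flag. The sketch below does that; report_linux_macros is a hypothetical helper, g++ as the default compiler is an assumption, and whether the two outputs actually differ on Power9 is not established here.

```shell
# Hypothetical helper: read `cpp -dM`-style output on stdin and report
# which Linux-identifying macros it defines.
report_linux_macros() {
    defines=$(cat)
    for macro in linux __linux __linux__ __gnu_linux__; do
        # The trailing space keeps __linux from matching __linux__.
        if printf '%s\n' "$defines" | grep -q "^#define ${macro} "; then
            echo "${macro}: defined"
        else
            echo "${macro}: not defined"
        fi
    done
}

# Preprocess an empty C++ translation unit with the dialect used by the build.
# The compiler defaults to g++ unless CXX is set (an assumption).
cxx="${CXX:-g++}"
if command -v "$cxx" >/dev/null 2>&1; then
    "$cxx" -std=c++11 -x c++ -dM -E /dev/null | report_linux_macros
fi
```

Setting CXX to the exact compiler from the failing command line keeps the check as close as possible to the real build.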

@qhaas

qhaas commented Sep 28, 2020

Attempting to replicate on Summit as well with the latest release version of genn-4.3.3...

$ module load ibm-wml-ce/1.6.2-5 gcc
$ cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 7.6 (Maipo)
$ uname -m
ppc64le
$ gcc --version
gcc (GCC) 6.4.0
...
$ python --version
Python 3.6.10 :: Anaconda, Inc.
$ nvcc --version | grep release
Cuda compilation tools, release 10.1, V10.1.243
$ export CUDA_PATH=$CUDAPATH
$ echo $CUDA_PATH
/sw/summit/cuda/10.1.243
$ touch test.h && cpp -dM test.h | grep linux && rm test.h
#define __linux 1
#define __linux__ 1
#define __gnu_linux__ 1
#define linux 1
$ wget https://github.com/genn-team/genn/archive/4.3.3.tar.gz
$ tar -zxf 4.3.3.tar.gz
$ cd genn-4.3.3
$ python setup.py bdist_wheel
...
swig error : Unrecognized option -relativeimport
Use 'swig -help' for available options.
error: command 'swig' failed with exit status 1
$ yum info swig | grep Version
Version     : 2.0.10

My guess is that the deployed version of SWIG is significantly older than the one expected by GeNN 4.3.3.

@neworderofjamie
Contributor

To be clear, is that the output from summit or summit-dev? Either way, I can see no reason why GeNN wouldn't work on this system. Could you try running one of GeNN's own example projects, e.g. https://github.com/genn-team/genn/tree/master/userproject/PotjansMicrocircuit_project ? If that works, there should be no problem with Brian2GeNN.

One thing to note is that, slightly confusingly, Brian2GeNN does not require GeNN's Python interface, so it doesn't need SWIG!

@qhaas

qhaas commented Sep 28, 2020

> To be clear, is that the output from summit or summit-dev?

Summit, specifically a login node. Didn't realize people were still using summit-dev.

> Brian2GeNN does not require GeNN's python interface so doesn't need SWIG!

Good to know, but just for the sake of completeness, upgrading to the latest release version of SWIG (4.0.x) clears the swig error, per the swig documentation:

$ wget https://github.com/swig/swig/archive/v4.0.2.tar.gz
$ tar -xf v4.0.2.tar.gz
$ cd swig-4.0.2
$ install -d ${HOME}/swig
$ ./autogen.sh
$ ./configure --prefix=${HOME}/swig
$ make -j$(nproc)
$ install -d ${HOME}/swig
$ make install
$ export PATH=${HOME}/swig/bin:${PATH}

Now, back to GeNN...

$ install -d ~/genn
$ make PREFIX=${HOME}/genn -j $(nproc)
...
In file included from code_generator/generateAll.cc:12:0:
include/genn/third_party/path.h: In member function 'filesystem::path filesystem::path::make_absolute() const':
include/genn/third_party/path.h:80:19: error: 'PATH_MAX' was not declared in this scope
         char temp[PATH_MAX];
                   ^~~~~~~~
include/genn/third_party/path.h:81:37: error: 'temp' was not declared in this scope
         if (realpath(str().c_str(), temp) == NULL)

I haven't isolated the reason why defined(__linux) is 0 in 'include/genn/third_party/path.h' yet, but forcing the include of <linux/limits.h> clears the issue:

$ make PREFIX=${HOME}/genn
$ make PREFIX=${HOME}/genn install
...
$ ls ~/genn/bin
genn-buildmodel.sh  genn-create-user-project.sh
$ ls ~/genn/lib
libgenn.a  libgenn_cuda_backend.a  libgenn_single_threaded_cpu_backend.a

@neworderofjamie
Contributor

That sounds positive - I would be interested to know if defined(__linux__) works any better.

@qhaas

qhaas commented Sep 28, 2020

> I would be interested to know if defined(__linux__) works any better.

It does: if I replace __linux with __linux__, it builds without forcing the include. Thanks!
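
For anyone who needs the same workaround on a local checkout before picking up a fixed release, the substitution can be sketched with sed. patch_linux_guard is a hypothetical helper; the header path is taken from the compiler error earlier in this thread, and the exact spelling of the guard, defined(__linux), is assumed from the comment above rather than checked against every GeNN version.

```shell
# Hypothetical helper: rewrite `defined(__linux)` to `defined(__linux__)`
# in the given file. Guards already spelled `defined(__linux__)` are left
# untouched because the pattern requires ")" directly after "__linux".
# A backup copy is kept with a .bak suffix.
patch_linux_guard() {
    sed -i.bak 's/defined(__linux)/defined(__linux__)/g' "$1"
}

# Path as reported in the compiler error; run from the top of the GeNN tree.
header=include/genn/third_party/path.h
if [ -f "$header" ]; then
    patch_linux_guard "$header"
    grep -n '__linux' "$header"   # show the guard lines after patching
fi
```

After patching, rebuilding with the same make command should no longer need a forced include of linux/limits.h, if the guard was the only problem.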

@neworderofjamie
Contributor

Good to know! I'll fix that right away but will leave this open in case you have further issues running Brian2GeNN.


4 participants