This repository has been archived by the owner on Jun 8, 2023. It is now read-only.

Changes to support async execution on GPU with OpenACC #75

Draft: wants to merge 2 commits into base: master
6 changes: 3 additions & 3 deletions src/mod2c_core/nocpout.c
@@ -1950,7 +1950,7 @@ void bablk(ba, type, q1, q2)
" _thread = _ml->_thread;\n"
" double * _nt_data = _nt->_data;\n"
" double * _vec_v = _nt->_actual_v;\n"
-" int stream_id = _nt->stream_id;\n"
+" int stream_id = _nt->streams[_nt->stream_id];\n"
" #if LAYOUT == 1 /*AoS*/\n"
Collaborator

@iomaganaris: remind me, since MOD2C remains OpenACC-only (no OpenMP): is this change actually required in MOD2C? Using _nt->stream_id directly as the async launch stream for OpenACC works fine, right? 🤔

Contributor Author

It does, yes. I just changed this to match the form needed by OpenMP, for homogeneity.
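
For illustration, here is a minimal sketch of the two forms being compared. The struct, its field types, the table size, and all function names are assumptions made up for this example; only the expressions `_nt->stream_id` and `_nt->streams[_nt->stream_id]` come from the diff. The OpenACC directives are guarded by `_OPENACC`, so the loop simply runs serially when compiled without OpenACC.

```c
#include <assert.h>

#define N_STREAMS 4 /* assumed table size, for the sketch only */

/* Hypothetical stand-in for the thread struct; only the two fields
   named in the diff are modelled, and their types are assumed. */
typedef struct {
    int stream_id;          /* per-thread index */
    int streams[N_STREAMS]; /* assumed table of async queue ids */
} ThreadSketch;

/* Old form: the index itself is used as the async queue. */
static int queue_direct(const ThreadSketch* nt) {
    return nt->stream_id;
}

/* New form: the index selects an entry in the streams table. */
static int queue_via_table(const ThreadSketch* nt) {
    assert(nt->stream_id >= 0 && nt->stream_id < N_STREAMS);
    return nt->streams[nt->stream_id];
}

/* Multiply n values by f. With an OpenACC compiler the loop is
   enqueued on the given async queue and then waited on; without
   OpenACC the directives are skipped and the loop runs on the host. */
static void scale_async(double* v, int n, double f, int queue) {
    (void)queue; /* unused when built without OpenACC */
#ifdef _OPENACC
#pragma acc parallel loop copy(v[0:n]) async(queue)
#endif
    for (int i = 0; i < n; ++i) {
        v[i] *= f;
    }
#ifdef _OPENACC
#pragma acc wait(queue)
#endif
}
```

Either helper yields an integer that can be passed to `async(...)`; the sketch only shows that the table lookup adds one level of indirection and leaves the OpenACC launch itself unchanged.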

Collaborator

Ok.

" for (_iml = 0; _iml < _cntml_actual; ++_iml) {\n"
" _p = _ml->_data + _iml*_psize; _ppvar = _ml->_pdata + _iml*_ppsize;\n"
@@ -2975,7 +2975,7 @@ void emit_net_receive_buffering_code() {

sprintf(buf, "\
\n int _di;\
-\n int stream_id = _nt->stream_id;\
+\n int stream_id = _nt->streams[_nt->stream_id];\
\n Point_process* _pnt = _nt->pntprocs;\
\n int _pnt_length = _nt->n_pntproc - _nrb->_pnt_offset;\
\n int _displ_cnt = _nrb->_displ_cnt;\
@@ -3090,7 +3090,7 @@ static void emit_nrn_watch_check_code() {
" _thread = _ml->_thread;\n"
" double * _nt_data = _nt->_data;\n"
" double * _vec_v = _nt->_actual_v;\n"
-" int stream_id = _nt->stream_id;\n"
+" int stream_id = _nt->streams[_nt->stream_id];\n"
);
/* for gpu, performance may be better factored into below ITERATE */
Lappendstr(procfunc, "\n"