Split up particles over multiple tiles for OpenMP #862
base: development
Yay, thank you for the fix! 🎉
Oh, CI indicates we did not initialize the right number of total particles yet (or forgot them in output)?
Only the |
Yay! 🚀 ✨ Yes, I think it is actually not thread safe... Let's set those two controls to
Needs some further work, e.g., on testing GIL-free Python, to support threading of Python part.
8112176 to be2027f
}

if (n_logical < nthreads) {
    amrex::Abort("ImpactParticleContainer::prepare() "
@atmyers I added more concrete guidance for users here, can you please double-check it?
In which situations would we not be able to find enough tiles? Is there other guidance we should give, or can we make it a high-priority warning and "just" be less parallel (i.e., less efficient in parallel)?
Currently, the code will not just be inefficient, it will give incorrect results if the number of tiles according to the tile_size is fewer than the number of threads. The issue is that the copyParticles routine in AMReX does not copy the particles that aren't on a valid tile. This resulted in some of the particles not getting written out in some examples (since we use copyParticles to a pinned_pc in IO).
This can be changed in AMReX; I'd suggest changing this to a warning rather than an Abort at that point.
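For illustration, the tile-count check discussed above could look roughly like the sketch below. It is standalone C++ and does not use AMReX: `count_tiles` and `enough_tiles` are hypothetical helper names, and the ceiling-division tile count is an assumption about how a box is cut into tiles, not AMReX's actual tiling code. The warning path reflects the suggestion above and would only be safe once copyParticles handles this case.

```cpp
#include <array>
#include <cassert>
#include <cstdio>

// Hypothetical helper (not an AMReX API): number of tiles a box with
// `cells` cells per dimension is cut into when each tile spans at most
// `tile_size` cells per dimension (ceiling division in each direction).
inline long count_tiles(std::array<int, 3> const& cells,
                        std::array<int, 3> const& tile_size)
{
    long n = 1;
    for (int d = 0; d < 3; ++d) {
        assert(cells[d] > 0 && tile_size[d] > 0);
        n *= (cells[d] + tile_size[d] - 1) / tile_size[d];
    }
    return n;
}

// Hypothetical helper: returns true if every thread can own at least
// one tile. Otherwise it prints a warning instead of aborting.
inline bool enough_tiles(long n_logical, int nthreads)
{
    if (n_logical < nthreads) {
        std::fprintf(stderr,
                     "warning: only %ld tile(s) for %d thread(s); "
                     "consider a smaller tile size or fewer threads\n",
                     n_logical, nthreads);
        return false;
    }
    return true;
}
```

With these assumptions, an 8x8x8 box with a 4x4x4 tile size yields 8 tiles, which is enough for 4 threads, while a 10x1x1 box with a 4x1x1 tile size yields only 3 tiles and triggers the warning.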
Thank you, I understand the logic way better now 🙏
@atmyers I see segfaults on Windows right now, which is one of the few runners compiling with Can you compile in the same mode please and run valgrind on the failing example(s)? I suspect there is an actual bug somewhere.
I now get, on this example, the following timings for track_particles on 1, 2, and 4 threads: 1.893, 1.017, and 0.5939.
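As a quick reading of these timings, the sketch below computes speedup and parallel efficiency relative to the first entry. `Scaling` and `strong_scaling` are hypothetical names introduced here, and the helper assumes the first timing is the baseline run. For the numbers above it gives a speedup of about 1.86 on 2 threads and about 3.19 on 4 threads, i.e., roughly 93% and 80% parallel efficiency.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One row of a strong-scaling table.
struct Scaling {
    int threads;
    double speedup;     // baseline time divided by this run's time
    double efficiency;  // speedup divided by the relative thread count
};

// Hypothetical helper: treats the first entry as the baseline run.
inline std::vector<Scaling> strong_scaling(std::vector<int> const& threads,
                                           std::vector<double> const& timings)
{
    assert(threads.size() == timings.size() && !threads.empty());
    std::vector<Scaling> out;
    for (std::size_t i = 0; i < threads.size(); ++i) {
        double const speedup = timings[0] / timings[i];
        double const relative_threads =
            static_cast<double>(threads[i]) / threads[0];
        out.push_back({threads[i], speedup, speedup / relative_threads});
    }
    return out;
}
```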
Fix #847