
Plan merge of non-blocking collectives branch (nbc) into master #590

Open
4 tasks done
rupertnash opened this issue Oct 20, 2015 · 13 comments

@rupertnash
Member

The work I did as part of CRESTA to use MPI 3.0 NBCs instead of the PhasedBroadcast machinery should be reexamined and merged into master.

Things to do before merge

  1. Update the branch
  2. Switch from MPI_Wait to MPI_Test for the monitoring parts
  3. Make it an optional feature in CMake or decide to make it compulsory (my preferred option)
  4. Tidy the phased broadcast stuff

@mobernabeu: I would appreciate your opinion on point 3.

@mdavezac / @garymacindoe: I would appreciate your comments on any problems this may cause for the RBC branch merge.
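For context, the gist of the change is roughly the following (a sketch only, with illustrative function and variable names, not the actual branch code): the hand-rolled, multi-step PhasedBroadcast tree is replaced by a single MPI 3.0 non-blocking collective plus a later completion call.

```cpp
#include <mpi.h>

// Sketch: instead of propagating a value down a hand-rolled spanning tree over
// several time steps (PhasedBroadcast), issue one MPI 3.0 non-blocking
// collective and complete it later. Names here are illustrative only.
void StartBroadcast(int& value, MPI_Comm comm, MPI_Request& request)
{
    // The root's value is delivered to every rank; the call returns immediately.
    MPI_Ibcast(&value, 1, MPI_INT, /*root=*/0, comm, &request);
}

void FinishBroadcast(MPI_Request& request)
{
    // Block until the broadcast has completed on this rank
    // (point 2 above is about replacing this with MPI_Test polling).
    MPI_Wait(&request, MPI_STATUS_IGNORE);
}
```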

@mdavezac

@rupertnash, it might be best to open a pull request for that branch. That makes it easier to see what the changes are, whether the tests pass, etc.

rupertnash self-assigned this Oct 20, 2015
mdavezac mentioned this issue Oct 20, 2015
@mobernabeu
Contributor

I think it's great that we finally got around to merging these improvements, @rupertnash.

WRT point 3, I agree with you. I would go as far as deleting the PhasedBroadcast infrastructure altogether. I don't see much value in maintaining both implementations.

@rupertnash
Member Author

@mobernabeu good! The only sad thing is that we can't entirely remove the horror because of the visualisation stuff. That can't be replaced by an MPI collective because the sizes of the inputs and the output of the reduction function are not the same.
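(Roughly, for anyone not familiar with the constraint: an MPI user-defined reduction operator works element-wise on buffers of a single datatype and count, so it cannot shrink or grow the data. A minimal sketch, with an illustrative operator rather than the vis code:)

```cpp
#include <algorithm>
#include <mpi.h>

// The signature MPI requires for a user-defined reduction operator: invec and
// inoutvec must both hold *len elements of the same *datatype, so the inputs
// and the output of the reduction are necessarily the same size.
void MaxReduce(void* invec, void* inoutvec, int* len, MPI_Datatype* datatype)
{
    double* in = static_cast<double*>(invec);
    double* inout = static_cast<double*>(inoutvec);
    for (int i = 0; i < *len; ++i)
        inout[i] = std::max(in[i], inout[i]);  // element-wise, size-preserving
}

// Usage: MPI_Op op; MPI_Op_create(&MaxReduce, /*commute=*/1, &op);
```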

And we're definitely OK with requiring MPI 3.0? I think it's reasonable.

@mobernabeu
Contributor

What a great reason to deprecate the runtime visualisation stuff, eh? 😄 I would be happy to put that code in a separate branch and move on without it.

I remember there were some attempts at decoupling runtime visualisation from HemeLB during CRESTA. Any comments @djgroen?

I'm fine with moving to MPI 3.0. Is it safe to assume that most supercomputers currently support this version of the standard?

@rupertnash
Member Author

I have just had an idea: I can propose a data science MSc project to sort out the vis for HemeLB. Rip out the hand-made stuff and put something real in (e.g. VTK/ParaView/the DLR thing), and then get rid of the in-house code.

@mobernabeu
Contributor

👍

@mdavezac

Since I was about to start some MPI work, I figured I would try and see if I could get the nbc branch to work with master.

So nbc_rebase is a rebase of this branch onto master (I think rebasing will make it easier to then merge nbc with the red-blood-cell branch). The tests pass, but the regression test still segfaults when run with more than one processor.

I may or may not have time to figure out what is going wrong, but the rebase is there if you want to use it, @rupertnash.

@rupertnash
Member Author

Thanks, I started looking into this. There was a problem with the debugger which I've fixed on this branch.

I guess the PR #591 is now defunct?

@rupertnash
Member Author

I've also chased up the OpenMPI bug. It has been fixed on GitHub and will be released in v1.10.1.

This commit is the fix from which you may be able to patch your install:
open-mpi/ompi@c99a8a5

@rupertnash
Member Author

Hi chaps,

So I just had a think about this and have the following design for the WhateverTesters, apropos of my point 2 above (Switch from MPI_Wait to MPI_Test for the monitoring parts). I'd appreciate your thoughts on this before I implement. The table shows the actions for each Step of a Phase depending on whether a collective is in flight or not at that moment.

| Step | running = false | running = true |
| --- | --- | --- |
| BeginAll | Start here | |
| BeginPhase / RequestComms | NOP | NOP |
| PreSend | Check local stability | NOP |
| Send | MPI_Iallreduce; set running = true | NOP |
| PreWait / PreRecv | NOP | NOP |
| Wait | NOP | MPI_Test; if done, set running = false |
| EndPhase / PostReceive | SetState | NOP |
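To make that concrete, something along these lines (a rough sketch; the class and member names are placeholders, not the real HemeLB types):

```cpp
#include <mpi.h>

// Sketch of the per-step actions in the table above; StabilityTester and its
// members are placeholder names, not the actual classes.
class StabilityTester
{
public:
    // PreSend: only gather local data when no collective is in flight.
    void PreSend()
    {
        if (!running)
            localStability = ComputeLocalStability();
    }

    // Send: start the non-blocking reduction and mark it as running.
    void Send(MPI_Comm comm)
    {
        if (!running)
        {
            MPI_Iallreduce(&localStability, &globalStability, 1, MPI_INT,
                           MPI_MIN, comm, &request);
            running = true;
        }
    }

    // Wait: poll with MPI_Test instead of blocking in MPI_Wait.
    void Wait()
    {
        if (running)
        {
            int done = 0;
            MPI_Test(&request, &done, MPI_STATUS_IGNORE);
            if (done)
                running = false;
        }
    }

    // EndPhase / PostReceive: apply the reduced result once nothing is in flight.
    void EndPhase()
    {
        if (!running)
            SetState(globalStability);
    }

private:
    int ComputeLocalStability();   // "Check local stability"
    void SetState(int stability);  // act on the globally reduced result

    bool running = false;
    int localStability = 0;
    int globalStability = 0;
    MPI_Request request = MPI_REQUEST_NULL;
};
```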

The SteeringComponent is trickier because ideally it should only execute on demand, but I don't really know how MPI feels about having all the ranks except the steering one waiting for the collective to run. I'm going to talk to one of the Epigram folks about this. My current position is that I'll refactor the current CollectiveAction into a waiting and a testing version.

@rupertnash
Member Author

As discussed on Slack, point 4 is to be resolved by removing the vis and hence the need for the tangle of horrors that is the phased broadcast.

@rupertnash
Member Author

With these decisions made, this ticket can be closed.

@rupertnash
Member Author

I was just about to submit a PR, but I noticed an odd performance issue: the diffTest simulation takes a few seconds to run, as normal, but over 20 s to shut down. Something is clearly wrong; I will investigate next week.
