Conversation
This reverts commit e22621e.
Waiting for better days with NEURON / CoreNEURON.
Can one of the admins verify this patch?

{
    MPI_Comm_rank(comm, &mpi_rank);
    if (mpi_rank == server_rank) {
        server_thr = std::thread([&](){
A few quick comments:
- I don't think we want to spawn the thread (performance & pinning aspects).
- Another aspect is mixing two threading models: OpenMP in CoreNEURON and std::thread.
I think we can afford one thread on one rank before doing hours of computation.
Is it a problem, given that this thread uses no OpenMP and the OpenMP parts use no std::thread?
(Now I see where the Slack discussion came from.) I guess it's clear now that OpenMP is threading. And I somewhat agree with @pramodk that the thread is not needed here. Wanting to run the logging in its own thread seems like premature optimization that adds unnecessary complications. I would say for now let's just skip threading.
coreneuron/mpi/spdlog_mpi_logger.h
MPI_Probe(MPI_ANY_SOURCE, tag, comm, &status);
int msg_size = 0;
MPI_Get_count(&status, MPI_BYTE, &msg_size);
auto buf = std::make_unique<char[]>(msg_size);
MPI_Recv(buf.get(), msg_size, MPI_PACKED, MPI_ANY_SOURCE, tag, comm, MPI_STATUS_IGNORE);

int position = 0;
int level = 0;
MPI_Unpack(buf.get(), msg_size, &position, &level, 1, MPI_INT, comm);
unsigned long payload_size = 0;
MPI_Unpack(buf.get(), msg_size, &position, &payload_size, 1, MPI_UNSIGNED_LONG, comm);
std::string payload(payload_size, ' ');
MPI_Unpack(buf.get(), msg_size, &position, &payload[0], payload_size, MPI_BYTE, comm);
std::cout << "[" << level << "] " << payload << std::endl;
- MPI_Probe and MPI_Recv are blocking calls.
- It's true that these run under a separate thread, so one could argue they won't block the main thread.
- But note that this requires enabling thread safety in the MPI library (which comes with a cost).
- I wouldn't claim every MPI library is fully thread safe (e.g. regarding MPI_THREAD_MULTIPLE and the related performance penalty).
- (I understand that collecting logs at server_rank is one policy, but this won't be scalable.)
Yes, I made them blocking because this thread has nothing else to do.
Hm, I will look into MPI thread safety, but it seems unrelated to me here.
Why is it not scalable? Do you have an idea for another policy?
{
public:
    // Empty logger
    explicit mpi_logger(std::string logger_name, int server_rank_ = 0, MPI_Comm comm_ = MPI_COMM_WORLD)
What do you mean by "server rank"?
server_thr = std::thread([&](){
    int stopped = 1;
    MPI_Finalized(&stopped);
    while (!stopped) {
OK, I see how you thought about it. May I suggest an alternate way to do this?
The more MPI way of doing parallelism is that all ranks execute essentially the same program. This would mean that in a given piece of code where you'd want to log, all ranks would call mpi_logger.log(...) (or something similar). The log function could then do two different things:
- You could go the serialization route: You write a loop over the ranks and serialize with barriers the output, printing at each iteration the log of one rank and protecting it with a barrier. This way you ensure that the ranks don't write over each other, you can then also add fancy things like masks for ranks to print or mute, etc.
- You could go the MPI_Gather route: you first gather the (potential) logs from all ranks to rank 0 and print from there. This could possibly be the more efficient way of doing it, but I'm not sure right now.
I'm sure @pramodk has many improvements over my ideas, but I think this will generally be a better approach than the method here. By the way, the method you use also makes perfect sense, and I've seen it (maybe even implemented it in a previous life) in client-server architectures.
Final question for discussion: Would the more MPI way of doing it work with spdlog? I don't see immediately why not, but I haven't thought very hard about this.
Force-pushed from a68d38e to 1e89fc5.
Should define it more and rewrite it.
No description provided.