MPI_TYPE_INDEXED + MPI_SEND/RECV slow with older infiniband network? #12209

Open
chhu opened this issue Jan 3, 2024 · 4 comments

chhu commented Jan 3, 2024

Related to #12202, but without CUDA. On our shared-memory system (2x EPYC) MPI_TYPE_INDEXED performs as fast as expected, but as soon as our 40 Gbit InfiniBand is involved, performance drops by a factor of 2-5. This does not happen with the same OMPI and linear buffers (plain arrays).

The bandwidth and latency of the IB fabric itself are high and behave as expected.

I do not see this behavior on our big HPC system with 100G IB, even with the same OMPI. Is there something I can tune? How does OMPI transmit indexed types: a single request per block, or does it gather the data into a linear buffer first?
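
For reference, a minimal sketch of the kind of exchange I mean (block count, block length, and stride are purely illustrative, not our actual application code):

```c
/* Sketch: exchange every other block of a double array, described by
 * MPI_Type_indexed, with plain MPI_Send/MPI_Recv. Run with >= 2 ranks. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    enum { NBLOCKS = 1024, BLOCKLEN = 16, STRIDE = 32 };
    int blocklens[NBLOCKS], displs[NBLOCKS];
    for (int i = 0; i < NBLOCKS; ++i) {
        blocklens[i] = BLOCKLEN;       /* 16 doubles per block    */
        displs[i]    = i * STRIDE;     /* gaps between the blocks */
    }

    MPI_Datatype indexed;
    MPI_Type_indexed(NBLOCKS, blocklens, displs, MPI_DOUBLE, &indexed);
    MPI_Type_commit(&indexed);

    double *buf = calloc(NBLOCKS * STRIDE, sizeof(double));

    /* Rank 0 sends the strided data to rank 1; one send covers all blocks. */
    if (rank == 0)
        MPI_Send(buf, 1, indexed, 1, 0, MPI_COMM_WORLD);
    else if (rank == 1)
        MPI_Recv(buf, 1, indexed, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    MPI_Type_free(&indexed);
    free(buf);
    MPI_Finalize();
    return 0;
}
```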

Thanks!

What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch name and hash, etc.)

Tested on 3.1, 4.1, and latest 5.1.

Please describe the system on which you are running

See #12202

brminich (Member) commented:

Is the performance impact of using MPI_TYPE_INDEXED on the 100G IB HPC system negligible, or just smaller than on the 40G system?
I'd expect it to be noticeable on any system, since UCX does not use certain protocols when the data is not contiguous.

chhu commented Jan 25, 2024

The only thing I can say is that on 100G IB, MPI_TYPE_INDEXED has no notable impact, while on 40G it has a major one. Are you suggesting one should avoid non-contiguous data exchange?

brminich (Member) commented:

Yes, using non-contiguous data may imply some limitations on the MPI/UCX/network protocols.

chhu commented Jan 26, 2024

Hmm, maybe it would be a nice feature to linearize the data into a contiguous buffer before the exchange, and perhaps let the user control this via a threshold setting?
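
Something along those lines, as a rough sketch of what an application can already do by hand with MPI_Pack/MPI_Unpack (the threshold value is made up, and this is not how Open MPI handles derived datatypes internally):

```c
/* Sketch of the "linearize first" idea: pack the indexed data into a
 * contiguous scratch buffer and send it as MPI_PACKED, falling back to
 * the plain derived-datatype send below a (hypothetical) size threshold. */
#include <mpi.h>
#include <stdlib.h>

#define LINEARIZE_THRESHOLD (64 * 1024)   /* bytes; illustrative tunable */

void send_indexed(const void *buf, MPI_Datatype indexed,
                  int dest, int tag, MPI_Comm comm)
{
    int packed_size;
    MPI_Pack_size(1, indexed, comm, &packed_size);

    if (packed_size < LINEARIZE_THRESHOLD) {
        /* Small message: let the library deal with the derived datatype. */
        MPI_Send(buf, 1, indexed, dest, tag, comm);
        return;
    }

    /* Large message: gather into a contiguous buffer first, so the
     * transport only ever sees contiguous bytes. */
    void *scratch = malloc(packed_size);
    int position = 0;
    MPI_Pack(buf, 1, indexed, scratch, packed_size, &position, comm);
    MPI_Send(scratch, position, MPI_PACKED, dest, tag, comm);
    free(scratch);
}

void recv_indexed(void *buf, MPI_Datatype indexed,
                  int src, int tag, MPI_Comm comm)
{
    int packed_size;
    MPI_Pack_size(1, indexed, comm, &packed_size);

    if (packed_size < LINEARIZE_THRESHOLD) {
        MPI_Recv(buf, 1, indexed, src, tag, comm, MPI_STATUS_IGNORE);
        return;
    }

    /* Mirror the sender: receive the byte stream, then scatter it back
     * into the strided layout with MPI_Unpack. */
    void *scratch = malloc(packed_size);
    MPI_Recv(scratch, packed_size, MPI_PACKED, src, tag, comm,
             MPI_STATUS_IGNORE);
    int position = 0;
    MPI_Unpack(scratch, packed_size, &position, buf, 1, indexed, comm);
    free(scratch);
}
```

As long as both sides construct the same datatype, MPI_Pack_size should return the same value on both ends of a homogeneous system, so sender and receiver make the same threshold decision.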
