treematch hang #1183
@ggouaillardet a quick look at the test seems to indicate that the test itself is broken. On my machine the test does not deadlock or segfault; instead it triggers a nice MPI abort due to a wrong argument in MPI_Dist_graph_create. If I look at how this function is called, for cnt = 9 I have: …

The last destination is clearly wrong (larger than the number of nodes), but it should be valid, as the total number of degrees is 11.
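For reference, here is a minimal sketch of a well-formed call; the ring graph, the helper name, and the values are hypothetical, not the failing test. The invariant the broken test violated is that every entry of destinations[] must be a valid rank in comm_old:

```c
#include <mpi.h>

/* hypothetical helper: each rank contributes one edge, rank -> rank+1 (mod csize) */
int build_ring(MPI_Comm comm_old, MPI_Comm *comm_dist_graph)
{
    int rank, csize;
    MPI_Comm_rank(comm_old, &rank);
    MPI_Comm_size(comm_old, &csize);

    int sources[1]      = { rank };               /* this rank describes itself as a source */
    int degrees[1]      = { 1 };                  /* with exactly one outgoing edge */
    int destinations[1] = { (rank + 1) % csize }; /* the modulo keeps it a legal rank */
    int weights[1]      = { 1 };

    /* a destination >= csize here would be erroneous, and is what made the
       library abort instead of reordering */
    return MPI_Dist_graph_create(comm_old, 1, sources, degrees, destinations,
                                 weights, MPI_INFO_NULL, 0, comm_dist_graph);
}
```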
@bosilca good catch!
@bosilca I pushed open-mpi/ompi-tests@0c77d1ba7b65896a84afba0405e2477dcbaa2875 in order to fix the issue you identified.

The hang occurs with 4 MPI tasks on a 4-core box, or on a 16-core box if you run with … Here is a patch for that (it does include some code factorization):

```diff
diff --git a/ompi/mca/topo/treematch/topo_treematch_dist_graph_create.c b/ompi/mca/topo/treematch/topo_treematch_dist_graph_create.c
index 6c31d1f..0b2b249 100644
--- a/ompi/mca/topo/treematch/topo_treematch_dist_graph_create.c
+++ b/ompi/mca/topo/treematch/topo_treematch_dist_graph_create.c
@@ -389,30 +389,28 @@ int mca_topo_treematch_dist_graph_create(mca_topo_base_module_t* topo_module,
         fprintf(stderr,"========== Centralized Reordering ========= \n");
         local_pattern = (double *)calloc(size*size,sizeof(double));
-        if( true == topo->weighted ) {
-            for(i = 0; i < topo->indegree ; i++)
-                local_pattern[topo->in[i]] += topo->inw[i];
-            for(i = 0; i < topo->outdegree ; i++)
-                local_pattern[topo->out[i]] += topo->outw[i];
-            if (OMPI_SUCCESS != (err = comm_old->c_coll.coll_gather(MPI_IN_PLACE, size, MPI_DOUBLE,
-                                                                    local_pattern, size, MPI_DOUBLE,
-                                                                    0, comm_old,
-                                                                    comm_old->c_coll.coll_gather_module)))
-                return err;
-        }
     } else {
         local_pattern = (double *)calloc(size,sizeof(double));
-        if( true == topo->weighted ) {
-            for(i = 0; i < topo->indegree ; i++)
-                local_pattern[topo->in[i]] += topo->inw[i];
-            for(i = 0; i < topo->outdegree ; i++)
-                local_pattern[topo->out[i]] += topo->outw[i];
-            if (OMPI_SUCCESS != (err = comm_old->c_coll.coll_gather(local_pattern, size, MPI_DOUBLE,
-                                                                    NULL,0,0,
-                                                                    0, comm_old,
-                                                                    comm_old->c_coll.coll_gather_module)))
-                return err;
-        }
+    }
+    if( true == topo->weighted ) {
+        for(i = 0; i < topo->indegree ; i++)
+            local_pattern[topo->in[i]] += topo->inw[i];
+        for(i = 0; i < topo->outdegree ; i++)
+            local_pattern[topo->out[i]] += topo->outw[i];
+    }
+    if(0 == rank) {
+        err = comm_old->c_coll.coll_gather(MPI_IN_PLACE, size, MPI_DOUBLE,
+                                           local_pattern, size, MPI_DOUBLE,
+                                           0, comm_old,
+                                           comm_old->c_coll.coll_gather_module);
+    } else {
+        err = comm_old->c_coll.coll_gather(local_pattern, size, MPI_DOUBLE,
+                                           NULL,0,0,
+                                           0, comm_old,
+                                           comm_old->c_coll.coll_gather_module);
+    }
+    if (OMPI_SUCCESS != err) {
+        return err;
     }
     if( rank == local_procs[0]) {
```

If the fix looks good to you, I guess a similar one should be applied around line 731 (Partially Distributed Reordering). Note that when running with …
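For illustration, here is the same gather pattern as a standalone program; this is a sketch only, with plain MPI_Gather standing in for OMPI's internal c_coll machinery. The point of the factorization above is that every rank reaches the collective unconditionally, with the root receiving in place while the others only send:

```c
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    int rank, size, i;
    double *local_pattern;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* rank 0 owns the full size*size matrix; every other rank only its own row */
    local_pattern = calloc((0 == rank ? (size_t)size * size : (size_t)size),
                           sizeof(double));
    for (i = 0; i < size; i++)
        local_pattern[i] = (double)rank;  /* stand-in for the weighted in/out degrees */

    /* every rank takes part in the collective, whether or not it has weights:
       the root gathers in place, the others send their row */
    if (0 == rank) {
        MPI_Gather(MPI_IN_PLACE, size, MPI_DOUBLE,
                   local_pattern, size, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    } else {
        MPI_Gather(local_pattern, size, MPI_DOUBLE,
                   NULL, 0, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    }

    if (0 == rank)
        printf("row %d starts with %g\n", size - 1, local_pattern[(size - 1) * size]);

    free(local_pattern);
    MPI_Finalize();
    return 0;
}
```

Making the call unconditional removes the possibility that some ranks enter the gather while others skip it based on state the other ranks do not share, which is the classic mismatch that shows up as a hang.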
@bosilca here is a second patch that keeps the debug output quiet unless it is explicitly requested:

```diff
diff --git a/ompi/mca/topo/treematch/topo_treematch_dist_graph_create.c b/ompi/mca/topo/treematch/topo_treematch_dist_graph_create.c
index 6c31d1f..d97b017 100644
--- a/ompi/mca/topo/treematch/topo_treematch_dist_graph_create.c
+++ b/ompi/mca/topo/treematch/topo_treematch_dist_graph_create.c
@@ -386,7 +386,9 @@ int mca_topo_treematch_dist_graph_create(mca_topo_base_module_t* topo_module,
      */
     if(0 == rank) {
+#ifdef __DEBUG__
         fprintf(stderr,"========== Centralized Reordering ========= \n");
+#endif
         local_pattern = (double *)calloc(size*size,sizeof(double));
         if( true == topo->weighted ) {
diff --git a/ompi/mca/topo/treematch/treematch/tm_kpartitioning.c b/ompi/mca/topo/treematch/treematch/tm_kpartitioning.c
index 3aaed6a..017f3ed 100644
--- a/ompi/mca/topo/treematch/treematch/tm_kpartitioning.c
+++ b/ompi/mca/topo/treematch/treematch/tm_kpartitioning.c
@@ -426,8 +426,7 @@ tree_t *kpartition_build_tree_from_topology(tm_topology_t *topology,double **com
   verbose_level = get_verbose_level();
   if(verbose_level>=INFO)
-    printf("Number of constraints: %d\n", nb_constraints);
-  printf("Number of constraints: %d, N=%d\n", nb_constraints, N);
+    printf("Number of constraints: %d, N=%d\n", nb_constraints, N);
   nb_cores=nb_processing_units(topology);
diff --git a/ompi/mca/topo/treematch/treematch/tm_tree.c b/ompi/mca/topo/treematch/treematch/tm_tree.c
index 0f41958..3305e2d 100644
--- a/ompi/mca/topo/treematch/treematch/tm_tree.c
+++ b/ompi/mca/topo/treematch/treematch/tm_tree.c
@@ -1611,7 +1611,8 @@ tree_t * build_tree_from_topology(tm_topology_t *topology, double **com_mat, int
   nb_constraints = check_constraints (topology, &constraints);
-  printf("nb_constraints = %d, N= %d; nb_processing units = %d\n",nb_constraints, N, nb_processing_units(topology));
+  if(verbose_level>=INFO)
+    printf("nb_constraints = %d, N= %d; nb_processing units = %d\n",nb_constraints, N, nb_processing_units(topology));
   if(N>nb_constraints){
     if(verbose_level >= CRITICAL){
```

BTW, any reason why you use …
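To make the two gating styles in that patch concrete, here is a minimal sketch; the INFO value and the get_verbose_level() body are placeholders, not the treematch originals:

```c
#include <stdio.h>

#define INFO 2                                   /* placeholder: treematch defines its own levels */
static int get_verbose_level(void) { return 0; } /* placeholder for the real lookup */

static void report(int nb_constraints, int N)
{
#ifdef __DEBUG__
    /* compile-time gate: absent from the binary unless built with -D__DEBUG__ */
    fprintf(stderr, "========== Centralized Reordering ========= \n");
#endif
    /* runtime gate: always compiled in, but silent unless verbosity is raised */
    if (get_verbose_level() >= INFO)
        printf("Number of constraints: %d, N=%d\n", nb_constraints, N);
}

int main(void)
{
    report(3, 4);
    return 0;
}
```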
The efforts to fix this issue unfold in https://github.com/bosilca/ompi/tree/topic/treematch
@bosilca will you be opening a PR for master to fix this in the near future (like this week)?
Let me check the status of the branch and I'll get back to you.
@bosilca the distgraph_test_4 test from the ibm test suite might hang depending on the machine topology.

With 4 MPI tasks, it works fine on a server with 2 sockets / 12 cores / 24 threads, but it hangs on my VM with 1 socket / 4 cores / 4 threads.

The inlined topology can be used to demonstrate the issue:
run as is: …
run with a simple topology: …
This works fine without the treematch module, though.

Note that I had to push a5440ad so the topology file is used by the treematch module (otherwise hwloc uses the topology file, but treematch uses the topology of the node).

Can you please have a look at this?