Remove major/minor from renumber_edgelist public functions. (#2116)

Partially addresses #2003.

The public non-detail API should use src, dst rather than major, minor; major, minor should be used in the detail namespace. The public non-detail `renumber_edgelist` functions currently use major, minor, and this PR fixes that. This PR also replaces row/col with src/dst in `single_gpu_renumber_edgelist_given_number_map`, as we're aiming to consistently use src/dst instead of row/col in our public C++ API.

This is possibly breaking, as this PR changes the public API (but I am not aware of any non-cugraph libraries directly calling renumber_edgelist).

Authors:
- Seunghwa Kang (https://github.com/seunghwak)

Approvers:
- Chuck Hastings (https://github.com/ChuckHastings)
- Brad Rees (https://github.com/BradReesWork)
- Joseph Nke (https://github.com/jnke2016)
- Rick Ratzel (https://github.com/rlratzel)

URL: #2116
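The src/dst versus major/minor convention described above can be illustrated with a small host-only sketch. It does not use cuGraph: `to_major_minor` and `example` are hypothetical names introduced for illustration, and plain `std::vector<int>` stands in for device memory. The selection rule mirrors what this PR does inside the `renumber_edgelist` implementation, where majors and minors are chosen from `store_transposed`.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper (not part of cuGraph): maps caller-facing (src, dst)
// pointers to the (major, minor) order used internally.
//   store_transposed == false: major = src, minor = dst
//   store_transposed == true : major = dst, minor = src
template <typename vertex_t>
std::pair<vertex_t*, vertex_t*> to_major_minor(vertex_t* srcs,
                                               vertex_t* dsts,
                                               bool store_transposed)
{
  return store_transposed ? std::make_pair(dsts, srcs) : std::make_pair(srcs, dsts);
}

// Usage sketch: the caller always passes (srcs, dsts) in the same order and
// only the flag changes.
inline void example()
{
  std::vector<int> srcs{0, 1, 2};
  std::vector<int> dsts{1, 2, 0};

  auto [major, minor] =
    to_major_minor(srcs.data(), dsts.data(), /*store_transposed=*/false);
  assert(major == srcs.data() && minor == dsts.data());

  auto [t_major, t_minor] =
    to_major_minor(srcs.data(), dsts.data(), /*store_transposed=*/true);
  assert(t_major == dsts.data() && t_minor == srcs.data());
}
```

With this shape, callers no longer need to swap arguments themselves; the swap lives in one place behind the public signature.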
rapidsai · Mar 22, 2022 · e61dac7
1 parent 6c3a469
commit e61dac7

Showing 12 changed files with 186 additions and 123 deletions.
diff --git a/cpp/include/cugraph/graph_functions.hpp b/cpp/include/cugraph/graph_functions.hpp
@@ -47,8 +47,9 @@ struct renumber_meta_t<vertex_t, edge_t, multi_gpu, std::enable_if_t<!multi_gpu>
 /**
  * @brief renumber edgelist (multi-GPU)
  *
- * This function assumes that vertices and edges are pre-shuffled to their target processes using
- * the compute_gpu_id_from_vertex_t & compute_gpu_id_from_edge_t functors, respectively.
+ * This function assumes that vertices are pre-shuffled to their target processes and edges are
+ * pre-shuffled to their target processes and edge partitions using compute_gpu_id_from_vertex_t
+ * and compute_gpu_id_from_edge_t & compute_partition_id_from_edge_t functors, respectively.
  *
  * @tparam vertex_t Type of vertex identifiers. Needs to be an integral type.
  * @tparam edge_t Type of edge identifiers. Needs to be an integral type.
@@ -60,28 +61,26 @@ struct renumber_meta_t<vertex_t, edge_t, multi_gpu, std::enable_if_t<!multi_gpu>
  * This parameter can be used to include isolated vertices. Applying the
  * compute_gpu_id_from_vertex_t to every vertex should return the local GPU ID for this function to
  * work (vertices should be pre-shuffled).
- * @param edgelist_majors Pointers (one pointer per local graph adjacency matrix partition assigned
- * to this process) to edge source vertex IDs (if the graph adjacency matrix is stored as is) or
- * edge destination vertex IDs (if the transposed graph adjacency matrix is stored). Vertex IDs are
- * updated in-place ([INOUT] parameter). Edges should be pre-shuffled to their final target process
- * & matrix partition; i.e. applying the compute_gpu_id_from_edge_t functor to every (major, minor)
- * pair should return the GPU ID of this process and applying the compute_partition_id_from_edge_t
- * functor to every (major, minor) pair for a local matrix partition should return the partition ID
- * of the corresponding matrix partition.
- * @param edgelist_minors Pointers (one pointer per local graph adjacency matrix partition assigned
- * to this process) to edge destination vertex IDs (if the graph adjacency matrix is stored as is)
- * or edge source vertex IDs (if the transposed graph adjacency matrix is stored). Vertex IDs are
- * updated in-place ([INOUT] parameter). Edges should be pre-shuffled to their final target process
- * & matrix partition; i.e. applying the compute_gpu_id_from_edge_t functor to every (major, minor)
- * pair should return the GPU ID of this process and applying the compute_partition_id_from_edge_t
- * functor to every (major, minor) pair for a local matrix partition should return the partition ID
- * of the corresponding matrix partition.
- * @param edgelist_edge_counts Edge counts (one count per local graph adjacency matrix partition
- * assigned to this process).
+ * @param edgelist_srcs Pointers (one pointer per local edge partition assigned to this process) to
+ * edge source vertex IDs. Source IDs are updated in-place ([INOUT] parameter). Applying the
+ * compute_gpu_id_from_edge_t functor to every (destination ID, source ID) pair (if store_transposed
+ * = true) or (source ID, destination ID) pair (if store_transposed = false) should return the local
+ * GPU ID for this function to work (edges should be pre-shuffled). Applying the
+ * compute_partition_id_from_edge_t to every (destination ID, source ID) pair (if store_transposed =
+ * true) or (source ID, destination ID) pair (if store_transposed = false) should also return the
+ * corresponding edge partition ID. The best way to enforce this is to use
+ * shuffle_edgelist_by_gpu_id & groupby_and_count_edgelist_by_local_partition_id.
+ * @param edgelist_dsts Pointers (one pointer per local edge partition assigned to this process) to
+ * edge destination vertex IDs. Destination IDs are updated in-place ([INOUT] parameter).
+ * @param edgelist_edge_counts Edge counts (one count per local edge partition assigned to this
+ * process).
  * @param edgelist_intra_partition_segment_offsets If valid, store segment offsets within a local
  * graph adjacency matrix partition; a local partition can be further segmented by applying the
  * compute_gpu_id_from_vertex_t function to edge minor vertex IDs. This optional information is used
  * for further memory footprint optimization if provided.
+ * @param store_transposed Should be true if renumbered edges will be used to create a graph with
+ * store_transposed = true. Should be false if the edges will be used to create a graph with
+ * store_transposed = false.
  * @param do_expensive_check A flag to run expensive checks for input arguments (if set to `true`).
  * @return std::tuple<rmm::device_uvector<vertex_t>, renumber_meta_t<vertex_t, edge_t, multi_gpu>>
  * Tuple of labels (vertex IDs before renumbering) for the entire set of vertices (assigned to this
@@ -98,10 +97,11 @@ std::enable_if_t<
 renumber_edgelist(
   raft::handle_t const& handle,
   std::optional<rmm::device_uvector<vertex_t>>&& local_vertices,
-  std::vector<vertex_t*> const& edgelist_majors /* [INOUT] */,
-  std::vector<vertex_t*> const& edgelist_minors /* [INOUT] */,
+  std::vector<vertex_t*> const& edgelist_srcs /* [INOUT] */,
+  std::vector<vertex_t*> const& edgelist_dsts /* [INOUT] */,
   std::vector<edge_t> const& edgelist_edge_counts,
   std::optional<std::vector<std::vector<edge_t>>> const& edgelist_intra_partition_segment_offsets,
+  bool store_transposed,
   bool do_expensive_check = false);
 
 /**
@@ -115,13 +115,14 @@ renumber_edgelist(
  * handles to various CUDA libraries) to run graph algorithms.
  * @param vertices If valid, vertices in the graph to be renumbered. This parameter can be used to
  * include isolated vertices.
- * @param edgelist_majors Edge source vertex IDs (if the graph adjacency matrix is stored as is) or
- * edge destination vertex IDs (if the transposed graph adjacency matrix is stored). Vertex IDs are
- * updated in-place ([INOUT] parameter).
- * @param edgelist_minors Edge destination vertex IDs (if the graph adjacency matrix is stored as
- * is) or edge source vertex IDs (if the transposed graph adjacency matrix is stored). Vertex IDs
- * are updated in-place ([INOUT] parameter).
+ * @param edgelist_srcs A pointer to edge source vertex IDs. Source IDs are updated in-place
+ * ([INOUT] parameter).
+ * @param edgelist_dsts A pointer to edge destination vertex IDs. Destination IDs are updated
+ * in-place ([INOUT] parameter).
  * @param num_edgelist_edges Number of edges in the edgelist.
+ * @param store_transposed Should be true if renumbered edges will be used to create a graph with
+ * store_transposed = true. Should be false if the edges will be used to create a graph with
+ * store_transposed = false.
  * @param do_expensive_check A flag to run expensive checks for input arguments (if set to `true`).
  * @return std::tuple<rmm::device_uvector<vertex_t>, renumber_meta_t<vertex_t, edge_t, multi_gpu>>
  * Tuple of labels (vertex IDs before renumbering) for the entire set of vertices and meta-data
@@ -135,9 +136,10 @@ std::enable_if_t<
   std::tuple<rmm::device_uvector<vertex_t>, renumber_meta_t<vertex_t, edge_t, multi_gpu>>>
 renumber_edgelist(raft::handle_t const& handle,
                   std::optional<rmm::device_uvector<vertex_t>>&& vertices,
-                  vertex_t* edgelist_majors /* [INOUT] */,
-                  vertex_t* edgelist_minors /* [INOUT] */,
+                  vertex_t* edgelist_srcs /* [INOUT] */,
+                  vertex_t* edgelist_dsts /* [INOUT] */,
                   edge_t num_edgelist_edges,
+                  bool store_transposed,
                   bool do_expensive_check = false);
 
 /**

diff --git a/cpp/include/cugraph/utilities/cython.hpp b/cpp/include/cugraph/utilities/cython.hpp
@@ -598,9 +598,10 @@ std::unique_ptr<major_minor_weights_t<vertex_t, edge_t, weight_t>> call_shuffle(
 template <typename vertex_t, typename edge_t>
 std::unique_ptr<renum_tuple_t<vertex_t, edge_t>> call_renumber(
   raft::handle_t const& handle,
-  vertex_t* shuffled_edgelist_major_vertices /* [INOUT] */,
-  vertex_t* shuffled_edgelist_minor_vertices /* [INOUT] */,
+  vertex_t* shuffled_edgelist_src_vertices /* [INOUT] */,
+  vertex_t* shuffled_edgelist_dst_vertices /* [INOUT] */,
   std::vector<edge_t> const& edge_counts,
+  bool store_transposed,
   bool do_expensive_check,
   bool multi_gpu);
 

diff --git a/cpp/src/structure/create_graph_from_edgelist_impl.cuh b/cpp/src/structure/create_graph_from_edgelist_impl.cuh
@@ -295,21 +295,20 @@ create_graph_from_edgelist_impl(raft::handle_t const& handle,
 
   // 2. renumber
 
-  std::vector<vertex_t*> major_ptrs(col_comm_size);
-  std::vector<vertex_t*> minor_ptrs(major_ptrs.size());
+  std::vector<vertex_t*> src_ptrs(col_comm_size);
+  std::vector<vertex_t*> dst_ptrs(src_ptrs.size());
   for (int i = 0; i < col_comm_size; ++i) {
-    major_ptrs[i] =
-      store_transposed ? edgelist_dst_partitions[i].begin() : edgelist_src_partitions[i].begin();
-    minor_ptrs[i] =
-      store_transposed ? edgelist_src_partitions[i].begin() : edgelist_dst_partitions[i].begin();
+    src_ptrs[i] = edgelist_src_partitions[i].begin();
+    dst_ptrs[i] = edgelist_dst_partitions[i].begin();
   }
   auto [renumber_map_labels, meta] = cugraph::renumber_edgelist<vertex_t, edge_t, multi_gpu>(
     handle,
     std::move(local_vertices),
-    major_ptrs,
-    minor_ptrs,
+    src_ptrs,
+    dst_ptrs,
     edgelist_edge_counts,
-    edgelist_intra_partition_segment_offsets);
+    edgelist_intra_partition_segment_offsets,
+    store_transposed);
 
   // 3. create a graph
 
@@ -367,9 +366,10 @@ create_graph_from_edgelist_impl(raft::handle_t const& handle,
     std::tie(*renumber_map_labels, meta) = cugraph::renumber_edgelist<vertex_t, edge_t, multi_gpu>(
       handle,
       std::move(vertices),
-      store_transposed ? edgelist_cols.data() : edgelist_rows.data(),
-      store_transposed ? edgelist_rows.data() : edgelist_cols.data(),
-      static_cast<edge_t>(edgelist_rows.size()));
+      edgelist_rows.data(),
+      edgelist_cols.data(),
+      static_cast<edge_t>(edgelist_rows.size()),
+      store_transposed);
   }
 
   vertex_t num_vertices{};

diff --git a/cpp/src/structure/renumber_edgelist_impl.cuh b/cpp/src/structure/renumber_edgelist_impl.cuh
@@ -617,12 +617,16 @@ std::enable_if_t<
 renumber_edgelist(
   raft::handle_t const& handle,
   std::optional<rmm::device_uvector<vertex_t>>&& local_vertices,
-  std::vector<vertex_t*> const& edgelist_majors /* [INOUT] */,
-  std::vector<vertex_t*> const& edgelist_minors /* [INOUT] */,
+  std::vector<vertex_t*> const& edgelist_srcs /* [INOUT] */,
+  std::vector<vertex_t*> const& edgelist_dsts /* [INOUT] */,
   std::vector<edge_t> const& edgelist_edge_counts,
   std::optional<std::vector<std::vector<edge_t>>> const& edgelist_intra_partition_segment_offsets,
+  bool store_transposed,
   bool do_expensive_check)
 {
+  auto edgelist_majors = store_transposed ? edgelist_dsts : edgelist_srcs;
+  auto edgelist_minors = store_transposed ? edgelist_srcs : edgelist_dsts;
+
   auto& comm               = handle.get_comms();
   auto const comm_size     = comm.get_size();
   auto const comm_rank     = comm.get_rank();
@@ -870,11 +874,15 @@ std::enable_if_t<
   std::tuple<rmm::device_uvector<vertex_t>, renumber_meta_t<vertex_t, edge_t, multi_gpu>>>
 renumber_edgelist(raft::handle_t const& handle,
                   std::optional<rmm::device_uvector<vertex_t>>&& vertices,
-                  vertex_t* edgelist_majors /* [INOUT] */,
-                  vertex_t* edgelist_minors /* [INOUT] */,
+                  vertex_t* edgelist_srcs /* [INOUT] */,
+                  vertex_t* edgelist_dsts /* [INOUT] */,
                   edge_t num_edgelist_edges,
+                  bool store_transposed,
                   bool do_expensive_check)
 {
+  auto edgelist_majors = store_transposed ? edgelist_dsts : edgelist_srcs;
+  auto edgelist_minors = store_transposed ? edgelist_srcs : edgelist_dsts;
+
   if (do_expensive_check) {
     detail::expensive_check_edgelist<vertex_t, edge_t, multi_gpu>(
       handle,
@@ -885,9 +893,7 @@ renumber_edgelist(raft::handle_t const& handle,
       std::nullopt);
   }
 
-  rmm::device_uvector<vertex_t> renumber_map_labels(0, handle.get_stream());
-  std::vector<vertex_t> segment_offsets{};
-  std::tie(renumber_map_labels, segment_offsets) =
+  auto [renumber_map_labels, segment_offsets] =
     detail::compute_renumber_map<vertex_t, edge_t, multi_gpu>(
       handle,
       std::move(vertices),

diff --git a/cpp/src/structure/renumber_edgelist_mg.cu b/cpp/src/structure/renumber_edgelist_mg.cu
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2021, NVIDIA CORPORATION.
+ * Copyright (c) 2021-2022, NVIDIA CORPORATION.
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -23,30 +23,33 @@ template std::tuple<rmm::device_uvector<int32_t>, renumber_meta_t<int32_t, int32
 renumber_edgelist<int32_t, int32_t, true>(
   raft::handle_t const& handle,
   std::optional<rmm::device_uvector<int32_t>>&& local_vertices,
-  std::vector<int32_t*> const& edgelist_majors /* [INOUT] */,
-  std::vector<int32_t*> const& edgelist_minors /* [INOUT] */,
+  std::vector<int32_t*> const& edgelist_srcs /* [INOUT] */,
+  std::vector<int32_t*> const& edgelist_dsts /* [INOUT] */,
   std::vector<int32_t> const& edgelist_edge_counts,
   std::optional<std::vector<std::vector<int32_t>>> const& edgelist_intra_partition_segment_offsets,
+  bool store_transposed,
   bool do_expensive_check);
 
 template std::tuple<rmm::device_uvector<int32_t>, renumber_meta_t<int32_t, int64_t, true>>
 renumber_edgelist<int32_t, int64_t, true>(
   raft::handle_t const& handle,
   std::optional<rmm::device_uvector<int32_t>>&& local_vertices,
-  std::vector<int32_t*> const& edgelist_majors /* [INOUT] */,
-  std::vector<int32_t*> const& edgelist_minors /* [INOUT] */,
+  std::vector<int32_t*> const& edgelist_srcs /* [INOUT] */,
+  std::vector<int32_t*> const& edgelist_dsts /* [INOUT] */,
   std::vector<int64_t> const& edgelist_edge_counts,
   std::optional<std::vector<std::vector<int64_t>>> const& edgelist_intra_partition_segment_offsets,
+  bool store_transposed,
   bool do_expensive_check);
 
 template std::tuple<rmm::device_uvector<int64_t>, renumber_meta_t<int64_t, int64_t, true>>
 renumber_edgelist<int64_t, int64_t, true>(
   raft::handle_t const& handle,
   std::optional<rmm::device_uvector<int64_t>>&& local_vertices,
-  std::vector<int64_t*> const& edgelist_majors /* [INOUT] */,
-  std::vector<int64_t*> const& edgelist_minors /* [INOUT] */,
+  std::vector<int64_t*> const& edgelist_srcs /* [INOUT] */,
+  std::vector<int64_t*> const& edgelist_dsts /* [INOUT] */,
   std::vector<int64_t> const& edgelist_edge_counts,
   std::optional<std::vector<std::vector<int64_t>>> const& edgelist_intra_partition_segment_offsets,
+  bool store_transposed,
   bool do_expensive_check);
 
 }  // namespace cugraph
diff --git a/cpp/src/structure/renumber_edgelist_sg.cu b/cpp/src/structure/renumber_edgelist_sg.cu
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2021, NVIDIA CORPORATION.
+ * Copyright (c) 2021-2022, NVIDIA CORPORATION.
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -22,25 +22,28 @@ namespace cugraph {
 template std::tuple<rmm::device_uvector<int32_t>, renumber_meta_t<int32_t, int32_t, false>>
 renumber_edgelist<int32_t, int32_t, false>(raft::handle_t const& handle,
                                            std::optional<rmm::device_uvector<int32_t>>&& vertices,
-                                           int32_t* edgelist_majors /* [INOUT] */,
-                                           int32_t* edgelist_minors /* [INOUT] */,
+                                           int32_t* edgelist_srcs /* [INOUT] */,
+                                           int32_t* edgelist_dsts /* [INOUT] */,
                                            int32_t num_edgelist_edges,
+                                           bool store_transposed,
                                            bool do_expensive_check);
 
 template std::tuple<rmm::device_uvector<int32_t>, renumber_meta_t<int32_t, int64_t, false>>
 renumber_edgelist<int32_t, int64_t, false>(raft::handle_t const& handle,
                                            std::optional<rmm::device_uvector<int32_t>>&& vertices,
-                                           int32_t* edgelist_majors /* [INOUT] */,
-                                           int32_t* edgelist_minors /* [INOUT] */,
+                                           int32_t* edgelist_srcs /* [INOUT] */,
+                                           int32_t* edgelist_dsts /* [INOUT] */,
                                            int64_t num_edgelist_edges,
+                                           bool store_transposed,
                                            bool do_expensive_check);
 
 template std::tuple<rmm::device_uvector<int64_t>, renumber_meta_t<int64_t, int64_t, false>>
 renumber_edgelist<int64_t, int64_t, false>(raft::handle_t const& handle,
                                            std::optional<rmm::device_uvector<int64_t>>&& vertices,
-                                           int64_t* edgelist_majors /* [INOUT] */,
-                                           int64_t* edgelist_minors /* [INOUT] */,
+                                           int64_t* edgelist_srcs /* [INOUT] */,
+                                           int64_t* edgelist_dsts /* [INOUT] */,
                                            int64_t num_edgelist_edges,
+                                           bool store_transposed,
                                            bool do_expensive_check);
 
 }  // namespace cugraph