Commit
Implement Slice, Head and Tail Operation in both centralize and distr… (#592)

* Implement Slice, Head and Tail Operation in both centralize and distributed environment
  Signed-off-by: Arup Sarker <[email protected]>
* Add Distributed Slice with arrow api
  Signed-off-by: Arup Sarker <[email protected]>
* [Cylon] Implementation of Distributive Slice, Head and Tail
  Signed-off-by: Arup Sarker <[email protected]>
* [Cylon] Revert the changes based on the upstream
  Signed-off-by: Arup Sarker <[email protected]>
* [Cylon] Implement Slice with new Logic
  Signed-off-by: Arup Sarker <[email protected]>
* [Cylon] Implement new logic for slice and add a basic test case
  Signed-off-by: Arup Sarker <[email protected]>
* Update cpp/src/cylon/table.cpp
  Co-authored-by: niranda perera <[email protected]>
* Update cpp/src/cylon/table.cpp
  Co-authored-by: niranda perera <[email protected]>
* Update cpp/src/cylon/table.cpp
  Co-authored-by: niranda perera <[email protected]>
* Update cpp/src/examples/CMakeLists.txt
  Co-authored-by: niranda perera <[email protected]>
* [Cylon] Removed unneccessary data copy and logs
  Signed-off-by: Arup Sarker <[email protected]>
* [Cylon] Fix merge conflict
  Signed-off-by: Arup Sarker <[email protected]>
* [Cylon] Refactoring Slice logic into seperate file
  Signed-off-by: Arup Sarker <[email protected]>
* [Cylon] Update new test cases
  Signed-off-by: Arup Sarker <[email protected]>
* [Cylon] Add multiple test cases for slice operation
  Signed-off-by: Arup Sarker <[email protected]>
* [Cylon] Refactoring slice operation
  Signed-off-by: Arup Sarker <[email protected]>
* [Cylon] Fix error message and un-necessary example files
  Signed-off-by: Arup Sarker <[email protected]>
* [Cylon] Implement tail operation with new logic
  Signed-off-by: Arup Sarker <[email protected]>
* [Cylon] Implement Tail operation by fixing the logic
  Signed-off-by: Arup Sarker <[email protected]>
* Add sliceImple methods and clear unnecessary logs
  Signed-off-by: Arup Sarker <[email protected]>
* [Cylon] Change the output table parameter from address to pointer
  Signed-off-by: Arup Sarker <[email protected]>
* Squashing following commits
  removing std out
  fixing errors
  more logs
  more logs
  adding logs
  attempting to fix macos error
  cosmetic changes
  cosmetic changes
* Minor fixes (#596)
* remove gloo default hostname
* minor change gloo
* adding gloo-mpi test
* adding ucc cyton
* Update setup.py
* Update setup.py
* adding ucc test
* adding multi env test
* cosmetic changes
* adding regular sampling cython
* adding UCC barrier
* adding macos11 tag for CI
* fixing windows error
* trying to fix macos ci
* trying to fix macos issue
* Revert "trying to fix macos issue"
  This reverts commit cda5c2c.
* attempting to fix macos ci
* style-check
* adding gloo timeout
* adding custom mpiexec cmake var
* [Cylon] Handle corner case for slice test
  Signed-off-by: Arup Sarker <[email protected]>
* Create README-summit.md (#602)
* Create README-summit.md
  Detailed description of how to compile cylon on summit
* Update README-summit.md
* Update README-summit.md
* Update README-summit.md (#603)
  add cmake path
* adding custom mpirun params cmake var (#604)
* adding custom mpirun params cmake var
* minor change
* changing merge to support empty tables
  Signed-off-by: Arup Sarker <[email protected]>

Co-authored-by: niranda perera <[email protected]>
Co-authored-by: Gregor von Laszewski <[email protected]>
1 parent d99a6f2, commit 035fd70
Showing 12 changed files with 607 additions and 4 deletions.
@@ -0,0 +1,170 @@
/*
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

#include <memory>
#include <algorithm>
#include <numeric>  // std::accumulate
#include <cassert>  // assert

#include <cylon/table.hpp>
#include <cylon/ctx/arrow_memory_pool_utils.hpp>
#include <cylon/util/macros.hpp>
#include <cylon/util/arrow_utils.hpp>
#include <cylon/scalar.hpp>

namespace cylon {

static constexpr int64_t kZero = 0;

/**
 * Slice a contiguous range of rows out of a table.
 * @param in input table
 * @param offset index of the first row to keep
 * @param length number of rows to keep
 * @param out the new sliced table
 * @return Status of the operation
 */
Status Slice(const std::shared_ptr<Table> &in, int64_t offset, int64_t length,
             std::shared_ptr<Table> *out) {
  const auto &ctx = in->GetContext();
  const auto &in_table = in->get_table();

  std::shared_ptr<arrow::Table> out_table;
  if (!in->Empty()) {
    out_table = in_table->Slice(offset, length);
  } else {
    // nothing to slice: pass the empty table through unchanged
    out_table = in_table;
  }
  return Table::FromArrowTable(ctx, std::move(out_table), *out);
}

/**
 * Slice a row range out of a distributed table. Each rank keeps only the rows of
 * the global range [global_offset, global_offset + global_length) that fall inside
 * its own partition.
 * @param in local partition of the distributed table
 * @param global_offset index of the first row to keep, counted across all ranks
 * @param global_length number of rows to keep, counted across all ranks
 * @param partition_lengths row count of every rank, or nullptr to gather them here
 * @param out the local part of the new sliced table
 * @return Status of the operation
 */
Status distributed_slice_impl(const std::shared_ptr<Table> &in,
                              int64_t global_offset,
                              int64_t global_length,
                              int64_t *partition_lengths,
                              std::shared_ptr<Table> *out) {
  const auto &ctx = in->GetContext();
  // owns the gathered lengths for the lifetime of this call
  std::shared_ptr<cylon::Column> partition_len_col;

  if (partition_lengths == nullptr) {
    // every rank contributes its local row count
    const auto &num_row_scalar = Scalar::Make(arrow::MakeScalar(in->Rows()));
    RETURN_CYLON_STATUS_IF_FAILED(ctx->GetCommunicator()
                                      ->Allgather(num_row_scalar, &partition_len_col));

    partition_lengths =
        const_cast<int64_t *>(std::static_pointer_cast<arrow::Int64Array>(partition_len_col->data())
            ->raw_values());
  }

  int64_t rank = ctx->GetRank();
  // rows held by all ranks before this one
  int64_t prefix_sum = std::accumulate(partition_lengths, partition_lengths + rank, kZero);
  // rows held by all ranks
  int64_t total_length = std::accumulate(partition_lengths + rank,
                                         partition_lengths + ctx->GetWorldSize(),
                                         prefix_sum);
  if (global_offset > total_length) {
    return {Code::Invalid, "global offset exceeds total length of the dist table"};
  }
  // adjust global length if it exceeds total_length
  if (global_offset + global_length > total_length) {
    global_length = total_length - global_offset;
  }

  int64_t this_length = *(partition_lengths + rank);
  assert(this_length == in->Rows());

  // clamp the global range to the rows [prefix_sum, prefix_sum + this_length) of this rank
  int64_t local_offset = std::max(kZero, std::min(global_offset - prefix_sum, this_length));
  int64_t local_length =
      std::min(this_length, std::max(global_offset + global_length - prefix_sum, kZero))
          - local_offset;

  return Slice(in, local_offset, local_length, out);
}

Status DistributedSlice(const std::shared_ptr<Table> &in,
                        int64_t offset,
                        int64_t length,
                        std::shared_ptr<Table> *out) {
  return distributed_slice_impl(in, offset, length, nullptr, out);
}

/**
 * Take the first num_rows rows of a table.
 * @param table input table
 * @param num_rows number of rows to keep, must be >= 0
 * @param output the new table
 * @return Status of the operation
 */
Status Head(const std::shared_ptr<Table> &table, int64_t num_rows,
            std::shared_ptr<Table> *output) {
  if (num_rows >= 0) {
    return Slice(table, 0, num_rows, output);
  } else {
    return {Code::Invalid, "Number of head rows should be >=0"};
  }
}

Status DistributedHead(const std::shared_ptr<Table> &table, int64_t num_rows,
                       std::shared_ptr<Table> *output) {
  if (num_rows >= 0) {
    return distributed_slice_impl(table, 0, num_rows, nullptr, output);
  } else {
    return {Code::Invalid, "Number of head rows should be >=0"};
  }
}

/**
 * Take the last num_rows rows of a table.
 * @param table input table
 * @param num_rows number of rows to keep, must be >= 0
 * @param output the new table
 * @return Status of the operation
 */
Status Tail(const std::shared_ptr<Table> &table, int64_t num_rows,
            std::shared_ptr<Table> *output) {

  std::shared_ptr<arrow::Table> in_table = table->get_table();
  const int64_t table_size = in_table->num_rows();

  if (num_rows >= 0) {
    // clamp so the offset cannot go negative when num_rows > table_size
    const int64_t tail_rows = std::min(num_rows, table_size);
    return Slice(table, table_size - tail_rows, tail_rows, output);
  } else {
    return {Code::Invalid, "Number of tailed rows should be >=0"};
  }
}

Status DistributedTail(const std::shared_ptr<Table> &table, int64_t num_rows,
                       std::shared_ptr<Table> *output) {
  if (num_rows >= 0) {
    const auto &ctx = table->GetContext();
    // gather the row count of every rank once and reuse it in the slice below
    std::shared_ptr<cylon::Column> partition_len_col;
    const auto &num_row_scalar = Scalar::Make(arrow::MakeScalar(table->Rows()));
    RETURN_CYLON_STATUS_IF_FAILED(ctx->GetCommunicator()
                                      ->Allgather(num_row_scalar, &partition_len_col));
    assert(ctx->GetWorldSize() == partition_len_col->length());
    auto *partition_lengths =
        std::static_pointer_cast<arrow::Int64Array>(partition_len_col->data())
            ->raw_values();

    int64_t dist_length =
        std::accumulate(partition_lengths, partition_lengths + ctx->GetWorldSize(), kZero);

    // the tail is the global range that ends at the last row of the last rank
    return distributed_slice_impl(table,
                                  dist_length - num_rows,
                                  num_rows,
                                  const_cast<int64_t *>(partition_lengths),
                                  output);
  } else {
    return {Code::Invalid, "Number of tailed rows should be >=0"};
  }
}

}  // namespace cylon
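The offset arithmetic in `distributed_slice_impl` can be tried in isolation, without Cylon, Arrow or a communicator. The following is a minimal sketch, not code from this commit: `LocalSliceRange` is a hypothetical helper name, the partition lengths are passed in as a plain `std::vector` instead of being gathered with `Allgather`, and it assumes `global_offset` does not exceed the total row count (the case the commit rejects with `Code::Invalid`).

```cpp
#include <algorithm>
#include <cstdint>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper (not part of Cylon): maps the global row range
// [global_offset, global_offset + global_length) onto the partition owned by
// `rank` and returns {local_offset, local_length} for that partition.
// Assumes global_offset <= sum of all partition lengths.
std::pair<int64_t, int64_t> LocalSliceRange(const std::vector<int64_t> &partition_lengths,
                                            int rank,
                                            int64_t global_offset,
                                            int64_t global_length) {
  const int64_t zero = 0;
  // rows held by all ranks before this one
  const int64_t prefix_sum =
      std::accumulate(partition_lengths.begin(), partition_lengths.begin() + rank, zero);
  // rows held by all ranks
  const int64_t total_length =
      std::accumulate(partition_lengths.begin(), partition_lengths.end(), zero);

  // shrink the requested length if the range runs past the last row
  global_length = std::min(global_length, total_length - global_offset);

  const int64_t this_length = partition_lengths[rank];
  // first local row inside the range, clamped to [0, this_length]
  const int64_t local_offset =
      std::max(zero, std::min(global_offset - prefix_sum, this_length));
  // one past the last local row inside the range, clamped to [0, this_length]
  const int64_t local_end =
      std::min(this_length, std::max(global_offset + global_length - prefix_sum, zero));
  return {local_offset, local_end - local_offset};
}
```

With partitions of 4, 3 and 5 rows, a global slice at offset 2 with length 7 maps to (offset 2, length 2) on rank 0, (offset 0, length 3) on rank 1 and (offset 0, length 2) on rank 2, so the three local slices together hold the 7 requested rows.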
@@ -12,6 +12,7 @@
 * limitations under the License.
 */

#include <cstddef>
#include "utils.hpp"