Tpetra: use created subviews of communication buffers #6598

Merged: 2 commits, Jan 17, 2020
16 changes: 12 additions & 4 deletions packages/tpetra/core/src/Tpetra_MultiVector_def.hpp
@@ -3395,15 +3395,23 @@ namespace Tpetra {
  // be exactly what we need, so we won't have to resize them later.
  {
    const size_t newSize = X.imports_.extent (0) / numCols;
+   const size_t offset = j*newSize;
Contributor:
j isn't necessarily the correct offset. We should actually use jj (see above) here, in case *this* views a noncontiguous set of columns of another MultiVector.

Actually, if Y views a contiguous proper subset of columns of X, and Z views a noncontiguous proper subset of columns of Y, then jj is not even the right entry of imports_. We can fix that in the "view contiguous subset of columns" constructors and methods by adjusting the offset of imports_ and exports_ accordingly.
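
To make that nesting concrete, here is a minimal sketch of the case being described (the column choices are made up for illustration; map is as in the example further below):

using MV = Tpetra::MultiVector<>;
MV X (map, 6);
Teuchos::Range1D Y_cols (2, 5);          // Y views columns 2..5 of X (contiguous)
RCP<MV> Y = X.subViewNonConst (Y_cols);
Teuchos::Array<size_t> Z_cols {{1, 3}};  // Z views columns 1 and 3 of Y (noncontiguous)
RCP<MV> Z = Y->subViewNonConst (Z_cols ());
// Column j = 1 of Z is column jj = 3 of Y, which is column 5 of X.  Since
// the shared imports_ allocation belongs to X, entry jj*newSize is the right
// offset into Y's slice but not into X's buffer; Y's and Z's offsets would
// have to compose.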

Contributor Author:

So I should switch j to jj? Note that previously, the subviews always took the first newSize entries of the buffer. If the buffers are strictly used for communication, that might be fine, but I thought indexing into them would be better.

Contributor:

@kddevin wrote:

If the buffers are strictly used for communication, ....

That's correct. They are used for nothing other than communication of packed data.

So I should switch j to jj?

Yes. I actually studied the example in my comments above, and I think this is a complete fix, given how the MultiVector::subView* methods work.

If we switch j to jj, that will fix the case where different single Vectors view distinct columns of one MultiVector. In this case, your change will mean that different Vectors will view different parts of the buffer. Thus, you'll be able to do concurrent asynchronous doExport or doImport correctly with the Vectors.
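
As a minimal sketch of that single-Vector case (assuming, as in this patch's constructor path, that a single-column view shares its parent's communication buffers):

using MV = Tpetra::MultiVector<>;
MV X (map, 2);                        // X.imports_ holds 2*newSize packed entries
auto X0 = X.getVectorNonConst (0);    // views column 0 of X
auto X1 = X.getVectorNonConst (1);    // views column 1 of X
// With offset = j*newSize, X0's imports_ views entries [0, newSize) of
// X.imports_ and X1's views [newSize, 2*newSize), so concurrent doImport
// calls on X0 and X1 would not overwrite each other's packets.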

Consider the following, more complicated example:

using MV = Tpetra::MultiVector<>;
MV X (map, 10);
Teuchos::Range1D Y_colRange (1, 5); // inclusive: 1, 2, 3, 4, 5
RCP<MV> Y = X.subViewNonConst (Y_colRange);

Teuchos::Array<size_t> whichVecsZ {{0, 2, 4}};
Teuchos::Array<size_t> whichVecsW {{0, 2, 4}};
RCP<MV> Z = X.subViewNonConst (whichVecsZ ());
RCP<MV> W = Y->subViewNonConst (whichVecsW ()); // Y is an RCP, so use ->

Y views columns 1, 2, 3, 4, and 5 of X. Z views columns 0, 2, and 4 of X. W views columns 0, 2, and 4 of Y -- thus, columns 1, 3, and 5 of X. Users would expect it to be safe to do concurrent communication with Z and W.

In this case, they are right, because neither overload of subViewNonConst reuses its parent MultiVector's buffers. Thus, Z and W would each start with zero-length imports_ and exports_ buffers, and would each reallocate them on demand.

Thus, this case is already correct, and your change doesn't break it. If we decide to make subView* reuse communication buffers in the future, we only need to use the correct offset. That will make the above example continue to be correct.
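
For a contiguous parent view, the bookkeeping such a future change would need is a single multiplication; here is a hypothetical helper (not in Tpetra) just to show the arithmetic:

// Offset into the buffer of the MultiVector that owns the allocation, for
// column whichVec of a view whose columns start at parentStartCol of the
// owner.  newSize is the number of packed entries per column.
size_t bufferOffset (const size_t parentStartCol,
                     const size_t whichVec,
                     const size_t newSize)
{
  return (parentStartCol + whichVec) * newSize;
}
// For W above: parentStartCol = 1 (Y starts at column 1 of X), so column k
// of W would use the slice starting at (1 + whichVecsW[k]) * newSize.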

Contributor:

The only way we could write a reliable test for this use case, though, would be if we had asynchronous doExport / doImport. We don't have that right now.

    auto newImports = X.imports_;
-   newImports.d_view = subview (X.imports_.d_view, range_type (0, newSize));
-   newImports.h_view = subview (X.imports_.h_view, range_type (0, newSize));
+   newImports.d_view = subview (X.imports_.d_view,
+                                range_type (offset, offset+newSize));
+   newImports.h_view = subview (X.imports_.h_view,
+                                range_type (offset, offset+newSize));
    this->imports_ = newImports;
  }
  {
    const size_t newSize = X.exports_.extent (0) / numCols;
+   const size_t offset = j*newSize;
    auto newExports = X.exports_;
-   newExports.d_view = subview (X.exports_.d_view, range_type (0, newSize));
-   newExports.h_view = subview (X.exports_.h_view, range_type (0, newSize));
+   newExports.d_view = subview (X.exports_.d_view,
+                                range_type (offset, offset+newSize));
+   newExports.h_view = subview (X.exports_.h_view,
+                                range_type (offset, offset+newSize));
    this->exports_ = newExports;
  }
  // These two DualViews already either have the right number of
  // entries, or zero entries. This means that we don't need to
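
For readers unfamiliar with the DualView idiom in this hunk, here is a self-contained sketch of the same subview pattern in plain Kokkos (the element type and the buffer layout are assumptions for illustration, not Tpetra's actual packet type):

#include <Kokkos_Core.hpp>
#include <Kokkos_DualView.hpp>

using dual_view_type = Kokkos::DualView<double*>;
using range_type = Kokkos::pair<size_t, size_t>;

// Return a DualView whose device and host views both alias column j's
// slice [j*newSize, (j+1)*newSize) of the parent's allocations.
dual_view_type makeColumnBuffer (const dual_view_type& parent,
                                 const size_t j,
                                 const size_t newSize)
{
  const size_t offset = j * newSize;
  dual_view_type result = parent;  // shallow copy; shares parent's memory
  result.d_view = Kokkos::subview (parent.d_view,
                                   range_type (offset, offset + newSize));
  result.h_view = Kokkos::subview (parent.h_view,
                                   range_type (offset, offset + newSize));
  return result;
}

Copying the parent DualView first and then rebinding d_view and h_view mirrors what the patch does: the copy carries the parent's modified flags, so the subview's sync state starts out consistent with the parent's.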