Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement iterator facade for external sort. #4914

Merged
merged 1 commit into from
Apr 26, 2024

Conversation

lums658
Copy link
Contributor

@lums658 lums658 commented Apr 26, 2024

The iterator_facade class (in tiledb/common/iterator_facade.h) implements, well, an iterator facade. An iterator must provide a fairly rich set of functions to be fully conformant to the C++20 ranges iterator concepts (and to conform to the LegacyIterator named requirements). Providing that functionality involves an enormous amount of boilerplate and can have subtle issues (e.g., with const, with the required type aliases, etc). The iterator_facade class generates all of the boilerplate, based on just a handful of functions that the user has to provide. Use of the iterator_facade class is based on CRTP. A simple iterator only needs to provide three functions in the simplest case to be able to meet all of the requirements of an iterator. For example, we can define an iterator that wraps up a raw pointer and strides through its data by 10 elements at a time.

template <class Value>a
class pointer_wrapper
    : public iterator_facade<pointer_wrapper<Value>> {
  Value *current_ = nullptr;

public:
  pointer_wrapper() = default;
  explicit pointer_wrapper(Value* i)
      : current_(i) {}

  Value &dereference() const noexcept { return *current_; }

  void advance(int off) noexcept { current_ += 10 * off; }

  auto distance_to(pointer_wrapper o) const noexcept {
    return o.current_ - current_;
  }
};

To do the port, I removed all dependencies on other parts of the neo-fun library, primarily by substituting standard library functions, type traits, and concepts for those in the neo-fun library. I also added some functionality to support assignment to an iterator.

I also ported the unit testing from the neo-fun repository. I added a significant number of new tests, primarily to verify that an iterator based on iterator_facade would conform to the expected range and iterator concepts.

Note that there are a few tests that return slightly unexpected results (in that they differ from , e.g., what should be an equivalent iterator from std::vector). Those should have no impact on our immediate intended usage of iterator_facade, but at some point, we should address those issues

  • The iterator_category of some const variants does not appear to be correct
  • operator[] seems to return nonsense in one usage mode in unit_iterator_facade.cc

The implementation of iterator_facade is ported from the implementation at vector-of-bool/neo-fun@b6c38c8, which I cloned on Mar 21, 2024 to use as the beginning of the port. The downloaded code is licensed under the Boost Software License 1.0 (which essentially the same as the MIT license). This iterator_facade is a modernized approach to Boost.Iterator and is greatly simplified via the use of concepts and other modern C++ features (Boost.Iterator dates from the early 2000s). (One of the authors of Boost.Graph is Jeremy Siek, a former PhD student in my group at Indiana University.)

[sc-43633]


TYPE: IMPROVEMENT
DESC: Implement iterator facade for external sort.

@KiterLuc KiterLuc force-pushed the al/iterator-facade/ch43633 branch 3 times, most recently from 48a1dfa to 97c6510 Compare April 26, 2024 17:27
The `iterator_facade` class (in tiledb/common/iterator_facade.h) implements, well, an iterator facade.  An iterator must provide a fairly rich set of functions to be fully conformant to the C++20 ranges iterator concepts (and to conform to the `LegacyIterator` named requirements).  Providing that functionality involves an enormous amount of boilerplate and can have subtle issues (e.g., with `const`, with the required type aliases, etc).  The `iterator_facade` class generates all of the boilerplate, based on just a handful of functions that the user has to provide.  Use of the `iterator_facade` class is based on CRTP.  A simple iterator only needs to provide three functions in the simplest case to be able to meet all of the requirements of an iterator.  For example, we can define an iterator that wraps up a raw pointer and strides through its data by 10 elements at a time.
```
template <class Value>a
class pointer_wrapper
    : public iterator_facade<pointer_wrapper<Value>> {
  Value *current_ = nullptr;

public:
  pointer_wrapper() = default;
  explicit pointer_wrapper(Value* i)
      : current_(i) {}

  Value &dereference() const noexcept { return *current_; }

  void advance(int off) noexcept { current_ += 10 * off; }

  auto distance_to(pointer_wrapper o) const noexcept {
    return o.current_ - current_;
  }
};
```

To do the port, I removed all dependencies on other parts of the `neo-fun` library, primarily by substituting standard library functions, type traits, and concepts for those in the `neo-fun` library.  I also added some functionality to support assignment to an iterator.

I also ported the unit testing from the `neo-fun` repository.  I added a significant number of new tests, primarily to verify that an iterator based on `iterator_facade` would conform to the expected range and iterator concepts.

Note that there are a few tests that return slightly unexpected results (in that they differ from , e.g., what should be an equivalent iterator from `std::vector`).  Those should have no impact on our immediate intended usage of `iterator_facade`, but at some point, we should address those issues
* The `iterator_category` of some const variants does not appear to be correct
* `operator[]` seems to return nonsense in one usage mode in unit_iterator_facade.cc

The implementation of `iterator_facade` is ported from the implementation at vector-of-bool/neo-fun@b6c38c8, which I cloned on Mar 21, 2024 to use as the beginning of the port.  The downloaded code is licensed under the Boost Software License 1.0 (which essentially the same as the MIT license).  This `iterator_facade` is a modernized approach to Boost.Iterator and is greatly simplified via the use of concepts and other modern C++ features (Boost.Iterator dates from the early 2000s).  (One of the authors of Boost.Graph is Jeremy Siek, a former PhD student in my group at Indiana University.)

---
TYPE: IMPROVEMENT
DESC: Implement iterator facade for external sort.
@KiterLuc KiterLuc force-pushed the al/iterator-facade/ch43633 branch from 97c6510 to f534b63 Compare April 26, 2024 17:35
@KiterLuc KiterLuc merged commit c46af50 into dev Apr 26, 2024
58 checks passed
@KiterLuc KiterLuc deleted the al/iterator-facade/ch43633 branch April 26, 2024 18:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants