Skip to content

Commit

Permalink
Implement iterator facade for external sort. (#4914)
Browse files Browse the repository at this point in the history
The `iterator_facade` class (in tiledb/common/iterator_facade.h)
implements, well, an iterator facade. An iterator must provide a fairly
rich set of functions to be fully conformant to the C++20 ranges
iterator concepts (and to conform to the `LegacyIterator` named
requirements). Providing that functionality involves an enormous amount
of boilerplate and can have subtle issues (e.g., with `const`, with the
required type aliases, etc). The `iterator_facade` class generates all
of the boilerplate, based on just a handful of functions that the user
has to provide. Use of the `iterator_facade` class is based on CRTP. A
simple iterator only needs to provide three functions in the simplest
case to be able to meet all of the requirements of an iterator. For
example, we can define an iterator that wraps up a raw pointer and
strides through its data by 10 elements at a time.
```
template <class Value>a
class pointer_wrapper
    : public iterator_facade<pointer_wrapper<Value>> {
  Value *current_ = nullptr;

public:
  pointer_wrapper() = default;
  explicit pointer_wrapper(Value* i)
      : current_(i) {}

  Value &dereference() const noexcept { return *current_; }

  void advance(int off) noexcept { current_ += 10 * off; }

  auto distance_to(pointer_wrapper o) const noexcept {
    return o.current_ - current_;
  }
};
```

To do the port, I removed all dependencies on other parts of the
`neo-fun` library, primarily by substituting standard library functions,
type traits, and concepts for those in the `neo-fun` library. I also
added some functionality to support assignment to an iterator.

I also ported the unit testing from the `neo-fun` repository. I added a
significant number of new tests, primarily to verify that an iterator
based on `iterator_facade` would conform to the expected range and
iterator concepts.

Note that there are a few tests that return slightly unexpected results
(in that they differ from , e.g., what should be an equivalent iterator
from `std::vector`). Those should have no impact on our immediate
intended usage of `iterator_facade`, but at some point, we should
address those issues
* The `iterator_category` of some const variants does not appear to be
correct
* `operator[]` seems to return nonsense in one usage mode in
unit_iterator_facade.cc

The implementation of `iterator_facade` is ported from the
implementation at
vector-of-bool/neo-fun@b6c38c8, which I cloned
on Mar 21, 2024 to use as the beginning of the port. The downloaded code
is licensed under the Boost Software License 1.0 (which essentially the
same as the MIT license). This `iterator_facade` is a modernized
approach to Boost.Iterator and is greatly simplified via the use of
concepts and other modern C++ features (Boost.Iterator dates from the
early 2000s). (One of the authors of Boost.Graph is Jeremy Siek, a
former PhD student in my group at Indiana University.)

---
TYPE: IMPROVEMENT
DESC: Implement iterator facade for external sort.
  • Loading branch information
lums658 authored Apr 26, 2024
1 parent 6302ab0 commit c46af50
Show file tree
Hide file tree
Showing 6 changed files with 1,532 additions and 3 deletions.
2 changes: 1 addition & 1 deletion experimental/tiledb/common/dag/utility/range_join.h
Original file line number Diff line number Diff line change
Expand Up @@ -73,7 +73,7 @@
#include <type_traits>
#include <utility>
#include <vector>
#include "arrow_proxy.hpp"
#include "tiledb/common/arrow_proxy.hpp"

#include "external/include/span/span.hpp"

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -26,12 +26,12 @@
* THE SOFTWARE.
*/

#include "../arrow_proxy.hpp"
#include "../bounded_buffer.h"
#include "../print_types.h"
#include "../range_join.h"
#include "../spinlock.h"
#include "../traits.h"
#include "tiledb/common/arrow_proxy.hpp"

int main() {
}
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@
*
* The MIT License
*
* @copyright Copyright (c) 2022 TileDB, Inc.
* @copyright Copyright (c) 2022-2024 TileDB, Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
Expand Down Expand Up @@ -52,4 +52,7 @@ struct arrow_proxy {
}
};

template <typename T>
arrow_proxy(T&&) -> arrow_proxy<T>;

#endif // TILEDB_ARROW_PROXY_HPP
Loading

0 comments on commit c46af50

Please sign in to comment.