Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decouple the code in the fragment directory from StorageManager. #4453

Merged
merged 13 commits into from
Oct 26, 2023

Conversation

teo-tsirpanis
Copy link
Member

SC-23374

The StorageManager methods that load fragment info and metadata were moved to the FragmentInfo and FragmentMetadata classes. These classes also use ContextResources instead of StorageManager.

I was going to make a fragment object library but it also depends on rest which is not yet componentized.


TYPE: IMPROVEMENT
DESC: Refactor the fragment info loading code to reduce coupling.

@shortcut-integration
Copy link

This pull request has been linked to Shortcut Story #23374: Move StorageManager::load_fragments to Array.

We cannot capture structured binding variables in lambdas.
Copy link
Contributor

@davisp davisp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my reading it seems like you've made the following changes:

StorageManager Method Destination
load_array_schemas_and_fragment_metadata FragmentInfo::load_array_schemas_and_fragment_metadata
array_load_fragments Inlined into Array::load_fragments
load_fragment_metadata FragmentMetadata::load
load_consolidated_fragment_meta Free function in fragment_info.cc
load_filtered_fragment_uris Inlined into FragmenInfo::load_array_schemas_and_fragment_metadata

The only thing I'm questioning there is whether load_array_schemas_and_fragment_metadata belongs in FragmentInfo or ArrayDirectory. It seems a bit awkward dropping it into FragmentInfo but it also only feels marginally better if we were to drop that into ArrayDirectory instead.

Maybe @KiterLuc can point us where he think that'd fit better?

Personally I think having all storage access flow through array directory would be best, but looking around that seems like it'll be a long term project once we've finished removing StorageManager.

Beyond that, the only thing that was a definite change was to use resources_ as the member variable name consistently.

All of the other work appears to be the obvious updates that I would expect so other than these two minor questions/changes I'll be ready to +1

tiledb/sm/serialization/query.cc Show resolved Hide resolved
tiledb/sm/fragment/fragment_metadata.h Outdated Show resolved Hide resolved
@teo-tsirpanis
Copy link
Member Author

It seems a bit awkward dropping it into FragmentInfo but it also only feels marginally better if we were to drop that into ArrayDirectory instead.

Hmm, at present ArrayDirectory does not depend on any fragment code. I'm a bit split as well on which is the better place for the method. 🤔

@teo-tsirpanis teo-tsirpanis requested a review from davisp October 26, 2023 12:30
Copy link
Contributor

@davisp davisp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a fair point on the ArrayDirectory. I'm fine leaving it in FragmentInfo until we figure out how to rework this stuff later.

resources.stats().start_timer("sm_read_load_consolidated_frag_meta");

// No consolidated fragment metadata file
if (uri.to_string().empty())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can change this to throw and make the return values not optional. The only place it is used dereference the returned tile, which will throw anyways.

std::vector<shared_ptr<Tile>> fragment_metadata_tiles(meta_uris.size());
std::vector<std::vector<std::pair<std::string, uint64_t>>> offsets_vectors(
meta_uris.size());
auto status =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move throw_if_not_ok on this line and remove the status.

* @param memory_tracker The memory tracker of the array
* for which the fragment metadata is loaded.
* @param enc_key The encryption key to use.
* @return tuple of Status, latest ArraySchema, map of all array schemas and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove status.

@KiterLuc KiterLuc merged commit e96ea29 into TileDB-Inc:dev Oct 26, 2023
@teo-tsirpanis teo-tsirpanis deleted the fragment-loading-sm branch October 26, 2023 17:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants