forked from TileDB-Inc/TileDB
Ihn/test mingw #1
Closed
Conversation

No description provided.
ihnorton force-pushed the ihn/test_mingw branch from 8224ba2 to 921301b on August 19, 2020 03:10
ihnorton pushed a commit that referenced this pull request on Sep 2, 2020:
This fixes the following within StorageManager::load_array_metadata():

1. A direct memory leak of a `Buffer` object introduced with the chunked buffers patch.
2. A long-standing indirect memory leak of the metadata buffers.

For #1, an auxiliary `Buffer` object is created to store the contents of a read from a chunked buffer. The `Buffer` object is never deleted.

For #2, the raw buffers inside of the `Buffer` object are passed into a `ConstBuffer`. However, the `ConstBuffer` does not take ownership of them or free them. Because of leak #1, the buffers are never freed while the `ConstBuffer` references them.

For this patch, I've replaced the `ConstBuffer` objects with `Buffer` objects so that the `OpenArray` instance can take ownership of the metadata buffers and free them appropriately.

Note that before the chunked buffer patch, we had a similar pattern where the `Tile` buffer was disowned but never freed by the `ConstBuffer`. This may be affecting older core builds.
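For illustration, here is a minimal, self-contained sketch of the ownership pattern at issue. The `Buffer` and `ConstBuffer` classes below are simplified stand-ins, not TileDB's actual implementations, kept just detailed enough to show why handing the bytes to an owning `Buffer` fixes the leak:

```
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <iostream>
#include <memory>

// Simplified stand-in for an owning buffer: frees its allocation on destruction.
class Buffer {
 public:
  explicit Buffer(size_t size)
      : data_(static_cast<uint8_t*>(std::malloc(size))), size_(size) {}
  ~Buffer() { std::free(data_); }
  Buffer(const Buffer&) = delete;
  Buffer& operator=(const Buffer&) = delete;
  uint8_t* data() const { return data_; }
  size_t size() const { return size_; }

 private:
  uint8_t* data_;
  size_t size_;
};

// Simplified stand-in for a non-owning view: it never frees `data`, which is
// why handing it disowned bytes (leak #2) loses them forever.
class ConstBuffer {
 public:
  ConstBuffer(const void* data, size_t size) : data_(data), size_(size) {}
  const void* data() const { return data_; }
  size_t size() const { return size_; }

 private:
  const void* data_;
  size_t size_;
};

int main() {
  // Leaky pattern (before the fix), shown inert in comments:
  //   auto* aux = new Buffer(64);          // leak #1: never deleted
  //   ConstBuffer view(aux->data(), 64);   // leak #2: the view never frees
  //
  // Fixed pattern: an owning Buffer whose ownership the consumer
  // (OpenArray in the real code) can take over and release appropriately.
  auto metadata = std::make_unique<Buffer>(64);
  std::memset(metadata->data(), 0, metadata->size());
  std::cout << "owned metadata bytes: " << metadata->size() << "\n";
  return 0;  // ~Buffer frees the allocation here
}
```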
ihnorton pushed a commit that referenced this pull request on Sep 21, 2020:
Currently, reading/unfiltering/copying tiles are performed serially among attribute names within Reader::copy_attribute_values():

```
for (const auto& name : names) {
  read_tiles(name, result_tiles);
  unfilter_tiles(name, result_tiles, &cs_ranges);
  if (!array_schema_->var_size(name))
    copy_fixed_cells(name, stride, result_cell_slabs, &fixed_ctx_cache);
  else
    copy_var_cells(name, stride, result_cell_slabs, &var_ctx_cache);
  clear_tiles(name, result_tiles);
}
```

Parallelizing the entire for-loop has a few problems:

1. Each iteration requires a large amount of temporary memory to buffer reads into before they are copied into `buffers_` and cleared within `clear_tiles()`. This memory cannot be shared among parallel iterations, so memory bloat scales linearly with the number of parallel iterations.
2. `read_tiles`, `unfilter_tiles`, `copy_fixed_cells`, `copy_var_cells`, and `clear_tiles` are thread-unsafe because of unprotected access to the individual `ResultTile` elements within `result_tiles`.
3. `copy_fixed_cells` and `copy_var_cells` are thread-unsafe because the `fixed_ctx_cache` and `var_ctx_cache` are thread-unsafe.

Problem #1: This patch restricts the number of parallel iterations to 2. This is a hard-coded value that I will justify later in this commit message.

Problem #2: We could achieve thread-safety with a mutex for each `ResultTile` element. However, even if we tolerate the memory overhead of allocating a mutex per `ResultTile`, the CPU time spent locking at this granularity is unacceptably high within the `copy_fixed_cells` and `copy_var_cells` paths (it is experimentally negligible for the other paths). This is enough of a penalty that I have decided to use a single lock to protect all elements, which effectively serializes all routines except for the IO tasks within `read_tiles`.

Problem #3: I have made the context caches thread-safe, although they are still accessed serially. This is mostly because I had to write this code before I was able to test that parallelizing `copy_*_cells` was too expensive with per-`ResultTile` mutexes. I have left it in this patch because it is zero-harm and required for future improvements.

I have been testing this in an IO-bound workload (using an S3 backend). Due to the large variance in S3 request times, I ran my test many times and collected the best-case numbers.

Execution times:
- Without this patch: 4.5s
- With this patch, 1 parallel iteration: 4.5s
- With this patch, 2 parallel iterations: 3.4s
- With this patch, 3 parallel iterations: 3.4s
- With this patch, 4 parallel iterations: 3.2s
- With this patch, 6 parallel iterations: 3.2s
- With this patch, 8 parallel iterations: 3.3s
- With this patch, 12 parallel iterations: 3.7s

Given the above, it does not seem that the number of parallel iterations should be configurable: beyond 2 parallel iterations, memory bloat scales linearly with little to gain in execution time, while with just 2 parallel iterations we receive a 32% speed-up.

Future Improvements:
- Modify `ThreadPool::execute()` to allow limiting the number of parallel iterations for an arbitrary number of tasks. This patch uses a step-wise execution of parallel reads: both parallel iterations must complete before the next pair of `read_tiles()` calls can execute (the sketch below illustrates this step-wise pattern). Ideally, the 3rd iteration should start immediately after one of the first two completes.
- Increase the granularity of protection on `ResultTile`s. I have a couple of ideas, but they would be large and out of scope for this patch.
- Revisit the read path for dimensions. Right now, there is no limit to the number of dimensions read at one time. If we have a large number of dimensions, we will have a memory bottleneck.

Co-authored-by: Joe Maley <[email protected]>
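A minimal, self-contained sketch of the step-wise batching described above, using `std::async` as a stand-in for TileDB's `ThreadPool` and a hypothetical `process_attribute` in place of the real read/unfilter/copy/clear pipeline:

```
#include <algorithm>
#include <future>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical stand-in for the per-attribute pipeline
// (read_tiles / unfilter_tiles / copy_*_cells / clear_tiles).
void process_attribute(const std::string& name) {
  std::cout << "processing " << name << "\n";
}

int main() {
  const std::vector<std::string> names = {"a1", "a2", "a3", "a4", "a5"};
  const size_t max_parallel = 2;  // hard-coded limit, as in the patch

  // Step-wise execution: launch up to `max_parallel` iterations, wait for
  // the whole batch, then start the next. As the commit notes, a later
  // iteration cannot begin until every task in the current batch finishes.
  for (size_t i = 0; i < names.size(); i += max_parallel) {
    std::vector<std::future<void>> batch;
    const size_t end = std::min(i + max_parallel, names.size());
    for (size_t j = i; j < end; ++j)
      batch.emplace_back(
          std::async(std::launch::async, process_attribute, names[j]));
    for (auto& f : batch)
      f.get();  // barrier between batches
  }
  return 0;
}
```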
ihnorton pushed a commit that referenced this pull request on Dec 11, 2020:
The offsets in each buffer correspond to the values in its data buffer. To build a single contiguous buffer, we must ensure the offset values continue in ascending order from the previous buffer. For example, consider the following buffers storing int32 values:

Buffer #1:
  offsets: [0, 8, 16]
  values:  [1, 2, 3, 4, 5, 6, 7]

Buffer #2:
  offsets: [0, 12]
  values:  [100, 200, 300, 400, 500]

The final, contiguous buffer will be:
  offsets: [0, 8, 16, 28, 40]
  values:  [1, 2, 3, 4, 5, 6, 7, 100, 200, 300, 400, 500]

The last two offsets, `28, 40`, were calculated by adding `28` to the offsets of Buffer #2: [0, 12]. The `28` is the byte size of the values in Buffer #1 (7 int32 values * 4 bytes).

Co-authored-by: Joe Maley <[email protected]>
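The offset arithmetic is easy to verify in a few lines; this sketch (plain `std::vector`s, not TileDB's buffer classes) reproduces the merged offsets from the example:

```
#include <cstdint>
#include <iostream>
#include <vector>

int main() {
  // Buffers from the example above (int32 values, 4 bytes each).
  const std::vector<uint64_t> offsets1 = {0, 8, 16};
  const std::vector<int32_t> values1 = {1, 2, 3, 4, 5, 6, 7};
  const std::vector<uint64_t> offsets2 = {0, 12};

  // Shift Buffer #2's offsets by the byte size of Buffer #1's values
  // (7 values * 4 bytes = 28), then append them.
  const uint64_t shift = values1.size() * sizeof(int32_t);
  std::vector<uint64_t> merged = offsets1;
  for (uint64_t o : offsets2)
    merged.push_back(o + shift);

  for (uint64_t o : merged)
    std::cout << o << " ";  // prints: 0 8 16 28 40
  std::cout << "\n";
  return 0;
}
```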
ihnorton pushed a commit that referenced this pull request on Jun 9, 2021:
Fixes TileDB-Inc#2201

This patch introduces a global lock on the `S3Client` constructor to prevent the following race reported by TSAN:

```
WARNING: ThreadSanitizer: data race (pid=12)
  Write of size 8 at 0x7fe93658a1b0 by thread T16 (mutexes: write M144250874682678808, write M150443169551677688):
    #0 cJSON_ParseWithOpts external/aws/aws-cpp-sdk-core/source/external/cjson/cJSON.cpp:1006 (libexternal_Saws_Slibaws.so+0x299bef)
    #1 Aws::Utils::Json::JsonValue::JsonValue(std::__cxx11::basic_string<char, std::char_traits<char>, Aws::Allocator<char> > const&) external/aws/aws-cpp-sdk-core/source/utils/json/JsonSerializer.cpp:39 (libexternal_Saws_Slibaws.so+0x32d09c)
    #2 Aws::Config::EC2InstanceProfileConfigLoader::LoadInternal() external/aws/aws-cpp-sdk-core/source/config/AWSProfileConfigLoader.cpp:341 (libexternal_Saws_Slibaws.so+0x286d88)
    #3 Aws::Config::AWSProfileConfigLoader::Load() external/aws/aws-cpp-sdk-core/source/config/AWSProfileConfigLoader.cpp:47 (libexternal_Saws_Slibaws.so+0x282b4f)
    #4 Aws::Auth::InstanceProfileCredentialsProvider::Reload() external/aws/aws-cpp-sdk-core/source/auth/AWSCredentialsProvider.cpp:265 (libexternal_Saws_Slibaws.so+0x235987)
    #5 Aws::Auth::InstanceProfileCredentialsProvider::RefreshIfExpired() external/aws/aws-cpp-sdk-core/source/auth/AWSCredentialsProvider.cpp:283 (libexternal_Saws_Slibaws.so+0x23613b)
    #6 Aws::Auth::InstanceProfileCredentialsProvider::GetAWSCredentials() external/aws/aws-cpp-sdk-core/source/auth/AWSCredentialsProvider.cpp:250 (libexternal_Saws_Slibaws.so+0x23be9a)
    #7 Aws::Auth::AWSCredentialsProviderChain::GetAWSCredentials() external/aws/aws-cpp-sdk-core/source/auth/AWSCredentialsProviderChain.cpp:35 (libexternal_Saws_Slibaws.so+0x244550)
    #8 Aws::Client::AWSAuthV4Signer::AWSAuthV4Signer(std::shared_ptr<Aws::Auth::AWSCredentialsProvider> const&, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, Aws::Allocator<char> > const&, Aws::Client::AWSAuthV4Signer::PayloadSigningPolicy, bool) external/aws/aws-cpp-sdk-core/source/auth/AWSAuthSigner.cpp:170 (libexternal_Saws_Slibaws.so+0x2262ed)
    #9 std::shared_ptr<Aws::Client::AWSAuthV4Signer> Aws::MakeShared<Aws::Client::AWSAuthV4Signer, std::shared_ptr<Aws::Auth::DefaultAWSCredentialsProviderChain>, char const*&, std::__cxx11::basic_string<char, std::char_traits<char>, Aws::Allocator<char> > const&, Aws::Client::AWSAuthV4Signer::PayloadSigningPolicy&, bool>(char const*, std::shared_ptr<Aws::Auth::DefaultAWSCredentialsProviderChain>&&, char const*&, std::__cxx11::basic_string<char, std::char_traits<char>, Aws::Allocator<char> > const&, Aws::Client::AWSAuthV4Signer::PayloadSigningPolicy&, bool&&) /usr/include/c++/9/ext/new_allocator.h:147 (libexternal_Saws_Slibaws.so+0x39a78b)
    #10 Aws::S3::S3Client::S3Client(Aws::Client::ClientConfiguration const&, Aws::Client::AWSAuthV4Signer::PayloadSigningPolicy, bool, Aws::S3::US_EAST_1_REGIONAL_ENDPOINT_OPTION) external/aws/aws-cpp-sdk-s3/source/S3Client.cpp:134 (libexternal_Saws_Slibaws.so+0x39a78b)
    #11 std::shared_ptr<Aws::S3::S3Client>::shared_ptr<Aws::Allocator<Aws::S3::S3Client>, Aws::Client::ClientConfiguration&, Aws::Client::AWSAuthV4Signer::PayloadSigningPolicy, bool const&>(std::_Sp_alloc_shared_tag<Aws::Allocator<Aws::S3::S3Client> >, Aws::Client::ClientConfiguration&, Aws::Client::AWSAuthV4Signer::PayloadSigningPolicy&&, bool const&) /usr/include/c++/9/ext/new_allocator.h:147 (libexternal_Scom_Ugithub_Utiledb_Utiledb_Slibtiledb.so+0x2879bd)
    #12 std::shared_ptr<Aws::S3::S3Client> std::allocate_shared<Aws::S3::S3Client, Aws::Allocator<Aws::S3::S3Client>, Aws::Client::ClientConfiguration&, Aws::Client::AWSAuthV4Signer::PayloadSigningPolicy, bool const&>(Aws::Allocator<Aws::S3::S3Client> const&, Aws::Client::ClientConfiguration&, Aws::Client::AWSAuthV4Signer::PayloadSigningPolicy&&, bool const&) /usr/include/c++/9/bits/shared_ptr.h:702 (libexternal_Scom_Ugithub_Utiledb_Utiledb_Slibtiledb.so+0x2879bd)
    #13 std::shared_ptr<Aws::S3::S3Client> Aws::MakeShared<Aws::S3::S3Client, Aws::Client::ClientConfiguration&, Aws::Client::AWSAuthV4Signer::PayloadSigningPolicy, bool const&>(char const*, Aws::Client::ClientConfiguration&, Aws::Client::AWSAuthV4Signer::PayloadSigningPolicy&&, bool const&) external/aws/aws-cpp-sdk-core/include/aws/core/utils/memory/stl/AWSAllocator.h:106 (libexternal_Scom_Ugithub_Utiledb_Utiledb_Slibtiledb.so+0x2879bd)
    #14 tiledb::sm::S3::init_client() const external/com_github_tiledb_tiledb/tiledb/sm/filesystem/s3.cc:1104 (libexternal_Scom_Ugithub_Utiledb_Utiledb_Slibtiledb.so+0x2879bd)
```

---

TYPE: BUG
DESC: Fix race within `S3::init_client`

Co-authored-by: Joe Maley <[email protected]>
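A minimal sketch of the serialization pattern the patch describes: one process-wide mutex held across client construction, so no two threads run the racy AWS SDK initialization concurrently. The `S3Client` type below is a stand-in; the real fix guards construction of `Aws::S3::S3Client` within `tiledb::sm::S3::init_client()`:

```
#include <memory>
#include <mutex>

// Stand-in for Aws::S3::S3Client, whose constructor touches shared SDK
// state (credential/config loaders), which is what TSAN flagged above.
struct S3Client {};

std::shared_ptr<S3Client> make_s3_client() {
  // Function-local static initialization is itself thread-safe (C++11),
  // and holding the lock serializes every construction in the process.
  static std::mutex client_ctor_mutex;
  std::lock_guard<std::mutex> lock(client_ctor_mutex);
  return std::make_shared<S3Client>();
}

int main() {
  auto client = make_s3_client();
  (void)client;
  return 0;
}
```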