Added bzip2 compression #2

Merged 4 commits from add_bzip2 into dev on Apr 30, 2017
Conversation

TileDB-Inc (Collaborator)

No description provided.

@TileDB-Inc TileDB-Inc merged commit 83c53e0 into dev Apr 30, 2017
@jakebolewski jakebolewski deleted the add_bzip2 branch July 11, 2018 15:26
joe-maley pushed a commit that referenced this pull request Nov 26, 2019
This fixes two issues:

Issue #1:
"Cannot deserialize; kj::Exception: expected segment != nullptr &&
 segment->checkObject(segment->getStartPtr(), ONE * WORDS); Message
 did not contain a root pointer."

This is a one-line fix: we must reset the original buffer's offset
after writing to it.

Issue #2:
Before attempting to deserialize a query, we serialize the current
state as a backup. If the deserialization of the new query fails,
we deserialize the backup to ensure that the query object leaves
'query_deserialize()' in its original state. Deserializing
the backup fails on a 'memcpy' when we attempt to copy the
first incoming attribute buffer into the backup query. This fails
because we are only deserializing the query object and do not
provide any concatenated data buffers.

The easiest fix is to add a path that skips attribute buffer processing
entirely in the 'do_query_deserialize' path. This allows us to revert
the Query object to its original state, leaving the data buffers untouched.
This is OK because we also revert the 'copy_state', so any extra,
incomplete data at the end of the data buffers is harmless.
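
To make the recovery flow concrete, here is a minimal sketch of the backup-and-restore pattern described above; the names (`serialize_query`, `do_query_deserialize`, `skip_attribute_buffers`) are illustrative stand-ins, not the actual TileDB API:

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

using SerializedQuery = std::vector<uint8_t>;
struct Query { /* query state, copy_state, attribute buffers, ... */ };

// Illustrative stubs for the real serialization routines.
SerializedQuery serialize_query(const Query&) { return {}; }
void do_query_deserialize(Query*, const SerializedQuery&,
                          bool skip_attribute_buffers) { /* elided */ }

// Deserialize 'incoming' into 'q'; on failure, restore the original state
// from the backup. The restore path skips attribute buffer processing
// because no concatenated data buffers accompany the backup.
void query_deserialize(Query* q, const SerializedQuery& incoming) {
  const SerializedQuery backup = serialize_query(*q);
  try {
    do_query_deserialize(q, incoming, /*skip_attribute_buffers=*/false);
  } catch (const std::exception&) {
    do_query_deserialize(q, backup, /*skip_attribute_buffers=*/true);
    throw;  // surface the original deserialization error
  }
}
```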
joe-maley pushed a commit that referenced this pull request Dec 20, 2019
For early review: ChunkBuffers and associated unit tests.

This patch contains chunk_buffers.cc/h for use in selective
decompression. This also contains a passing unit-ChunkBuffers.cc test
and a modified unit-Tile.cc test that passes with the associated
changes to Tile that are NOT included in this review.

I have included a detailed explanation of the new class in its
header file (chunk_buffers.h), but for easy reference, I'll paste it here:

 * The ChunkBuffers class is used to represent a logically contiguous buffer
 * as a vector of individual buffers. These individual buffers are referred to
 * as "chunk buffers". Each chunk buffer may be allocated individually, which
 * will save memory in scenarios where the logically contiguous buffer is
 * sparsely allocated.
 *
 * After construction, the class must be initialized before performing IO. The
 * initialization determines the following independent usage paradigms:
 *
 * #1: Chunk Sizes: Fixed/Variable
 * The chunk sizes must be either fixed or variable. An instance with fixed
 * chunk sizes ensures that all chunk buffers are of equal size. The size of the
 * last chunk buffer may be equal to or less than the other chunk sizes.
 * Instances with fixed-size chunks have a smaller memory footprint and a
 * lower algorithmic complexity when performing IO. For variable-sized chunks,
 * each chunk size is independent of the others.
 *
 * #2: Chunk Buffer Addressing: Discrete/Contiguous
 * The addresses of the individual chunk buffers may or may not be virtually
 * contiguous. For example, the chunk addresses within a virtually contiguous
 * instance may be allocated at addresses 1024 and 1028, where the first chunk
 * is of size 4. Non-contiguous chunks (referred to as "discrete") may be
 * allocated at any address. The trade-off is that the memory of each discrete
 * chunk is managed individually, whereas contiguous chunk buffers can be
 * managed by the first chunk alone.
 *
 * #3: Memory Management: Internal/External
 * The chunk buffers may be allocated and freed internally or externally.
 * Internal memory management is exposed through the alloc_*() and free_*()
 * routines. External memory management is exposed through the set_*() routines.
 * Currently, this only supports external memory management for contiguously
 * addressed buffers and internal memory management for discretely addressed
 * buffers.
 *
 * Note that the ChunkBuffers class does NOT support any concept of ownership.
 * It is up to the caller to free the instance before destruction.
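
For intuition, a minimal sketch of the fixed-size addressing paradigm (#1) follows. This is a simplification under stated assumptions, not the real class; notably it owns its chunks via `std::vector`, whereas the actual ChunkBuffers does not own memory:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Fixed chunk sizes let a logical offset map to a
// (chunk index, offset-within-chunk) pair in O(1).
class FixedChunkBuffers {
 public:
  FixedChunkBuffers(size_t chunk_size, size_t nchunks)
      : chunk_size_(chunk_size), chunks_(nchunks) {}

  // Lazily allocate the chunk holding 'logical_offset' and write into it.
  // For brevity, assumes the write fits within a single chunk.
  void write(const void* data, size_t nbytes, size_t logical_offset) {
    const size_t idx = logical_offset / chunk_size_;
    const size_t off = logical_offset % chunk_size_;
    if (chunks_[idx].empty())
      chunks_[idx].resize(chunk_size_);  // sparse allocation on demand
    std::memcpy(chunks_[idx].data() + off, data, nbytes);
  }

  void read(void* dest, size_t nbytes, size_t logical_offset) const {
    const size_t idx = logical_offset / chunk_size_;
    const size_t off = logical_offset % chunk_size_;
    std::memcpy(dest, chunks_[idx].data() + off, nbytes);
  }

 private:
  size_t chunk_size_;
  std::vector<std::vector<uint8_t>> chunks_;  // discrete chunk buffers
};
```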
@joe-maley joe-maley mentioned this pull request Dec 20, 2019
ihnorton referenced this pull request in ihnorton/TileDB Mar 17, 2020
Print error message for is_object !success outcome
joe-maley pushed a commit that referenced this pull request Mar 17, 2020
Change #1: Fix non-multipart uploads

  This fixes an issue where we free the buffer before writing it in the
  non-multipart upload path, which can be used for quicker small writes.

  Adds an associated unit test.

Change #2: Missing error return path in blob flush

  In the blob flush path for multipart uploads, the last flush of the write
  cache may silently fail. This ensures the user receives a non-OK status
  from flush_blob() in this scenario.

Change #3: Multipart uploads fail when they have > 10 chunks

  In the multipart upload (aka blocklist upload), each chunk is uploaded
  as a block with a string-id unique to the overall object (aka blob). There is
  an undocumented requirement that all string-ids must be of equal length.

  Currently, the string-ids are generated from an incrementing integer. For
  example, the id of the first chunk is "0", the second chunk is "1", the
  tenth chunk is "9", and the eleventh chunk is "10". The eleventh chunk
  has a string-id size of 2 while all the others have a size of 1. The
  eleventh chunk (and all other future chunks) fail with the following
  error message:

    "The specified blob or block content is invalid"

  This patch pads the string-ids to ensure they are all 5 characters long.
  The maximum number of chunks is 50,000, so 5 characters are sufficient to
  represent all chunk ids.

  More info here:
  https://gauravmantri.com/2013/05/18/windows-azure-blob-storage-dealing-with-the-specified-blob-or-block-content-is-invalid-error/
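
A small sketch of the string-id fix; `block_id` is a hypothetical helper, and the real code presumably also base64-encodes the id before upload, which is omitted here:

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Zero-pad the incrementing chunk index to a fixed width of 5 characters
// so that all block ids for a blob share the same length.
std::string block_id(unsigned chunk_index) {
  std::ostringstream ss;
  ss << std::setw(5) << std::setfill('0') << chunk_index;
  return ss.str();  // "00000", "00001", ..., "49999"
}
```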
@joe-maley joe-maley mentioned this pull request Mar 17, 2020
joe-maley pushed a commit that referenced this pull request Jun 16, 2020
This fixes the following within StorageManager::load_array_metadata():
1. A direct memory leak of a `Buffer` object introduced with the chunked buffers
  patch.
2. A long-standing indirect memory leak of the metadata buffers.

For #1, an auxiliary `Buffer` object is created to store the contents of a
read from a chunked buffer. The `Buffer` object is never deleted.

For #2, the raw buffers inside the `Buffer` object are passed into a
`ConstBuffer`. However, the `ConstBuffer` does not take ownership of them or
free them. Because of leak #1, the buffers are never freed while the
`ConstBuffer` references them. For this patch, I've replaced the `ConstBuffer` objects with `Buffer`
objects so that the `OpenArray` instance can take ownership of the metadata
buffers and free them appropriately.

Note that before the chunked buffer patch, we had a similar pattern where the
`Tile` buffer was disowned, but never freed by the `ConstBuffer`. This may be
affecting older core builds.
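
A toy illustration of the ownership difference, using simplified stand-ins for the real `Buffer`/`ConstBuffer` classes: the non-owning view leaks unless something else frees the allocation, while the owning object releases it on destruction:

```cpp
#include <cstddef>
#include <cstdlib>

// Simplified stand-ins, not the real TileDB classes.
struct ConstBuffer {  // non-owning view: never frees 'data'
  const void* data;
  size_t size;
};

struct Buffer {  // owning: frees 'data' when destroyed
  void* data = nullptr;
  size_t size = 0;
  Buffer() = default;
  Buffer(const Buffer&) = delete;
  Buffer& operator=(const Buffer&) = delete;
  ~Buffer() { std::free(data); }
};

int main() {
  {
    ConstBuffer cb{std::malloc(64), 64};  // leak: nothing frees this
    (void)cb;
  }
  {
    Buffer b;
    b.data = std::malloc(64);  // freed by ~Buffer(), as in the patch
    b.size = 64;
  }
}
```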
joe-maley pushed a commit that referenced this pull request Aug 7, 2020
This introduces a global thread pool for use when TBB is disabled. The
performance has not been exhaustively benchmarked against TBB. However, I did
test this on two readily available scenarios that I had recently been
benchmarking for other reasons. One scenario makes heavy use of the
parallel_sort path while the other does not. Surprisingly, with this patch,
disabling TBB performs about 10% quicker.

// Scenario #1
TBB: 3.4s
TBB disabled, this patch: 3.0s
TBB disabled, on dev: 10.0s

// Scenario #2
TBB: 3.1s
TBB disabled, this patch: 2.7s
TBB disabled, on dev: 9.1s

For now, this patch uses the thread pool at the same scope as the TBB scheduler.
It is a global thread pool, shared within a single process, and conditionally
compiled. The concurrency level that the thread pool is configured with is
determined from the "sm.num_tbb_threads" config.

This patch does not disable TBB by default.
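
For reference, a minimal sketch of a process-wide pool like the one described; the real TileDB thread pool differs (e.g. in its config handling and task semantics), so treat this as an assumption-based outline:

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

class ThreadPool {
 public:
  explicit ThreadPool(size_t n) {
    for (size_t i = 0; i < n; ++i)
      workers_.emplace_back([this] { worker(); });
  }
  ~ThreadPool() {
    { std::lock_guard<std::mutex> lk(m_); done_ = true; }
    cv_.notify_all();
    for (auto& t : workers_) t.join();
  }
  void enqueue(std::function<void()> task) {
    { std::lock_guard<std::mutex> lk(m_); tasks_.push(std::move(task)); }
    cv_.notify_one();
  }

 private:
  void worker() {
    for (;;) {
      std::function<void()> task;
      {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this] { return done_ || !tasks_.empty(); });
        if (done_ && tasks_.empty()) return;  // drain, then exit
        task = std::move(tasks_.front());
        tasks_.pop();
      }
      task();
    }
  }
  std::mutex m_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> tasks_;
  std::vector<std::thread> workers_;
  bool done_ = false;
};
```

A single global instance, constructed once per process with the concurrency taken from the "sm.num_tbb_threads" config, would mirror the scope described above.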
joe-maley pushed a commit that referenced this pull request Nov 3, 2020
This fixes a bug in the following scenario:
1. Read an array with an older format version.
2. Write a new array, using the array schema from the previous step.
3. Read the new array.

Note that our serialization path does not check the version. We do not support
writing array schemas as if they were being written from an earlier version
of the core.

In step #2, we serialize the array schema as if it were the latest version but
set the "version" attribute to an older version.
In step #3, we read the "version" attribute and deserialize as that format. The
on-disk format is actually the newer format, so the deserialization fails.

---

There are a few options to fix the bug:
1. Always write the "version" as the latest version. This ensures future reads
can correctly deserialize the on-disk format.

2. Treat all deserialized objects (e.g. in-memory ArraySchema objects) as if
they were the latest version.

3. Modify the serialization path for backwards compatibility.

Solution 3 is a lot of work, and I'm not sure if we even need/want to support
that use-case. Solution 2 is my preferred fix, but there may be a future use-case
where we want to know the version of an ArraySchema after it has been deserialized.
This patch implements solution 1.
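
A sketch of solution 1 under assumed names; `kLatestFormatVersion` and `Builder` are illustrative, not the actual serialization code:

```cpp
#include <cstdint>

// Illustrative stand-ins, not the actual TileDB types or constants.
struct ArraySchema { uint32_t version; /* ... */ };
struct Builder { void setVersion(uint32_t) { /* write field */ } };

constexpr uint32_t kLatestFormatVersion = 8;  // hypothetical value

// Solution 1: the serialized bytes always use the latest layout, so the
// written "version" must always name the latest version.
void serialize_array_schema(const ArraySchema& schema, Builder* b) {
  // Bug: b->setVersion(schema.version) stamped an older version number
  // onto latest-layout bytes, so later reads picked the wrong path.
  (void)schema;
  b->setVersion(kLatestFormatVersion);
  // ... serialize the remaining fields in the latest layout ...
}
```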
joe-maley pushed a commit that referenced this pull request Nov 13, 2020
The offsets in each buffer correspond to the values in its
data buffer. To build a single contiguous buffer, we must
ensure the offset values continue in ascending order from the
previous buffer. For example, consider the following two buffers
storing int32 values.

Buffer #1:
  offsets: [0, 8, 16]
  values:  [1, 2, 3, 4, 5, 6, 7]

Buffer #2:
  offsets: [0, 12]
  values:  [100, 200, 300, 400, 500]

The final, contiguous buffer will be:
  offsets: [0, 8, 16, 28, 40]
  values:  [1, 2, 3, 4, 5, 6, 7, 100, 200, 300, 400, 500]

The last two offsets, `28, 40`, were calculated by adding `28`
to the offsets of Buffer #2: [0, 12]. The `28` is the byte size
of the seven int32 values in Buffer #1.
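
A minimal sketch of this rebasing; `VarBuffer` and `concatenate` are illustrative types, not the actual TileDB code:

```cpp
#include <cstdint>
#include <vector>

struct VarBuffer {
  std::vector<uint64_t> offsets;  // byte offsets into 'values'
  std::vector<int32_t> values;
};

// Append each buffer's offsets shifted by the running byte size of all
// previously concatenated value data.
VarBuffer concatenate(const std::vector<VarBuffer>& buffers) {
  VarBuffer out;
  uint64_t shift = 0;  // total value bytes seen so far
  for (const auto& b : buffers) {
    for (uint64_t off : b.offsets)
      out.offsets.push_back(off + shift);
    out.values.insert(out.values.end(), b.values.begin(), b.values.end());
    shift += b.values.size() * sizeof(int32_t);
  }
  return out;
}
// With the two buffers from the example above, 'out.offsets' is
// [0, 8, 16, 28, 40] and 'out.values' holds all twelve values.
```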
joe-maley pushed a commit that referenced this pull request Dec 17, 2020
This makes three optimizations to `ResultTile::compute_results_sparse<char>`:

1. When executing on a dim_idx > 0, values in `r_bitmap` may be zero. In this
scenario, we can skip checking the associated coordinate for an intersection.
This saves us one string comparison for each coordinate.

2. When executing on a dim_idx > 0, we may have many contiguous zeroes in
`r_bitmap`. In this scenario, we can do a `memcmp` in place of a loop that
checks each individual coordinate. The size of this `memcmp` is fixed at
256.

3. When executing on dim_idx == 0, we know that coordinates are sorted. For
any arbitrary subrange in the list of coordinate values, we know that all values
in the subrange are identical if the first and last values are equal. In this
patch, the list of coordinate values is partitioned into 6 ranges. If the
first and last values match in one of these ranges, we perform a single computation
for `r_bitmap` and apply it to all values in the range.

* This patch has two fixed values, 256 (see #2) and 6 (see #3). I intentionally
did not put these into our misc/constants.h because these are only applicable
in this path.

** For #3, I tried an approach where I start with a single range and split it
each time there is not a match. This could be done either recursively or
iteratively. I did not try doing it recursively because recursive routines
cannot be inlined and would be too expensive. I tried an iterative
implementation, and it was significantly slower than 6 fixed ranges because
of the state it had to track.
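
A sketch of optimization #3 under assumed names; the real code operates on raw coordinate storage rather than `std::string`s, so this is only an outline of the range-endpoint trick:

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Partition the sorted coordinate values into 6 fixed ranges. If a
// range's first and last values are equal, every value between them is
// equal too, so one comparison decides the whole range.
void compute_range_results(const std::vector<std::string>& coords,
                           const std::string& range_start,
                           const std::string& range_end,
                           std::vector<uint8_t>* r_bitmap) {
  const size_t n = coords.size();
  const size_t num_parts = 6;
  const size_t part = (n + num_parts - 1) / num_parts;
  for (size_t lo = 0; lo < n; lo += part) {
    const size_t hi = std::min(lo + part, n) - 1;
    if (coords[lo] == coords[hi]) {
      // Sorted input: identical endpoints imply an identical range.
      const uint8_t hit =
          coords[lo] >= range_start && coords[lo] <= range_end;
      std::memset(r_bitmap->data() + lo, hit, hi - lo + 1);
    } else {
      for (size_t i = lo; i <= hi; ++i)
        (*r_bitmap)[i] =
            coords[i] >= range_start && coords[i] <= range_end;
    }
  }
}
```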
joe-maley pushed a commit that referenced this pull request May 3, 2021
Fixes #2201

This patch introduces a global lock on the `S3Client` constructor to prevent
the following race reported by TSAN:
```
WARNING: ThreadSanitizer: data race (pid=12)
  Write of size 8 at 0x7fe93658a1b0 by thread T16 (mutexes: write M144250874682678808, write M150443169551677688):
    #0 cJSON_ParseWithOpts external/aws/aws-cpp-sdk-core/source/external/cjson/cJSON.cpp:1006 (libexternal_Saws_Slibaws.so+0x299bef)
    #1 Aws::Utils::Json::JsonValue::JsonValue(std::__cxx11::basic_string<char, std::char_traits<char>, Aws::Allocator<char> > const&) external/aws/aws-cpp-sdk-core/source/utils/json/JsonSerializer.cpp:39 (libexternal_Saws_Slibaws.so+0x32d09c)
    #2 Aws::Config::EC2InstanceProfileConfigLoader::LoadInternal() external/aws/aws-cpp-sdk-core/source/config/AWSProfileConfigLoader.cpp:341 (libexternal_Saws_Slibaws.so+0x286d88)
    #3 Aws::Config::AWSProfileConfigLoader::Load() external/aws/aws-cpp-sdk-core/source/config/AWSProfileConfigLoader.cpp:47 (libexternal_Saws_Slibaws.so+0x282b4f)
    #4 Aws::Auth::InstanceProfileCredentialsProvider::Reload() external/aws/aws-cpp-sdk-core/source/auth/AWSCredentialsProvider.cpp:265 (libexternal_Saws_Slibaws.so+0x235987)
    #5 Aws::Auth::InstanceProfileCredentialsProvider::RefreshIfExpired() external/aws/aws-cpp-sdk-core/source/auth/AWSCredentialsProvider.cpp:283 (libexternal_Saws_Slibaws.so+0x23613b)
    #6 Aws::Auth::InstanceProfileCredentialsProvider::GetAWSCredentials() external/aws/aws-cpp-sdk-core/source/auth/AWSCredentialsProvider.cpp:250 (libexternal_Saws_Slibaws.so+0x23be9a)
    #7 Aws::Auth::AWSCredentialsProviderChain::GetAWSCredentials() external/aws/aws-cpp-sdk-core/source/auth/AWSCredentialsProviderChain.cpp:35 (libexternal_Saws_Slibaws.so+0x244550)
    #8 Aws::Client::AWSAuthV4Signer::AWSAuthV4Signer(std::shared_ptr<Aws::Auth::AWSCredentialsProvider> const&, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, Aws::Allocator<char> > const&, Aws::Client::AWSAuthV4Signer::PayloadSigningPolicy, bool) external/aws/aws-cpp-sdk-core/source/auth/AWSAuthSigner.cpp:170 (libexternal_Saws_Slibaws.so+0x2262ed)
    #9 std::shared_ptr<Aws::Client::AWSAuthV4Signer> Aws::MakeShared<Aws::Client::AWSAuthV4Signer, std::shared_ptr<Aws::Auth::DefaultAWSCredentialsProviderChain>, char const*&, std::__cxx11::basic_string<char, std::char_traits<char>, Aws::Allocator<char> > const&, Aws::Client::AWSAuthV4Signer::PayloadSigningPolicy&, bool>(char const*, std::shared_ptr<Aws::Auth::DefaultAWSCredentialsProviderChain>&&, char const*&, std::__cxx11::basic_string<char, std::char_traits<char>, Aws::Allocator<char> > const&, Aws::Client::AWSAuthV4Signer::PayloadSigningPolicy&, bool&&) /usr/include/c++/9/ext/new_allocator.h:147 (libexternal_Saws_Slibaws.so+0x39a78b)
    #10 Aws::S3::S3Client::S3Client(Aws::Client::ClientConfiguration const&, Aws::Client::AWSAuthV4Signer::PayloadSigningPolicy, bool, Aws::S3::US_EAST_1_REGIONAL_ENDPOINT_OPTION) external/aws/aws-cpp-sdk-s3/source/S3Client.cpp:134 (libexternal_Saws_Slibaws.so+0x39a78b)
    #11 std::shared_ptr<Aws::S3::S3Client>::shared_ptr<Aws::Allocator<Aws::S3::S3Client>, Aws::Client::ClientConfiguration&, Aws::Client::AWSAuthV4Signer::PayloadSigningPolicy, bool const&>(std::_Sp_alloc_shared_tag<Aws::Allocator<Aws::S3::S3Client> >, Aws::Client::ClientConfiguration&, Aws::Client::AWSAuthV4Signer::PayloadSigningPolicy&&, bool const&) /usr/include/c++/9/ext/new_allocator.h:147 (libexternal_Scom_Ugithub_Utiledb_Utiledb_Slibtiledb.so+0x2879bd)
    #12 std::shared_ptr<Aws::S3::S3Client> std::allocate_shared<Aws::S3::S3Client, Aws::Allocator<Aws::S3::S3Client>, Aws::Client::ClientConfiguration&, Aws::Client::AWSAuthV4Signer::PayloadSigningPolicy, bool const&>(Aws::Allocator<Aws::S3::S3Client> const&, Aws::Client::ClientConfiguration&, Aws::Client::AWSAuthV4Signer::PayloadSigningPolicy&&, bool const&) /usr/include/c++/9/bits/shared_ptr.h:702 (libexternal_Scom_Ugithub_Utiledb_Utiledb_Slibtiledb.so+0x2879bd)
    #13 std::shared_ptr<Aws::S3::S3Client> Aws::MakeShared<Aws::S3::S3Client, Aws::Client::ClientConfiguration&, Aws::Client::AWSAuthV4Signer::PayloadSigningPolicy, bool const&>(char const*, Aws::Client::ClientConfiguration&, Aws::Client::AWSAuthV4Signer::PayloadSigningPolicy&&, bool const&) external/aws/aws-cpp-sdk-core/include/aws/core/utils/memory/stl/AWSAllocator.h:106 (libexternal_Scom_Ugithub_Utiledb_Utiledb_Slibtiledb.so+0x2879bd)
    #14 tiledb::sm::S3::init_client() const external/com_github_tiledb_tiledb/tiledb/sm/filesystem/s3.cc:1104 (libexternal_Scom_Ugithub_Utiledb_Utiledb_Slibtiledb.so+0x2879bd)
```

---

TYPE: BUG
DESC: Fix race within `S3::init_client`
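
The fix amounts to serializing client construction behind a process-wide mutex; a hedged sketch with illustrative names, not the actual `S3::init_client` code:

```cpp
#include <memory>
#include <mutex>

// Process-wide lock guarding construction of the SDK client, since the
// AWS credential-provider setup races when clients are built concurrently.
static std::mutex s3_client_init_mutex;

template <class Client, class Config>
std::shared_ptr<Client> make_client_locked(const Config& config) {
  std::lock_guard<std::mutex> lock(s3_client_init_mutex);
  return std::make_shared<Client>(config);  // constructor runs under the lock
}
```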
ihnorton pushed a commit that referenced this pull request Aug 19, 2022
Typo in cpp README for C++ Core Guidelines
davisp added a commit that referenced this pull request Mar 8, 2023