Fix unsafe walking of DacEnumerableHash (#88063)
Under the following set of conditions, it is possible for a lookup in the `DacEnumerableHash` to fail to find already present entries. 

Given Thread A, which is growing the table, and Thread B, which is attempting to perform a lookup:

1. Thread B reads a `VolatileEntry*` from a bucket in the hashtable. Let this be EntryB. EntryB has a next pointer that points to EntryThatWeNeedToFind, but Thread B has not yet read that pointer value from EntryB.
2. Thread A reads the pointer to EntryB from the same bucket in the hashtable.
3. Thread A adds EntryB to the new hashtable.
4. Thread A sets the bucket in the old hashtable to point to EntryThatWeNeedToFind.
5. Thread A sets the next pointer in EntryB to point to NULL.
6. Thread B reads the next pointer, sees that it is NULL, and falls back to looking in the "new" hashtable for any possible updates.
7. Thread B looks in the appropriate bucket in the "new" hashtable. That bucket does not yet contain EntryThatWeNeedToFind, as it hasn't yet been moved from the old hashtable.
8. Thread B returns NULL __*INCORRECTLY*__, indicating that there are no entries with the matching hash value. (The sketch below illustrates where this walk goes wrong.)
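
To make the failure concrete, here is a heavily simplified, standalone sketch of the reader-side walk as it behaved *before* this change (plain C++; the `Entry` struct, `FindPreFix`, and the relaxed atomics are illustrative stand-ins, not the actual coreclr types). With NULL terminating every chain, the reader cannot tell "genuine end of my bucket" apart from "the entry I was holding was just migrated by GrowTable":

```cpp
#include <atomic>
#include <cstdint>

// Illustrative stand-in for the real VolatileEntry (names and types simplified).
struct Entry
{
    std::atomic<Entry*> m_pNextEntry;
    uint32_t            m_iHashValue;
};

// Pre-fix lookup, reduced to a single bucket of the old table plus the matching
// bucket of the new table. Both chains are NULL-terminated.
Entry* FindPreFix(std::atomic<Entry*>& oldBucket, std::atomic<Entry*>& newBucket, uint32_t iHash)
{
    // Step 1: read the head of the old bucket; this can be EntryB.
    for (Entry* pEntry = oldBucket.load(std::memory_order_relaxed);
         pEntry != nullptr;
         pEntry = pEntry->m_pNextEntry.load(std::memory_order_relaxed))
    {
        if (pEntry->m_iHashValue == iHash)
            return pEntry;
        // Steps 2-5 can occur here on the growing thread: EntryB is linked into the
        // new table and its next pointer is set to NULL before we read it below.
    }

    // Steps 6-7: the walk ended "normally" (next was NULL), so fall back to the new
    // bucket, where EntryThatWeNeedToFind may not have been linked yet.
    for (Entry* pEntry = newBucket.load(std::memory_order_relaxed);
         pEntry != nullptr;
         pEntry = pEntry->m_pNextEntry.load(std::memory_order_relaxed))
    {
        if (pEntry->m_iHashValue == iHash)
            return pEntry;
    }

    // Step 8: the entry exists but is reported as missing.
    return nullptr;
}
```

The step comments map onto the interleaving above; the versioned end sentinels introduced by this change exist precisely so the first loop can notice that it ended on the wrong chain and restart.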

This change adjusts how these linked lists work by giving each one a version number that can be checked when the walk of the linked list finishes. The fix still allows finding entries from an earlier set of buckets (those can appear in the new list temporarily and will not interfere with finding the correct result), but it actively detects when a walk of the old entries has strayed onto entries in the new list.
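
To make the versioning concrete, here is a minimal standalone sketch of the end-sentinel encoding used in `dacenumerablehash.h` below (plain C++ with `uintptr_t` standing in for `TADDR`; the helper names mirror the new functions in the diff, but this is an illustration under those simplifications, not the actual implementation). The low bit tags a value as a sentinel rather than an entry pointer, bits 1-5 hold the bucket-table generation ("age"), and the remaining bits hold the bucket index:

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

// Illustration only: uintptr_t stands in for TADDR.
using Sentinel = uintptr_t;

bool     IsEndSentinel(Sentinel v)                 { return (v & 1) != 0; }     // low bit tags a sentinel
Sentinel InitialEndSentinel()                      { return 1; }                 // age 0, sentinel bit set
Sentinel IncrementBaseEndSentinel(Sentinel prev)   { return prev + 2; }          // bump the 5-bit age (bits 1-5)
Sentinel ComputeEndSentinel(Sentinel base, uint32_t bucket) { return ((Sentinel)bucket << 6) | base; }
uint32_t BucketIndex(Sentinel s)                   { return (uint32_t)(s >> 6); }
uint32_t BucketsAge(Sentinel s)                    { return (uint32_t)((s & 0x3E) >> 1); }

// Ending a walk on a sentinel from an *older* table generation is harmless (an old
// bucket's chain can temporarily hang off a new bucket); ending on a newer or
// wrong-generation sentinel means entries may have been missed, so the caller retries.
bool AcceptableEndSentinel(Sentinel found, Sentinel expected)
{
    if (found == expected)
        return true;
    return BucketsAge(found) < BucketsAge(expected);
}

int main()
{
    Sentinel oldBase = InitialEndSentinel();              // table generation 0
    Sentinel newBase = IncrementBaseEndSentinel(oldBase); // table generation 1

    Sentinel expected     = ComputeEndSentinel(oldBase, /*bucket*/ 7);
    Sentinel fromNewTable = ComputeEndSentinel(newBase, /*bucket*/ 3);

    assert(AcceptableEndSentinel(expected, expected));       // exact match: walk is complete
    assert(!AcceptableEndSentinel(fromNewTable, expected));  // newer generation: rewalk
    assert(AcceptableEndSentinel(expected, fromNewTable));   // older generation: fine

    printf("bucket=%u age=%u\n", BucketIndex(fromNewTable), BucketsAge(fromNewTable));
    return 0;
}
```

A reader that finishes a walk on a sentinel from an older generation can trust what it saw, while a sentinel from a newer generation (or the wrong bucket of the current generation) means entries may have been unlinked mid-walk, so the lookup is repeated.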

In addition, there is a drive-by fix for an issue where, once there are more than ~14,000,000 entries in the table, a full new table is allocated each time the table is added to.

Fixes #85688
davidwrighton authored Jul 6, 2023
1 parent 681b798 commit 68b410f
Showing 2 changed files with 182 additions and 16 deletions.
109 changes: 105 additions & 4 deletions src/coreclr/vm/dacenumerablehash.h
@@ -106,6 +106,98 @@ class DacEnumerableHashTable
DacEnumerableHashValue m_iHashValue; // The hash value associated with the entry
};


// End sentinel logic
// End sentinels indicate the end of the linked list of VolatileEntrys of each bucket
// As the list can be mutated while readers are reading, we use a specific end sentinel
// per list to identify the end of the list instead of NULL. We take advantage of the
// concept that VolatileEntry values are aligned, and distinguish an end sentinel
// from a normal pointer by means of the low bit, and we use an incrementing value
// for the rest of the end sentinel so that we can detect when we finish walking
// a linked list if we found ourselves walking to the end of the expected list,
// to the end of a list on an older bucket (which is OK), or to the end of a list
// on a newer bucket. (which will require rewalking the entire list as we may have
// missed elements of the list that need to be found. On the newer buckets we may
// find an end sentinel for the new bucket list, or an end sentinel for a different
// bucket from the current buckets.)

static bool IsEndSentinel(PTR_VolatileEntry entry)
{
return IsEndSentinel(dac_cast<TADDR>(entry));
}

static bool IsEndSentinel(TADDR value)
{
return !!(value & 1);
}

static TADDR InitialEndSentinel()
{
return 1;
}

static TADDR IncrementBaseEndSentinel(TADDR previousSentinel)
{
auto result = previousSentinel + 2;
_ASSERTE(IsEndSentinel(result));
return result;
}

static TADDR ComputeEndSentinel(TADDR baseEndSentinel, DWORD bucketIndex)
{
return ((TADDR)bucketIndex << 6) | baseEndSentinel;
}

static DWORD BucketIndexFromEndSentinel(TADDR endSentinel)
{
_ASSERTE(IsEndSentinel(endSentinel));
return (DWORD)(endSentinel >> 6);
}

static DWORD BucketsAgeFromEndSentinel(TADDR endSentinel)
{
_ASSERTE(IsEndSentinel(endSentinel));
return ((DWORD)endSentinel & 0x3E) >> 1;
}

static DWORD MaxBucketCountRepresentableWithEndSentinel()
{
#ifdef TARGET_64BIT
return 0xFFFFFFFF;
#else
return 0x03FFFFFF; // Bucket age and the IsEndSentinel bit take up 6 bits
#endif
}

static DWORD MaxBucketAge()
{
return 0x3E >> 1;
}

static bool AcceptableEndSentinel(PTR_VolatileEntry entry, TADDR expectedEndSentinel)
{
_ASSERTE(expectedEndSentinel != NULL);
_ASSERTE(entry != NULL);

TADDR endSentinelEntry = dac_cast<TADDR>(entry);

// Exactly matching the end sentinel
if (endSentinelEntry == expectedEndSentinel)
return true;

// An end sentinel from an earlier BucketAge is also OK. This can happen when the bucket in the
// new set of buckets temporarily has the remnants of a list from the old buckets.
if (BucketsAgeFromEndSentinel(endSentinelEntry) < BucketsAgeFromEndSentinel(expectedEndSentinel))
return true;

// If we reach this point, we either have found an end sentinel from a higher age set of buckets
// OR we've found the end from the wrong list on the current bucket.
_ASSERTE((BucketsAgeFromEndSentinel(endSentinelEntry) > BucketsAgeFromEndSentinel(expectedEndSentinel)) ||
(BucketsAgeFromEndSentinel(endSentinelEntry) == BucketsAgeFromEndSentinel(expectedEndSentinel)));

return false;
}

protected:
// This opaque structure provides enumeration context when walking the set of entries which share a common
// hash code. Initialized by BaseFindFirstEntryByHash and read/updated by BaseFindNextEntryByHash.
@@ -116,6 +208,9 @@ class DacEnumerableHashTable
TADDR m_pEntry; // The entry the caller is currently looking at (or NULL to begin
// with). This is a VolatileEntry* and should always be a target address
// not a DAC PTR_.
TADDR m_expectedEndSentinel; // A marker indicating which bucket list is being walked
// The algorithm may walk off of one list to another when the list is
// being updated, and this allows graceful handling of that case.
DPTR(PTR_VolatileEntry) m_curBuckets; // The bucket table we are working with.
};

@@ -207,24 +302,30 @@ class DacEnumerableHashTable
{
SUPPORTS_DAC;

return m_pBuckets;
return VolatileLoadWithoutBarrier(&m_pBuckets);
}

// our bucket table uses two extra slots - slot [0] contains the length of the table,
// slot [1] will contain the next version of the table if it resizes
static const int SLOT_LENGTH = 0;
static const int SLOT_NEXT = 1;
// normal slots start at slot #2
static const int SKIP_SPECIAL_SLOTS = 2;
static const int SLOT_ENDSENTINEL = 2;
// normal slots start at slot #3
static const int SKIP_SPECIAL_SLOTS = 3;

static DWORD GetLength(DPTR(PTR_VolatileEntry) buckets)
{
return (DWORD)dac_cast<TADDR>(buckets[SLOT_LENGTH]);
}

static TADDR BaseEndSentinel(DPTR(PTR_VolatileEntry) buckets)
{
return dac_cast<TADDR>(buckets[SLOT_ENDSENTINEL]);
}

static DPTR(PTR_VolatileEntry) GetNext(DPTR(PTR_VolatileEntry) buckets)
{
return dac_cast<DPTR(PTR_VolatileEntry)>(buckets[SLOT_NEXT]);
return dac_cast<DPTR(PTR_VolatileEntry)>(VolatileLoadWithoutBarrier(&buckets[SLOT_NEXT]));
}

// Loader heap provided at construction time. May be NULL (in which case m_pModule must *not* be NULL).
89 changes: 77 additions & 12 deletions src/coreclr/vm/dacenumerablehash.inl
@@ -48,6 +48,17 @@ DacEnumerableHashTable<DAC_ENUM_HASH_ARGS>::DacEnumerableHashTable(Module *pModu
m_cEntries = 0;
PTR_VolatileEntry* pBuckets = (PTR_VolatileEntry*)(void*)GetHeap()->AllocMem(cbBuckets);
((size_t*)pBuckets)[SLOT_LENGTH] = cInitialBuckets;
((size_t*)pBuckets)[SLOT_ENDSENTINEL] = InitialEndSentinel();

TADDR endSentinel = BaseEndSentinel(pBuckets);

// All buckets are initially empty.
// Note: Memory allocated on loader heap is zero filled, and since we need end sentinels, fill them in now
for (DWORD i = 0; i < cInitialBuckets; i++)
{
DWORD dwCurBucket = i + SKIP_SPECIAL_SLOTS;
pBuckets[dwCurBucket] = dac_cast<PTR_VolatileEntry>(ComputeEndSentinel(endSentinel, dwCurBucket));
}

// publish after setting the length
VolatileStore(&m_pBuckets, pBuckets);
@@ -174,6 +185,18 @@ void DacEnumerableHashTable<DAC_ENUM_HASH_ARGS>::GrowTable()

// Make the new bucket table larger by the scale factor requested by the subclass (but also prime).
DWORD cNewBuckets = NextLargestPrime(cBuckets * SCALE_FACTOR);

// If NextLargestPrime is no longer incrementing,
// or the cBuckets can't be represented as a DWORD
// or the new bucket count can't be represented within an end sentinel,
// or the bucket age has already reached its maximum value,
// just don't expand the table
if ((cNewBuckets == cBuckets) ||
(cBuckets > UINT32_MAX - SKIP_SPECIAL_SLOTS) ||
(SKIP_SPECIAL_SLOTS + cBuckets > MaxBucketCountRepresentableWithEndSentinel()) ||
(BucketsAgeFromEndSentinel(BaseEndSentinel(curBuckets)) == MaxBucketAge()))
return;

// two extra slots - slot [0] contains the length of the table,
// slot [1] will contain the next version of the table if it resizes
S_SIZE_T cbNewBuckets = (S_SIZE_T(cNewBuckets) + S_SIZE_T(SKIP_SPECIAL_SLOTS)) * S_SIZE_T(sizeof(PTR_VolatileEntry));
@@ -184,13 +207,27 @@ void DacEnumerableHashTable<DAC_ENUM_HASH_ARGS>::GrowTable()

// element 0 stores the length of the table
((size_t*)pNewBuckets)[SLOT_LENGTH] = cNewBuckets;
((size_t*)pNewBuckets)[SLOT_ENDSENTINEL] = IncrementBaseEndSentinel(BaseEndSentinel(curBuckets));
// element 1 stores the next version of the table (after length is written)
// NOTE: DAC does not call add/grow, so this cast is ok.
VolatileStore(&((PTR_VolatileEntry**)curBuckets)[SLOT_NEXT], pNewBuckets);

TADDR newEndSentinel = BaseEndSentinel(pNewBuckets);

_ASSERTE(BaseEndSentinel(curBuckets) != BaseEndSentinel(pNewBuckets));

// It is acceptable to walk a chain and find a sentinel from an older bucket, but not vice versa
_ASSERTE(AcceptableEndSentinel(dac_cast<PTR_VolatileEntry>(BaseEndSentinel(curBuckets)), BaseEndSentinel(pNewBuckets)));
_ASSERTE(!AcceptableEndSentinel(dac_cast<PTR_VolatileEntry>(BaseEndSentinel(pNewBuckets)), BaseEndSentinel(curBuckets)));

// All buckets are initially empty.
// Note: Memory allocated on loader heap is zero filled
// memset(pNewBuckets, 0, cNewBuckets * sizeof(PTR_VolatileEntry));
// Note: Memory allocated on loader heap is zero filled, and since we need end sentinels, fill them in now
for (DWORD i = 0; i < cNewBuckets; i++)
{
DWORD dwCurBucket = i + SKIP_SPECIAL_SLOTS;
pNewBuckets[dwCurBucket] = dac_cast<PTR_VolatileEntry>(ComputeEndSentinel(newEndSentinel, dwCurBucket));
}

// NOTE: DAC does not call add/grow, so this cast is ok.
VolatileStore(&((PTR_VolatileEntry**)curBuckets)[SLOT_NEXT], pNewBuckets);

// Run through the old table and transfer all the entries. Be sure not to mess with the integrity of the
// old table while we are doing this, as there can be concurrent readers!
@@ -203,21 +240,21 @@ void DacEnumerableHashTable<DAC_ENUM_HASH_ARGS>::GrowTable()
DWORD dwCurBucket = i + SKIP_SPECIAL_SLOTS;
PTR_VolatileEntry pEntry = curBuckets[dwCurBucket];

while (pEntry != NULL)
while (!IsEndSentinel(pEntry))
{
DWORD dwNewBucket = (pEntry->m_iHashValue % cNewBuckets) + SKIP_SPECIAL_SLOTS;
PTR_VolatileEntry pNextEntry = pEntry->m_pNextEntry;

PTR_VolatileEntry pTail = pNewBuckets[dwNewBucket];

// make the pEntry reachable in the new bucket, together with all the chain (that is temporary and ok)
if (pTail == NULL)
if (IsEndSentinel(pTail))
{
pNewBuckets[dwNewBucket] = pEntry;
}
else
{
while (pTail->m_pNextEntry)
while (!IsEndSentinel(pTail->m_pNextEntry))
{
pTail = pTail->m_pNextEntry;
}
@@ -229,7 +266,10 @@ void DacEnumerableHashTable<DAC_ENUM_HASH_ARGS>::GrowTable()
VolatileStore(&curBuckets[dwCurBucket], pNextEntry);

// drop the rest of the bucket after old table starts referring to it
VolatileStore(&pEntry->m_pNextEntry, (PTR_VolatileEntry)NULL);
// NOTE: this can cause a race condition where a reader thread (which is unlocked relative to this logic)
// can be walking this list, and stop early, failing to walk the rest of the chain. We use an incrementing
// end sentinel to detect that case, and repeat the entire walk.
VolatileStore(&pEntry->m_pNextEntry, dac_cast<PTR_VolatileEntry>(ComputeEndSentinel(newEndSentinel, dwNewBucket)));

pEntry = pNextEntry;
}
@@ -311,9 +351,10 @@ DPTR(VALUE) DacEnumerableHashTable<DAC_ENUM_HASH_ARGS>::BaseFindFirstEntryByHash

// Point at the first entry in the bucket chain that stores entries with the given hash code.
PTR_VolatileEntry pEntry = VolatileLoadWithoutBarrier(&curBuckets[dwBucket]);
TADDR expectedEndSentinel = ComputeEndSentinel(BaseEndSentinel(curBuckets), dwBucket);

// Walk the bucket chain one entry at a time.
while (pEntry)
while (!IsEndSentinel(pEntry))
{
if (pEntry->m_iHashValue == iHash)
{
@@ -323,6 +364,7 @@ DPTR(VALUE) DacEnumerableHashTable<DAC_ENUM_HASH_ARGS>::BaseFindFirstEntryByHash
// BaseFindNextEntryByHash can pick up the search where it left off.
pContext->m_pEntry = dac_cast<TADDR>(pEntry);
pContext->m_curBuckets = curBuckets;
pContext->m_expectedEndSentinel = dac_cast<TADDR>(expectedEndSentinel);

// Return the address of the sub-classes' embedded entry structure.
return VALUE_FROM_VOLATILE_ENTRY(pEntry);
@@ -332,6 +374,17 @@ DPTR(VALUE) DacEnumerableHashTable<DAC_ENUM_HASH_ARGS>::BaseFindFirstEntryByHash
pEntry = VolatileLoadWithoutBarrier(&pEntry->m_pNextEntry);
}

if (!AcceptableEndSentinel(pEntry, expectedEndSentinel))
{
// If we hit this logic, we've managed to hit a case where the linked list was in the process of being
// moved to a new set of buckets while we were walking the list, and we walked part of the list of the
// bucket in the old hash table (which is fine), and part of the list in the new table, which may not
// be the correct bucket to walk. Most notably, the situation that can cause this will cause the list in
// the old bucket to be missing items. Restart the lookup, as the linked list is unlikely to still be under
// edit a second time.
continue;
}

// in a rare case if resize is in progress, look in the new table as well.
// if existing entry is not in the old table, it must be in the new
// since we unlink it from old only after linking into the new.
@@ -368,7 +421,7 @@ DPTR(VALUE) DacEnumerableHashTable<DAC_ENUM_HASH_ARGS>::BaseFindNextEntryByHash(
iHash = pVolatileEntry->m_iHashValue;

// Iterate over the rest of the bucket chain.
while ((pVolatileEntry = VolatileLoadWithoutBarrier(&pVolatileEntry->m_pNextEntry)) != nullptr)
while (!IsEndSentinel(pVolatileEntry = VolatileLoadWithoutBarrier(&pVolatileEntry->m_pNextEntry)))
{
if (pVolatileEntry->m_iHashValue == iHash)
{
@@ -379,6 +432,17 @@ DPTR(VALUE) DacEnumerableHashTable<DAC_ENUM_HASH_ARGS>::BaseFindNextEntryByHash(
}
}

if (!AcceptableEndSentinel(pVolatileEntry, pContext->m_expectedEndSentinel))
{
// If we hit this logic, we've managed to hit a case where the linked list was in the process of being
// moved to a new set of buckets while we were walking the list, and we walked part of the list of the
// bucket in the old hash table (which is fine), and part of the list in the new table, which may not
// be the correct bucket to walk. Most notably, the situation that can cause this will cause the list in
// the old bucket to be missing items. Restart the lookup, as the linked list is unlikely to still be under
// edit a second time.
return BaseFindFirstEntryByHashCore(pContext->m_curBuckets, iHash, pContext);
}

// check for next table must happen after we looked through the current.
VolatileLoadBarrier();

@@ -446,7 +510,7 @@ void DacEnumerableHashTable<DAC_ENUM_HASH_ARGS>::EnumMemoryRegions(CLRDataEnumMe
{
//+2 to skip "length" and "next" slots
PTR_VolatileEntry pEntry = curBuckets[i + SKIP_SPECIAL_SLOTS];
while (pEntry.IsValid())
while (!IsEndSentinel(pEntry) && pEntry.IsValid())
{
pEntry.EnumMem();

@@ -513,11 +577,12 @@ DPTR(VALUE) DacEnumerableHashTable<DAC_ENUM_HASH_ARGS>::BaseIterator::Next()
}

// If we found an entry in the last step return with it.
if (m_pEntry)
if (!IsEndSentinel(m_pEntry))
return VALUE_FROM_VOLATILE_ENTRY(dac_cast<PTR_VolatileEntry>(m_pEntry));

// Otherwise we found the end of a bucket chain. Increment the current bucket and, if there are
// buckets left to scan go back around again.
m_pEntry = NULL;
m_dwBucket++;
}

