
Fix replay staging buffer binding when the capture and replay devices are not the same. #1115

Closed

Conversation

jacobv-nvidia (Contributor)

While testing the replay of trimmed capture files on devices other than the original capture device, I encountered an issue in the rebind allocator where allocations would fail, causing the replay to abort.

The initial loading allocations would fail, and, when running the replay with the --validate option, I would receive messages such as the following:

VUID-vkMapMemory-memory-00682(ERROR / SPEC): msgNum: -330527817 - Validation Error: [ VUID-vkMapMemory-memory-00682 ] Object 0: handle = 0x4b07580000000ca2, type = VK_OBJECT_TYPE_DEVICE_MEMORY; | MessageID = 0xec4c8bb7 | Mapping Memory without VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT set: VkDeviceMemory 0x4b07580000000ca2[]. The Vulkan spec states: memory must have been created with a memory type that reports VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT (https://vulkan.lunarg.com/doc/view/1.3.239.0/windows/1.3-extensions/vkspec.html#VUID-vkMapMemory-memory-00682)
    Objects: 1
	[0] 0x4b07580000000ca2, type: 8, name: NULL

The issue seems to be in the rebind allocator's "Direct" binding functions, which are meant to bind memory without memory translation, e.g. for replay staging buffers. However, these functions were using the capture device's memory types instead of the replay device's.

This patch should fix the issue.
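To make the failure mode concrete, here is a minimal sketch of the selection logic involved. The identifiers capture_memory_properties_ and is_direct_allocation come from the code under discussion; replay_memory_properties and the standalone helper below are illustrative assumptions, not the actual patch:

```cpp
#include <vulkan/vulkan.h>

// Illustrative sketch only: for a "direct" bind (e.g. a replay staging
// buffer), the allocation lives on the replay device, so its property
// flags must be looked up in the replay device's memory properties rather
// than the capture device's.
VkMemoryPropertyFlags SelectBindingPropertyFlags(
    const VkPhysicalDeviceMemoryProperties& capture_memory_properties,
    const VkPhysicalDeviceMemoryProperties& replay_memory_properties,
    uint32_t                                memory_type_index,
    bool                                    is_direct_allocation)
{
    // Before the fix, the capture device's table was consulted in both
    // cases.  For a direct bind that can leave the staging allocation in a
    // memory type that is not host-visible on the replay device, which is
    // what later produces the VUID-vkMapMemory-memory-00682 error above.
    const VkPhysicalDeviceMemoryProperties& props =
        is_direct_allocation ? replay_memory_properties : capture_memory_properties;

    return props.memoryTypes[memory_type_index].propertyFlags;
}
```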

jacobv-nvidia and others added 2 commits April 23, 2023 14:46
This commit fixes an issue with erroneous memory type selection when
binding buffer memory without memory translation, e.g. in cases of
binding replay staging buffers. Currently, the code retrieves the
capture device's memory types, even though we want to bind the
memory directly from the replay device.

This bug led to various failures when mapping memory later in the
replay, and was particularly disruptive when attempting to replay a
trimmed capture on a machine with a GPU from a different vendor than
the capture system.
@ci-tester-lunarg

Author jacobv-nvidia not on autobuild list. Waiting for curator authorization before starting CI build.

@CLAassistant commented May 4, 2023

CLA assistant check
All committers have signed the CLA.

@ci-tester-lunarg

CI gfxreconstruct build queued with queue ID 10524.

@ci-tester-lunarg

CI gfxreconstruct build # 2769 running.

@andrew-lunarg (Contributor) left a comment

Looks like a very useful fix, so thanks for the PR!

One point: it might be cleaner to extend BindBufferMemory and BindImageMemory with the extra bool parameter and give it a default value of true. Then the two new functions with slightly vague names wouldn't be needed.
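A rough sketch of that suggestion, heavily abbreviated (the real methods take more parameters, and translate_memory_type is an illustrative name):

```cpp
#include <vulkan/vulkan.h>

// Sketch of the suggested signature change.  A defaulted trailing parameter
// keeps existing call sites unchanged, while the replay staging-buffer path
// can pass false to bind without memory-type translation.
class VulkanRebindAllocator
{
  public:
    VkResult BindBufferMemory(VkBuffer       buffer,
                              VkDeviceMemory memory,
                              VkDeviceSize   memory_offset,
                              bool           translate_memory_type = true);

    VkResult BindImageMemory(VkImage        image,
                             VkDeviceMemory memory,
                             VkDeviceSize   memory_offset,
                             bool           translate_memory_type = true);
};
```

As the reply below notes, if these methods override the more general VulkanResourceAllocator interface, the extra parameter would also have to surface on the parent declarations.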

@jacobv-nvidia (Contributor, Author) commented May 4, 2023

@andrew-lunarg Yes, I was on the fence about that. I was trying to avoid cluttering the parent interface of the more general class VulkanResourceAllocator. Do you think it's worth it?

@ci-tester-lunarg

CI gfxreconstruct build # 2769 passed.

@locke-lunarg self-requested a review May 26, 2023 19:16
@bradgrantham-lunarg (Contributor) left a comment

Thank you for this PR! I have proposed a small change to the calling sequence in a comment.

@@ -409,7 +410,9 @@ VkResult VulkanRebindAllocator::BindBufferMemory(VkBuffer buffer,
        create_info.flags = 0;
        create_info.usage = GetBufferMemoryUsage(
            resource_alloc_info->usage,
            capture_memory_properties_.memoryTypes[memory_alloc_info->original_index].propertyFlags,
            is_direct_allocation
@bradgrantham-lunarg (Contributor)

I apologize it's taken me so long to look at this. I think this function doesn't need to care whether the access is direct or not. I propose elevating the condition into the callers, changing the bool parameter to a const VkPhysicalDeviceMemoryProperties&, and then passing in the desired memory properties inside the direct and non-direct functions. Also, there's no need to name the sub-functions Helper; it's okay for them to be differentiated by parameters.

@jacobv-nvidia, may I make that change against your PR using GitHub's edit mechanism inside the PR viewer? An example is at https://github.com/bradgrantham-lunarg/gfxreconstruct/tree/brad-tweak-1115 that works against our local CI, but I'd like to run it on a known failure case if you have one.
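For clarity, a rough sketch of the calling sequence being proposed (the abbreviated signatures and the name BindBufferMemoryDirect are illustrative; the actual change is in the linked branch):

```cpp
#include <vulkan/vulkan.h>

// Sketch of the proposed shape: the shared overload no longer knows whether
// the access is direct; each caller passes the memory properties it wants
// consulted.
class VulkanRebindAllocator
{
  public:
    // Non-direct path: translate using the capture device's memory types.
    VkResult BindBufferMemory(VkBuffer buffer, VkDeviceMemory memory, VkDeviceSize offset)
    {
        return BindBufferMemory(buffer, memory, offset, capture_memory_properties_);
    }

    // Direct path (e.g. replay staging buffers): use the replay device's types.
    VkResult BindBufferMemoryDirect(VkBuffer buffer, VkDeviceMemory memory, VkDeviceSize offset)
    {
        return BindBufferMemory(buffer, memory, offset, replay_memory_properties_);
    }

  private:
    // Shared implementation, differentiated by its parameter list rather
    // than by a "Helper" suffix.
    VkResult BindBufferMemory(VkBuffer                                buffer,
                              VkDeviceMemory                          memory,
                              VkDeviceSize                            offset,
                              const VkPhysicalDeviceMemoryProperties& memory_properties)
    {
        // The real implementation would consult memory_properties when
        // computing the allocation usage, then perform the rebind.
        (void)buffer; (void)memory; (void)offset; (void)memory_properties;
        return VK_SUCCESS;
    }

    VkPhysicalDeviceMemoryProperties capture_memory_properties_{};
    VkPhysicalDeviceMemoryProperties replay_memory_properties_{};
};
```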

@jacobv-nvidia (Contributor, Author)

Sure, go ahead. I agree, your new change seems like a much less hacky implementation of the fix.

@bradgrantham-lunarg (Contributor)

I tried to take a trimmed capture of vkcube from llvmpipe and replay it on NVIDIA with --validate, but got no validation error for vkMapMemory from SDK 1.3.246's binaries. @jacobv-nvidia, what's a good test case for this?

@jacobv-nvidia (Contributor, Author)

> I tried to take a trimmed capture of vkcube from llvmpipe and replay it on NVIDIA with --validate, but got no validation error for vkMapMemory from SDK 1.3.246's binaries. @jacobv-nvidia, what's a good test case for this?

I have just successfully tested it with vkcube, so that should work as a good test case, but you do need to:

  1. use the vkcube option --use_staging
  2. avoid including the first frame as part of the trim.

For example: python gfxrecon-capture-vulkan.py -o cube.gfxr --capture-frames 2-16 vkcube --use_staging

I haven't really tested on llvmpipe, so can't speak too much on that part. The issue was discovered and tested using an NVIDIA trace on an AMD card, but I suppose it's possible that the relevant memory type indices just happen to match up between llvmpipe and NVIDIA.
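To reproduce the original failure, the replay is then run with validation enabled, for example: gfxrecon-replay --validate cube.gfxr (the exact trimmed capture file name may differ depending on how the capture script names frame-range captures).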

@bradgrantham-lunarg (Contributor)

Merged as #1173.

We encountered this issue this week in our internal CI, and this fix solved our problem, so I prioritized it.

Thank you for finding this, @jacobv-nvidia!

@jacobv-nvidia (Contributor, Author)

@bradgrantham-lunarg Apologies for the delay in responding; I thought I had replied to your earlier comment on your changes. Anyway, I appreciate the merge.

@jacobv-nvidia deleted the fix-staging-buffer-memory-types branch March 18, 2024 15:43