Allow nvproxy to be enabled on platform/kvm #11436
Labels: type: enhancement (New feature or request)
Description
Per https://github.com/google/gvisor/blob/master/g3doc/proposals/nvidia_driver_proxy.md, nvproxy is currently incompatible with platform/kvm for two reasons:
1. platform/kvm configures all guest page table entries to enable write-back caching, which is not generally correct for device mappings. On Intel CPUs, KVM mostly mitigates this by mapping uncacheable and write-combining pages as UC in EPT[^1]; the hardware takes roughly the intersection of the EPT and guest page table memory type[^2]. On AMD CPUs, the hardware has the same capability[^3], but KVM does not configure it[^4], so guest page table entries must set the memory type correctly to obtain correct behavior. I don't think plumbing memory type through the sentry should be too difficult (see the sketch after this list), but matching the kernel driver's memory types will be tricky; we will probably have to approximate conservatively in some cases, which may degrade performance.
2. Mappings of any given file offset in `/dev/nvidia-uvm` must be at the matching virtual address. In KVM, providing mappings of `/dev/nvidia-uvm` to the guest requires mapping it into the KVM-using process's (the sentry's) address space and then forwarding it into the guest using a KVM memslot. Thus any UVM mapping at an offset that conflicts with an existing sentry mapping is unimplementable. AFAIK this only affects CUDA - Vulkan and NVENC/DEC do not use UVM - so this should no longer block use of nvproxy with platform/kvm in any case. But it is still a problem for CUDA.
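For the first reason, plumbing a memory type through the sentry ultimately bottoms out in the flag bits platform/kvm writes into guest page table entries. The sketch below is not gVisor code; `MemoryType` and `pteFlagsFor` are hypothetical names, and it assumes the architectural power-on default PAT (PA0=WB, PA1=WT, PA2=UC-, PA3=UC), under which write-combining cannot be expressed at all without also reprogramming the guest's IA32_PAT MSR.

```go
// Package memtype illustrates how a requested caching mode could be encoded
// into x86-64 page table entry bits. Hypothetical sketch, not gVisor code.
package memtype

// PTE flag bits that select a PAT entry for 4 KiB mappings. (For 2 MiB and
// 1 GiB mappings the PAT bit moves to bit 12.)
const (
	ptePWT uint64 = 1 << 3 // Page-level Write-Through
	ptePCD uint64 = 1 << 4 // Page-level Cache Disable
	ptePAT uint64 = 1 << 7 // PAT bit; selects PA4-PA7, unused here
)

// MemoryType is a requested caching mode for a device or memory mapping.
type MemoryType int

const (
	WriteBack MemoryType = iota
	WriteThrough
	UncacheableMinus // UC-: uncacheable unless MTRRs allow write-combining
	Uncacheable
)

// pteFlagsFor returns the PWT/PCD bits selecting a PAT entry for mt, assuming
// the architectural default PAT (PA0=WB, PA1=WT, PA2=UC-, PA3=UC). Note that
// write-combining is not representable without reprogramming IA32_PAT, which
// is one reason matching the NVIDIA driver's memory types exactly is hard.
func pteFlagsFor(mt MemoryType) uint64 {
	switch mt {
	case WriteBack:
		return 0 // PAT index 0: WB
	case WriteThrough:
		return ptePWT // PAT index 1: WT
	case UncacheableMinus:
		return ptePCD // PAT index 2: UC-
	default:
		return ptePCD | ptePWT // PAT index 3: UC
	}
}
```

On AMD, where KVM does not apply a memory type in the NPT, these guest PTE bits are the only lever the sentry has, which is why they need to be correct rather than uniformly write-back.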
Experimentally:

- `/dev/nvidia-uvm` is mapped at a fixed address (in `cuda_malloc_test` this happens to be 0x205000000, but I'm unsure if this is consistent between binaries). This happens to work (at least in my minimal testing) if `nvproxy.uvmFDMemmapFile.MapInternal()` is changed to attempt a mapping using `MAP_FIXED_NOREPLACE` (see the sketch after this list).
- `cudaMallocManaged()` reserves application address space using `mmap(MAP_PRIVATE|MAP_ANONYMOUS)` and then overwrites the reservation mapping with a `MAP_FIXED` mapping of `/dev/nvidia-uvm`, which sometimes collides with an existing sentry mapping depending on ASLR for both the sentry and application.
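The `MAP_FIXED_NOREPLACE` experiment in the first bullet can be sketched as follows. This is not the real `nvproxy.uvmFDMemmapFile.MapInternal()`; `mapUVMAt` is a hypothetical helper that only shows the relevant mmap flags: the mapping is requested at the virtual address equal to the file offset, and `MAP_FIXED_NOREPLACE` makes the kernel return `EEXIST` instead of silently replacing an existing sentry mapping the way `MAP_FIXED` would.

```go
// Package uvmmap sketches mapping /dev/nvidia-uvm at its required address.
// Hypothetical illustration, not the actual nvproxy implementation.
package uvmmap

import (
	"fmt"

	"golang.org/x/sys/unix"
)

// mapUVMAt maps length bytes of the nvidia-uvm FD at the virtual address
// equal to offset, failing rather than replacing any existing mapping.
func mapUVMAt(uvmFD int, offset, length uintptr) (uintptr, error) {
	// UVM requires a mapping of file offset X to live at virtual address X,
	// so the offset doubles as the requested address. MAP_FIXED_NOREPLACE
	// surfaces a collision with an existing sentry mapping as EEXIST instead
	// of clobbering it (which plain MAP_FIXED would do).
	addr, _, errno := unix.Syscall6(unix.SYS_MMAP,
		offset, // addr: must equal the file offset
		length,
		uintptr(unix.PROT_READ|unix.PROT_WRITE),
		uintptr(unix.MAP_SHARED|unix.MAP_FIXED_NOREPLACE),
		uintptr(uvmFD),
		offset, // file offset
	)
	if errno != 0 {
		// Typically EEXIST: the range collides with an existing sentry
		// mapping, so this UVM mapping is unimplementable under KVM.
		return 0, fmt.Errorf("mmap(%#x, %#x) of /dev/nvidia-uvm: %w", offset, length, errno)
	}
	return addr, nil
}
```

Per the second bullet, `cudaMallocManaged()` remains a problem regardless: its later `MAP_FIXED` remap of `/dev/nvidia-uvm` can still land on a sentry mapping, depending on ASLR.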
Options:

- Leave `uvmFDMemmapFile.MapInternal()` unchanged, which will unconditionally cause CUDA binaries to fail on platform/kvm.
- Use `MAP_FIXED_NOREPLACE` for the sentry mapping. This should cause CUDA binaries that don't use `cudaMallocManaged()` to succeed (good), but will cause CUDA binaries that do to flake (bad). This is probably unacceptable for use cases that run arbitrary GPU workloads, but might be fine for others; AFAIK use of `cudaMallocManaged()` is uncommon for performance reasons.
- Use `MAP_FIXED_NOREPLACE`, and use a custom ELF loader to load the sentry in an address range that is less likely to collide with application addresses. I'm not sure this would actually work since IIUC the interfering sentry mappings are from runtime mmaps.
- Use `MAP_FIXED_NOREPLACE`, and also try to avoid returning application mappings that would collide with existing sentry mappings. This is racy and leaks information about the sentry to applications, so it's probably undesirable.
- Disallow nvproxy + platform/kvm when `NVIDIA_DRIVER_CAPABILITIES` contains `compute` (a sketch of such a check follows this list). I mention this option mostly to rule it out; some containers in practice specify `NVIDIA_DRIVER_CAPABILITIES=all` even if only e.g. graphics support is required, and in fact NVIDIA's Vulkan support requires `libnvidia-gpucomp.so`, which `libnvidia-container` only mounts when `--compute` is specified[^5], so this is necessary!
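For that last option, the gate would presumably look something like the check below. This is a hypothetical sketch (not gVisor or libnvidia-container code); it mainly shows why such a gate would also trip for containers that set `NVIDIA_DRIVER_CAPABILITIES=all`, which is why the option is ruled out.

```go
// Package nvcaps sketches a check of NVIDIA_DRIVER_CAPABILITIES.
// Hypothetical illustration only.
package nvcaps

import (
	"os"
	"strings"
)

// requestsCompute reports whether the comma-separated capability list asks
// for the "compute" capability, either explicitly or via "all".
func requestsCompute(caps string) bool {
	for _, c := range strings.Split(caps, ",") {
		switch strings.TrimSpace(c) {
		case "compute", "all":
			return true
		}
	}
	return false
}

// A gate of the form "disallow nvproxy + platform/kvm when compute is
// requested" would therefore also reject NVIDIA_DRIVER_CAPABILITIES=all,
// which many graphics-only containers use in practice.
func computeRequestedFromEnv() bool {
	return requestsCompute(os.Getenv("NVIDIA_DRIVER_CAPABILITIES"))
}
```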
Is this feature related to a specific bug?

No response

Do you have a specific solution in mind?

No response
Footnotes

[^1]: Linux: `arch/x86/kvm/mmu/spte.c:make_spte()` => `kvm_is_mmio_pfn()`, `arch/x86/kvm/vmx/vmx.c:vmx_get_mt_mask()`
[^2]: Intel 64 and IA-32 Software Developer's Manual, Vol. 3, Sec. 30.3.7.2, "Memory Type Used for Translated Guest-Physical Addresses"
[^3]: AMD64 Architecture Programmer's Manual, Vol. 2, Sec. 15.25.8, "Combining Memory Types, MTRRs"
[^4]: SVM does not set `shadow_memtype_mask` or implement the `get_mt_mask` static call; see also fc07e76ac7ff ('Revert "KVM: SVM: use NPT page attributes"')
[^5]: https://github.com/NVIDIA/libnvidia-container/blob/95d3e86522976061e856724867ebcaf75c4e9b60/src/nvc_info.c#L85