A discussion of memory handles
Current practice
In OpenSHMEM 1.6 and earlier, all remotely accessible memory is symmetric. This means that a remotely accessible object exists on all PEs. The OpenSHMEM API refers to a particular instance of an object by the combination of the PE number and the local address of the object.
Remotely accessible objects include global and static variables and objects allocated in the symmetric heap. In the case of globals, every PE has the same set of globals in the same order, because every PE is running an instance of the same program. In the case of objects in the symmetric heap, every PE has the same set of objects in the same order because allocation is a collective operation, executed in the same order by every PE.
Due to address space layout randomization (ASLR), the virtual address of the global segment may differ between PEs, but the OpenSHMEM runtime adjusts for this. Likewise, the virtual address of the symmetric heap may differ between PEs, and the runtime adjusts for this as well.
Implementation by shared memory
As an example, some set of PEs may be on the same node, or otherwise accessible by loads and stores. A PE might compute the virtual address of an object on such a PE by subtracting the base of the local symmetric heap from the local object address to obtain an offset within the symmetric heap, and then adding the base of its local mapping of the remote symmetric heap to obtain a local virtual address for the remote object. This works because the objects in the symmetric heap are present in the same order, at the same offsets, on all PEs.
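The address arithmetic described above can be sketched in a few lines of C. This is only an illustration, not how any particular OpenSHMEM implementation does it: the `peer_map_t` type, its fields, and `translate_to_peer` are hypothetical names, and the sketch assumes the peer's symmetric heap has already been mapped into the local address space.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record describing one peer reachable by loads and stores. */
typedef struct {
    char  *local_heap_base;   /* base of this PE's own symmetric heap */
    char  *peer_heap_mapping; /* base of the local mapping of the peer's heap */
    size_t heap_size;         /* size of the symmetric heap in bytes */
} peer_map_t;

/* Translate the address of a local symmetric object into a locally usable
 * address for the peer's instance of the same object. */
static void *translate_to_peer(const peer_map_t *m, const void *local_addr)
{
    const char *p = (const char *)local_addr;

    /* The address must lie inside the local symmetric heap. */
    assert(p >= m->local_heap_base && p < m->local_heap_base + m->heap_size);

    /* Same offset on every PE, so only the base changes. */
    size_t offset = (size_t)(p - m->local_heap_base);
    return m->peer_heap_mapping + offset;
}
```

OpenSHMEM exposes this kind of translated pointer to applications through `shmem_ptr`, which returns a null pointer when the target PE is not accessible by loads and stores.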
Implementation by memory registration keys
Each PE may register its local symmetric heap with the local RDMA-capable NIC. The resulting memory registration key can be shared with all other PEs. Then, to access a remote object, the local PE obtains the object offset as above and makes a request to the remote NIC with the proper memory registration key and the object offset.
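As a rough sketch of that flow, the code below builds the description of a one-sided request from a local symmetric address. Everything here is hypothetical: `rdma_state_t`, `rdma_request_t`, and `make_request` are illustrative names, and real NIC interfaces carry more information than a key, an offset, and a length. The sketch assumes each PE has already published its registration key into the `rkeys` table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a one-sided request to a remote NIC. */
typedef struct {
    int      target_pe;
    uint64_t rkey;    /* registration key published by the target PE */
    uint64_t offset;  /* offset of the object in the symmetric heap */
    size_t   length;  /* number of bytes to transfer */
} rdma_request_t;

/* Hypothetical per-PE runtime state. */
typedef struct {
    char           *local_heap_base;
    size_t          heap_size;
    const uint64_t *rkeys; /* rkeys[pe] is the key shared by that PE */
    int             npes;
} rdma_state_t;

/* Turn (local address, target PE) into (rkey, offset) for the remote NIC. */
static rdma_request_t make_request(const rdma_state_t *s, const void *local_addr,
                                   size_t length, int target_pe)
{
    const char *p = (const char *)local_addr;

    assert(target_pe >= 0 && target_pe < s->npes);
    assert(p >= s->local_heap_base && length <= s->heap_size);

    size_t offset = (size_t)(p - s->local_heap_base);
    assert(offset <= s->heap_size - length); /* whole range is in the heap */

    rdma_request_t r = { target_pe, s->rkeys[target_pe], offset, length };
    return r;
}
```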
Non-uniform size objects and single-PE objects
Intel made a proposal in 2024 (link?) to permit different PEs to collectively allocate an object in the symmetric heap which might have a different size on different PEs. The idea was that every PE would reserve virtual address space for the maximum size across all PEs, but back only its local size with physical memory. The symmetric heap might have holes on PEs with small instances of an object, but the virtual offsets of all objects would still be the same on all PEs. The objections to this are mostly esthetic, but J. Dinan pointed out that it might consume NIC resources to actually support holes like this.
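To make the "same offsets, different sizes" idea concrete, here is a minimal sketch of a bump allocator in which every collective allocation advances by the maximum requested size. It is an illustration under stated assumptions, not the proposal's design: `sym_bump_t`, `nonuniform_alloc_t`, and `alloc_nonuniform` are hypothetical names, alignment is ignored, and the per-PE sizes are passed in as an array where a real runtime would presumably obtain the maximum through a reduction.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical allocator state; it evolves identically on every PE. */
typedef struct {
    size_t next_offset;
} sym_bump_t;

typedef struct {
    size_t offset; /* offset of the object, identical on all PEs */
    size_t hole;   /* bytes reserved but not backed on this PE */
} nonuniform_alloc_t;

/* Reserve max(sizes) bytes of offset space; only sizes[my_pe] is used here. */
static nonuniform_alloc_t alloc_nonuniform(sym_bump_t *heap, const size_t sizes[],
                                           int npes, int my_pe)
{
    assert(npes > 0 && my_pe >= 0 && my_pe < npes);

    size_t max_size = 0;
    for (int pe = 0; pe < npes; pe++) {
        if (sizes[pe] > max_size)
            max_size = sizes[pe];
    }

    nonuniform_alloc_t a = { heap->next_offset, max_size - sizes[my_pe] };
    heap->next_offset += max_size; /* every PE advances by the same amount */
    return a;
}
```

Because each PE advances `next_offset` by the same amount, a later allocation lands at the same offset everywhere, regardless of how large the local hole is.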
Intel made a proposal in 2024 (link?) to permit objects in the symmetric heap to be allocated on a single PE rather than collectively. One way this might work is to allocate these "single" objects from the top down in the local symmetric heap while allocating collective objects from the bottom up on all PEs. The size of the symmetric heap might differ between PEs, giving some PEs enough space to allocate single-PE objects, while collective allocations would work up to the minimum symmetric heap size across all PEs. This introduces objects that other PEs do not have, so those PEs cannot make remote accesses using a local address and a remote PE number. The proposal provides pointer-to-offset and offset-to-pointer functions. The idea is that the PE that owns the object calls pointer-to-offset, which subtracts the base of the local symmetric heap. The offset can then be shared in a variable of type off_t (or maybe uintptr_t). A PE wishing to make a remote reference to a single-PE object calls offset-to-pointer on the offset and obtains a pointer value which might be past the end of its local symmetric heap, but which can be used in an OpenSHMEM call provided that the PE given is the owner of the object.
This proposal uses offsets in the single symmetric heap, but extends the idea of the heap to have holes or just different sizes on different PEs.
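The two conversion functions might look roughly like the sketch below. The names `ptr_to_offset` and `offset_to_ptr`, the `sym_heap_t` type, and the choice of `uintptr_t` for the offset are assumptions made for illustration; the proposal's actual names and types are not given here. The pointer returned on a non-owning PE may lie outside the local heap, so it is computed with integer arithmetic and must never be dereferenced locally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of this PE's symmetric heap. */
typedef struct {
    char  *base;
    size_t size; /* may differ from PE to PE under the proposal */
} sym_heap_t;

/* Called on the owning PE: subtract the base of the local symmetric heap. */
static uintptr_t ptr_to_offset(const sym_heap_t *h, const void *p)
{
    const char *c = (const char *)p;
    assert(c >= h->base && c < h->base + h->size);
    return (uintptr_t)(c - h->base);
}

/* Called on any PE: add the offset to the local base. On a non-owning PE
 * the result may be past the end of the local heap; it is only a token to
 * pass to a remote operation that targets the owning PE. */
static void *offset_to_ptr(const sym_heap_t *h, uintptr_t off)
{
    return (void *)((uintptr_t)h->base + off);
}
```

In use, the owner would publish the offset (for example by writing it into an ordinary symmetric variable), and a reader would convert it back with `offset_to_ptr` before naming the owner as the target PE in a put or get.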
Memory Handles
The idea of a memory handle is that it encodes enough information about an object that a PE can use it to make remote references.
A handle might contain:
An offset of an object within a region
A region identifier (if we have multiple regions like multiple symmetric heaps)
A PE or set of PEs (team? context?) which hosts the object
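One possible shape for such a handle, purely as a sketch: the `mem_handle_t` type, its field names and widths, and both helper functions are hypothetical, and a single owning PE stands in for the more general "set of PEs" case, which would need some team-like identifier instead.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical memory handle carrying the three items listed above. */
typedef struct {
    uint64_t offset;    /* offset of the object within its region */
    uint32_t region_id; /* which region, e.g. which symmetric heap */
    int32_t  owner_pe;  /* PE that hosts the object */
} mem_handle_t;

/* Build a handle from its parts. */
static mem_handle_t mem_handle_make(uint32_t region_id, int32_t owner_pe,
                                    uint64_t offset)
{
    mem_handle_t h = { offset, region_id, owner_pe };
    return h;
}

/* Derive a handle to a location delta bytes into the same object. */
static mem_handle_t mem_handle_advance(mem_handle_t h, uint64_t delta)
{
    assert(h.offset <= UINT64_MAX - delta); /* no wraparound */
    h.offset += delta;
    return h;
}
```

A plain value type like this can be copied between PEs as ordinary data, which is what lets a PE that never allocated the object still name it in a remote reference.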