Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial groundwork for a low level api #1125

Merged
merged 11 commits into from
Jan 26, 2017
Merged

Initial groundwork for a low level api #1125

merged 11 commits into from
Jan 26, 2017

Conversation

msiglreith
Copy link
Contributor

@msiglreith msiglreith commented Jan 6, 2017

Motivation

The next gen APIs (Vulkan and D3D12) offer an low level interface of the GPU in comparison to older APIs like OpenGL or D3D11. To avoid overhead due to high level abstraction we want to investigate the implementation of a thin wrapper above the mentioned low level APIs to address #1102.

Goals

This PR creates a new low level core API corell and two backend implementations for the low level core. corell is heavily based on the vulkan API and shares a lot of concepts like Instance, Surface, etc., but tries to also take d3d12 into account. In future steps further abstractions could be considered to reduce the amount of setup code.

Planned coverage for this PR:

  • Instance/Context
  • Surface
  • Swapchain
  • PhysicalDevice

Note: Also includes some dummy types for other objects.
format and memory are mainly copied from core.

Hoping for early feedback if this is the intended way to go with regards to #1102.

cc @MaikKlein for quick checking ash based vulkan implementation if you have time 😄

@kvark kvark self-requested a review January 6, 2017 19:41
}
}

const SURFACE_EXTENSIONS: &'static [&'static str] = &[
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can also use Ash's extensions names but it is not necessary. Surface::name().as_ptr() would give you a const* i8.

// TODO: add an ash function for this
let properties = unsafe {
let mut out = mem::zeroed();
self.inner.0.fp_v1_0().get_physical_device_properties(device, &mut out);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems that I didn't wrap that function yet, I'll do it tomorrow. If you encounter that again tell me and I'll add them for you.

@MaikKlein
Copy link
Contributor

The ash usage looks good.

Copy link
Member

@kvark kvark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change looks incredibly bold, fresh, and promising! Looking forward to see more, and let me know if you face any trouble and/or need assistance.

}

pub trait CommandQueue {
type B: Backend;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the Backend points to all the things, like CommandQueue, Device, or even Resources. This is great, a real step forward from the Zoo of traits we got. But why do these traits have to point back at Backend? It seems like it would be more convenient to have type CommandBuffer: CommandBuffer here instead. Would there be any issues?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wouldn't cause any issues.
I noticed only that it might only become a bit more verbose (e.g https://github.com/msiglreith/gfx/blob/gfx_next/src/core_next/src/lib.rs#L101)

fn submit(&mut self, cmd_buffer: &<<Self as CommandQueue>::B as Backend>::CommandBuffer);
}

pub trait SwapChain {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For GL, it looks like this would require glutin to be a part of the GL backend.
I would not mind, if not the fact that there is a few people who uses SDL/GLFW with gfx-rs at the moment.


/// Different types of a specific API.
pub trait Backend {
type CommandBuffer;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if there is no CommandBuffer trait, perhaps we need some RenderEncoder, TransferEncoder and ComputeEncoder ones?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'm currently not sure atm how to split up the different command buffer types, also how to properly support secondary command buffers and bundles. But this will be part of future work!

@homu
Copy link
Contributor

homu commented Jan 17, 2017

☔ The latest upstream changes (presumably #1140) made this pull request unmergeable. Please resolve the merge conflicts.

@msiglreith
Copy link
Contributor Author

Current state

New way to setup the backend:

fn main() {
    let window = winit::WindowBuilder::new()
        .with_dimensions(1024, 768)
        .build()
        .unwrap();

    // instantiate backend
    let instance = back::Instance::create();
    let physical_devices = instance.enumerate_physical_devices();
    let queue_descs = physical_devices[0].get_queue_families().iter().map(|family| { (family, family.num_queues()) }).collect();

    // build a new device and associated command queues
    let (device, queues) = physical_devices[0].open(queue_descs);

    let surface = back::Surface::from_window(&window, &instance);
    let mut swap_chain = surface.build_swapchain::<ColorFormat>(1024, 768, &queues[0]);

    'main: loop {
        for event in window.poll_events() {

        }

        let frame = swap_chain.acquire_frame();

        // rendering

        // present frame
        swap_chain.present();
    }
}

Instance is the main entry point for an application. It's similar to an OpenGL context to some extent. We'll mainly use it for enumerating the available GPU devices on the system.

PhyiscalDevice and Surface are host abstraction of the corresponding gpu device and window. They are needed to build Device and Swapchain (driver/application-level counterparts).

Every GPU and virtual adapter (e.g d3d12 WARP) will be represented as PhysicalDevice and the user can query information about these devices. Probably most interesting properties are the supported command queues, which will be created together with the device.

Design decisions

Thread behavior (Vulkan)

Some objects need to be externally synchronized (e.g vkDevice on destruction or vkQueue on submit).

I tried to avoid locking for these objects to minimize the overhead and enforced the synchronization using Arc/Rc. Reference counting also has some additional overhead but it should only used for 'large' objects like device and command queue, which internals won't be cloned very often (on startup usually).

An alternative approach would be trying to remove all kind of reference counting and entirely relay on references &/& mut, but this might become very tricky to ensure the lifetimes and handle the trait interfaces.

Command queues

Only partially linked to this PR but I think worth to discuss early on.
Vulkan and D3D12 have different kind of command queues (also Metal's encoders, but I'm not familiar with the API):

  • D3D12 allows 3 different queue types/families (http://images.slideplayer.com/11/3326273/slides/slide_20.jpg) with an hierarchical: Graphics queues support also compute and transfer/copy operations.
  • Vulkan slightly differs: Graphics and Compute queues also support transfer/copy operations. If a physical device exposes a graphics queue, it also has to support a graphics+compute queue.

Possible queue combinations: Graphics/Copy, Graphics/Compute/Copy, Compute/Copy, Copy
Command buffers of the different types can then be submitted to the queues.
Difficulty here is the representation of the command buffers and command queues. We can either try to separate the different types to ensure at compile time that e.g a compute command buffer doesn't call draw or detect it at runtime.

Minor issues

If I forgot to mention something I'll add it later!
Any feedback appreciated!

@Bastacyclop
Copy link
Contributor

Bastacyclop commented Jan 17, 2017

Often returning references to Vec<_>

Why not return iterators like QueueFamiliyIter or QueueFamilies (can be type aliases of Vec iterator) ?

@kvark
Copy link
Member

kvark commented Jan 17, 2017

@msiglreith thanks for the awesome writup!

let queue_descs = physical_devices[0].get_queue_families().iter().map(|family| { (family, family.num_queues()) }).collect();

Why do we need to process this? Could physical_devices[0].open just accept queue families directly?

let surface = back::Surface::from_window(&window, &instance);

This might look better as let surface = instance.create_surface(&window);

let mut swap_chain = surface.build_swapchain::<ColorFormat>(1024, 768, &queues[0]);

Doesn't surface already know about the size (or has the size)?

Every GPU and virtual adapter (e.g d3d12 WARP) will be represented as PhysicalDevice and the user can query information about these devices

Nice to see d3d concepts fitting into the queue families!

Reference counting also has some additional overhead but it should only used for 'large' objects like device and command queue, which internals won't be cloned very often (on startup usually).

It sounds reasonable, yes.

Vulkan and D3D12 have different kind of command queues (also Metal's encoders, but I'm not familiar with the API):

Metal's encoders are close to our Encoder, it's not related to the command queue.

Difficulty here is the representation of the command buffers and command queues.

I have some ideas, will express them in the follow-ups.

I'm not totally happy with the name PhysicalDevice as it can also includes virtual devices like the mentioned WARP implementation, but couldn't find a good name (maybe adapter?)

Yes, Adapter sounds good!

Often returning references to Vec<_>

I'm not aware of any valid use case to return &Vec<>. If you have these queue families locally, you can return slices &[].

@msiglreith
Copy link
Contributor Author

@Bastacyclop @kvark First of all thank you for the feedback!

Why not return iterators like QueueFamiliyIter or QueueFamilies (can be type aliases of Vec iterator) ?

I'm not aware of any valid use case to return &Vec<>. If you have these queue families locally, you can return slices &[].

Probably will try to use iterators and slices as fallback depending on the complexity and usability.

Why do we need to process this? Could physical_devices[0].open just accept queue families directly?

In order to specify how many queues of a certain family we want to create. D3D12 theoretically supports infinite software queues and doesn't allows us to query how many hardware queues there are iirc.

This might look better as let surface = instance.create_surface(&window);

Yes, Adapter sounds good!

Agree! Going to change it.

Doesn't surface already know about the size (or has the size)?

Actually yes. Vulkan allows to create swapchains images with different dimensions from the surface (vk specs extensions, 29.5 Surface Queries), but I can't imagine a use-case at the moment. Also switching this to the default windows size.

@kvark
Copy link
Member

kvark commented Jan 19, 2017

@msiglreith the latest changes look good 👍

First api design steps for the low level core
Setup backend projects for dx12 and vulkan backend using the new ll api
* vulkan: Dummy implementations for basic structs
* d3d12: Instance and physical device enumeration
…3d12

* `Surface` abstracts the native window surface (highly coupled with swapchains)
* Document some core structs and add TODOs for future work
* Implement basic device and command queue creation for d3d12 backend
* Rename PhysicalDevice to Adapter
* Switch to Iterator based approach instead of returning &Vec
* Improve surface and swapchain creation
* Adapt to latest ash
* Try to implement xlib surface creation
* Fix rebasing errors
* Improve documentation for corell
* Mark more TODOs for future work
* Vulkan: Retrieve adapter name
* Potential fix for unix build
@msiglreith msiglreith changed the title [WIP] Initial groundwork for a low level api Initial groundwork for a low level api Jan 21, 2017
@msiglreith
Copy link
Contributor Author

I'm quite fine with the current design iteration.
Both backends should work fine (without crashing), Vulkan validation layers throw some errors due to lack of synchronization primitives and need to create the swapchain images.

All points mentioned in the feedback has been addressed iirc. Added TODOs throughout the source code for future work.

Upcoming topics to work on:

  • Shaders, RenderPasses and Pipelines
  • Resources and Memory
  • Descriptor handling
  • Synchronization
  • Command Buffers

@kvark
Copy link
Member

kvark commented Jan 26, 2017

My biggest concern here is the code duplication between our current gfx_core and the low-level counter-part. I'd prefer a much gradual transition, but it looks like we are too invested already, so merging this.

Note to all developers - any changes would have to go into both the current and new cores. Hence, we should make the swap and deprecate the old one as soon as possible!

@homu r+

@homu
Copy link
Contributor

homu commented Jan 26, 2017

📌 Commit 76d65f0 has been approved by kvark

homu added a commit that referenced this pull request Jan 26, 2017
 Initial groundwork for a low level api

## Motivation
The next gen APIs (Vulkan and D3D12) offer an low level interface of the GPU in comparison to older APIs like OpenGL or D3D11. To avoid overhead due to high level abstraction we want to investigate the implementation of a thin wrapper above the mentioned low level APIs to address #1102.

## Goals
This PR creates a new low level core API `corell` and two backend implementations for the low level core. corell is heavily based on the vulkan API and shares a lot of concepts like `Instance`, `Surface`, etc., but tries to also take d3d12 into account. In future steps further abstractions could be considered to reduce the amount of setup code.

Planned coverage for this PR:
- [x] Instance/Context
- [x] Surface
- [x] Swapchain
- [x] PhysicalDevice

Note: Also includes some dummy types for other objects.
`format` and `memory` are mainly copied from `core`.

Hoping for early feedback if this is the intended way to go with regards to #1102.

cc @MaikKlein for quick checking ash based vulkan implementation if you have time 😄
@homu
Copy link
Contributor

homu commented Jan 26, 2017

⌛ Testing commit 76d65f0 with merge eb46bdb...

@homu
Copy link
Contributor

homu commented Jan 26, 2017

☀️ Test successful - status

@homu homu merged commit 76d65f0 into gfx-rs:master Jan 26, 2017
JohnColanduoni added a commit to JohnColanduoni/gfx that referenced this pull request Jun 16, 2017
a404273 Auto merge of gfx-rs#1163 - Nitori-:short-attribs, r=kvark
d1b4ccf Fix vertex attrib size for 16-bit attributes with more than one component.
69cffe3 Auto merge of gfx-rs#1160 - plyhun:master, r=kvark
c6c36ae DX11 resize fix
eb46bdb Auto merge of gfx-rs#1125 - msiglreith:corell, r=kvark
827deea Auto merge of gfx-rs#1154 - tocubed:adjacency-info, r=kvark
634155e Auto merge of gfx-rs#1153 - Androcas:doc, r=kvark
dfe6091 Auto merge of gfx-rs#1158 - bvssvni:pod, r=kvark
9c1e943 Implement `Pod` for tuples
d87d688 Add primitives with adjacency information
4720b2c Document the directory structure
76d65f0 [ll] Implement frame acquisition for swapchains
2b14450 Auto merge of gfx-rs#1150 - CensoredUsername:bad_driver_GL_ACTIVE_UNIFORM_MAX_LENGTH, r=kvark
5020cdc Work around nonconforming OpenGL drivers returning the length of a uniform name without the trailing null byte.
e0a965e [ll] Experimental unix surface support for vulkan
795465b [ll] Cleanup parts of the low level api
0840c9a [ll] Add queue families and work on swapchain implementation for vulkan
2b1c0d5 [ll] Implement device and command queue creation for vulkan Temporary fix ash dependency path
2ab5f31 [ll] Implement surface creation for vulkan backend
3cd6964 [ll] Implement surface and swapchain handling for d3d12
ec61901 [ll] Introduce surface abstraction and implement device creation on d3d12 * `Surface` abstracts the native window surface (highly coupled with swapchains) * Document some core structs and add TODOs for future work * Implement basic device and command queue creation for d3d12 backend
b5683f9 [ll] Vulkan: Implement `Instance` creation
6fd4f31 [ll] Start implementing d3d12 and vulkan backends * vulkan: Dummy implementations for basic structs * d3d12: Instance and physical device enumeration
6a19f59 [ll] Initial commit for low level core api First api design steps for the low level core Setup backend projects for dx12 and vulkan backend using the new ll api
fd468f6 Auto merge of gfx-rs#1147 - Bastacyclop:access, r=kvark
8cebd46 Auto merge of gfx-rs#1146 - kvark:index, r=kvark
d12d4f8 Use HashSet for AccessInfo
43d495f Tracking the index buffer mapping
ab4c921 Auto merge of gfx-rs#1143 - kvark:metal_create, r=kvark
844bc07 Auto merge of gfx-rs#1144 - burtonageo:master, r=kvark
9e72946 Make the CreateProgramError a proper struct which impls the `Error` trait
cbc9c8a [Metal] Better creation error tracking

git-subtree-dir: vendor/gfx
git-subtree-split: a404273
adamnemecek pushed a commit to adamnemecek/gfx that referenced this pull request Apr 1, 2021
1125: Fix UB in populating dynamic offsets r=kvark a=kvark

**Connections**
Related to the traces in gfx-rs#1123

**Description**
The dynamic offsets arrays appear to be empty in the traces, which makes no sense. We are populating them unconditionally, and I double-checked on the `shadow` example that the traces we record contain the offsets. So my only guess is that the reported traces are taken from a release build, and our code that does `slice::from_raw_parts()` on the offsets is UB because the pointer can be garbage (if length is 0). This is what the PR fixes.

**Testing**
Untested

Co-authored-by: Dzmitry Malyshau <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants