-
Notifications
You must be signed in to change notification settings - Fork 543
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ll][WIP] Gpu, Device and handles for render #1467
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was quick! Looking forward to future progress!
A few general points early on.
src/render/src/handle.rs
Outdated
macro_rules! resource_handle { | ||
($handle:ident, $guard:ident, $resource:ident) => { | ||
#[derive(Clone, Debug)] | ||
struct $guard<B: Backend>(Option<B::$resource>, GarbageSender<B>); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you mind explaining why we need Option<_>
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need ownership on Drop
to send the resource on the garbage channel for future cleanup.
src/render/src/lib.rs
Outdated
|
||
pub struct Gpu<B: Backend> { | ||
device: Device<B>, | ||
queue: CommandQueue<B, core::General>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be General
or Graphics
. OpenGL backend will only expose graphics queues if there is no compute support,
Else I think every backend exposes general but need to double-check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would a third Compute
-only alternative be useful for some use cases ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say barely
.iter() | ||
.map(|&(ref family, qtype)| (family, qtype, family.num_queues()) ) | ||
.collect::<Vec<_>>(); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Somehow we need to filter for queues which can be used for presentation with the display surface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might be easier with #1468
src/render/src/lib.rs
Outdated
} | ||
} | ||
|
||
impl<B: Backend> ops::Deref for Gpu<B> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might rather expose explicit get_device
/ref_device
/..(?) functions than Deref
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will become a bit more verbose for users that are not actually using the Device
split, but I agree that the use of Deref
might be a bit abusive here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice start!
src/render/src/device.rs
Outdated
|
||
#[derive(Clone)] | ||
pub struct Device<B: Backend> { | ||
pub(crate) device: B::Device, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
convention would probably be to name it raw
src/render/src/device.rs
Outdated
pub struct Device<B: Backend> { | ||
pub(crate) device: B::Device, | ||
// TODO: could be shared instead of cloned | ||
pub(crate) heap_types: Vec<HeapType>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do all of them need to be pub(crate)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was mainly to avoid writing a redundant Device::new
but I can remove them.
src/render/src/device.rs
Outdated
|
||
impl<B: Backend, F: Device<B>> DeviceExt<B> for F {} | ||
*/ | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: newline
src/render/src/handle.rs
Outdated
} | ||
|
||
#[derive(Clone, Debug)] // TODO: PartialEq, Eq, Hash | ||
pub struct $handle<B: Backend>(Arc<$guard<B>>); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we might need more stuff carried out by the handle, e.g. stuff like BufferInfo
, ProgramInfo
, etc
src/render/src/handle.rs
Outdated
resource_handle! { ConstantBufferView, ConstantBufferViewGuard, ConstantBufferView } | ||
resource_handle! { ShaderResourceView, ShaderResourceViewGuard, ShaderResourceView } | ||
resource_handle! { UnorderedAccessView, UnorderedAccessViewGuard, UnorderedAccessView } | ||
resource_handle! { Sampler, SamplerGuard, Sampler } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please configure your editor to add new lines automatically
src/render/src/handle.rs
Outdated
} | ||
} | ||
|
||
resource_handle! { ShaderLib, ShaderLibGuard, ShaderLib } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The macro parameters should be different, i.e. {Image, ImageInfo}
.
The 3rd parameter matches the 1st, so no need to provide it.
The 2nd parameter could be avoided if we have mod guard { //all the guards with the same name as their resources
, so that guard::Image
is the one being a part of Image
handle. This may require specifying all the handle types in a single macro invocation, so that you can open and close the guard
mod once.
src/render/src/lib.rs
Outdated
for garbage in self.garbage.try_iter() { | ||
use handle::Garbage::*; | ||
match garbage { | ||
ShaderLib(sl) => dev.destroy_shader_lib(sl), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We lack documentation for the core
api so you probably aren't aware of it but resources need to externally synchronized (in vulkan terms) on destruction. This means that they (or derivates like image views) must not be in flight anymore/processed by the GPU on destruction. This requires fencing to create a sync point between CPU-GPU. To hide costs and reduce blocking by fences the usual approach is to have basically a ringbuffer of 'frames' (e.g size of 3). The idea then would be to re-use existing sync points caused by fences from these frames and cleanup at frame boundaries with a bit of delay: e.g queue up for destruction at frame 0, waiting until beginning of the next frame 0 to destroy. This gives the GPU 2 frames time to catch up and signal frame 0 fence.
The concept extends to command buffers/pools as well. In our quad
example we wait until the GPU idles after every frame which is slow but currently more a testing ground than best practice.
materials:
- http://developer.download.nvidia.com/gameworks/events/GDC2016/Vulkan_Essentials_GDC16_tlorach.pdf (slide 25, graphical visualization)
- https://www.khronos.org/assets/uploads/developers/library/2016-siggraph/3D-BOF-SIGGRAPH_Jul16.pdf (slide 54, example approach from UE4)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation. I think holding resources handles until synchronisation has been done would be a good way to ensure that cleanup is not done early.
src/render/src/handle.rs
Outdated
macro_rules! define_resources { | ||
($($name:ident: $info:path,)*) => { | ||
pub enum Garbage<B: Backend> { | ||
$( $name(B::$name), )* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice!
$( $name(B::$name), )* | ||
} | ||
|
||
pub mod inner { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does it have to be pub
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since I'm using a type alias for the handle and not a wrapper, I think so. It also allows to call resource()
and info()
without any boilerplate on our side. It's the equivalent of the old buffer::Raw
for example.
|
||
pub mod raw { | ||
use std::sync::Arc; | ||
$( pub type $name<B> = Arc<super::inner::$name<B>>; )* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
heh, cool
.iter() | ||
.map(|&(ref family, qtype)| (family, qtype, family.num_queues()) ) | ||
.collect::<Vec<_>>(); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might be easier with #1468
heap_types, | ||
memory_heaps, | ||
.. | ||
} = adapter.open(&queue_descs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: there is open_with
, which is a bit shorter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great start, thank you!
Working on #1466, starting with a fresh
cleanup()
implementation.I commented out a lot of code to be able to compile it.