Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace iterators with slices (Non command buffer) #1665

Merged
merged 1 commit into from
Feb 4, 2018

Conversation

Skepfyr
Copy link
Contributor

@Skepfyr Skepfyr commented Dec 6, 2017

A more debatable half of #1604, which APIs use slices and which should use iterators is more up for consideration. My implementation for the Device iterators frequently just collects them anyway which to me suggests there isn't much point (apart from some idea of API uniformity).


This change is Reviewable

Copy link
Member

@kvark kvark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for amazing work!

Those type signatures became really scary... And if for command buffer we had to go the zero overhead route, it's not as important here, since device functions are (or can be) slow by nature of HW, and the backends aren't currently trying to avoid allocations either (even though, they could in the future).

I'm not sure if we should make this call.
Pros:

  • less overhead for device functions
    Cons:
  • scary API, documentation

&self,
_: &[pso::GraphicsPipelineDesc<'a, Backend>],
) -> Vec<Result<(), pso::CreationError>> {
fn create_graphics_pipeline<'a, T>(&self, _: T) -> Result<(), pso::CreationError>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't we just have &pso::GraphicsPipelineDesc<'a, Backend> here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the generic is completely unnecessary and doesn't provide any benefit.

@@ -328,7 +350,8 @@ impl d::Device<B> for Device {
gl.BindFramebuffer(target, 0);
}
if let Err(err) = self.share.check() {
panic!("Error creating FBO: {:?} for {:?} with attachments {:?}", err, pass, attachments);
//TODO: attachments has been consumed
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be "have been"

@Skepfyr
Copy link
Contributor Author

Skepfyr commented Dec 24, 2017

I agree that the documentation would be a bit scary, but I'm not sure about API though; another pro is that this change means that the functions accept anything that looks vaguely like an iterator, which should simplify user's code.
However, I can't comment on how useful it actually is and it definitely does add some complexity.

let mut color_attachments = Vec::with_capacity(descs.len());
let mut info_specializations = Vec::with_capacity(descs.len() * NUM_STAGES);
let mut specialization_data = Vec::with_capacity(descs.len() * NUM_STAGES);
let mut info_stages = Vec::with_capacity(desc_len_guess);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before I forget it over the holidays.. it crucial in the current implementation that these vectors don't reallocate memory because we have pointers into the vector elements floating around.

@Rhuagh
Copy link
Contributor

Rhuagh commented Jan 26, 2018

This needs some serious documentation to make it obvious what types are accepted imho.

@Skepfyr Skepfyr force-pushed the iterators-opt branch 5 times, most recently from 548506c to 69fc536 Compare January 31, 2018 14:26
@Skepfyr Skepfyr changed the title [WIP] Replace iterators with slices (Non command buffer) Replace iterators with slices (Non command buffer) Jan 31, 2018
@kvark
Copy link
Member

kvark commented Jan 31, 2018

Amazing work! I definitely appreciate the bidirectional implementations and iterators for methods accepting a single list. The create_renderpass and create_pipeline_layout are arguable.


Reviewed 27 of 27 files at r1.
Review status: all files reviewed at latest revision, 12 unresolved discussions.


examples/hal/compute/main.rs, line 145 at r1 (raw file):

    }));
    queue_group.queues[0].submit(submission, Some(&fence));
    device.wait_for_fence(&fence, !0);

really nice to have this shortcut


examples/hal/quad/main.rs, line 164 at r1 (raw file):

    let pipeline_layout = device.create_pipeline_layout(
        Some(&set_layout),

nit: could probably also do [set_layout]


src/backend/dx12/src/device.rs, line 444 at r1 (raw file):

    ) where
        F: FnMut(&pso::DescriptorWrite<B, R>, &mut Vec<d3d12::D3D12_CPU_DESCRIPTOR_HANDLE>),
        I: Iterator<Item=&'a pso::DescriptorSetWrite<'a, 'a, B, R>>,

should this be I: IntoIterator and I::Item: Borrow<pso::DescriptorSetWrite> or something?


src/backend/dx12/src/device.rs, line 787 at r1 (raw file):

    }

    fn create_render_pass<'a, IA, IS, ID>(

maybe we could apply some fair judgement and not try to optimize at least this method? no one is creating render passes on the fly (I hope)


src/backend/dx12/src/device.rs, line 814 at r1 (raw file):

        }

        let attachments = attachments.into_iter()

FTR, doesn't look like generic iterators are making any good for this specific implementation


src/backend/dx12/src/device.rs, line 939 at r1 (raw file):

    }

    fn create_pipeline_layout<IS, IR>(

similarly, I don't expect this to be called often


src/backend/vulkan/Cargo.toml, line 25 at r1 (raw file):

lazy_static = "0.2"
shared_library = "0.1"
ash = "=0.20.2"

this can be reverted now


src/hal/src/device.rs, line 203 at r1 (raw file):

    /// Create graphics pipelines.
    fn create_graphics_pipelines<'a, I>(

strange idea: maybe we can omit this from gfx-backend-empty now that we have default implementations? ;)


src/hal/src/device.rs, line 494 at r1 (raw file):

            WaitFor::All => {
                for fence in fences {
                    let elapsed_ms = as_ms(start.elapsed());

I'm a little concerned that his would prevent the expected behavior when the user passes in timeout of 0. We should still be checking all the fences, but this implementation may bail out in the process


src/hal/src/device.rs, line 505 at r1 (raw file):

            },
            WaitFor::Any => {
                let fences: Vec<_> = fences.into_iter().collect();

why do we need to collect here?


src/render/src/device.rs, line 312 at r1 (raw file):

    #[doc(hidden)]
    pub fn create_render_pass_raw<'a, IA, IS, ID>(

don't bother updating the renderer with iterators


src/render/src/pso.rs, line 179 at r1 (raw file):

    }

    pub fn write<'b, T: Bind<B>, I>(

similarly, here. Current consensus is that render is going away


Comments from Reviewable

Copy link
Contributor

@msiglreith msiglreith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! Requires a few changes as requested by kvark otherwise ready to go from my side 👍

if as_ms(start.elapsed()) >= timeout_ms {
return false;
}
thread::sleep(time::Duration::from_millis(1));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need issue a thread switch here, drivers will most likely do it internally anyways.

@msiglreith
Copy link
Contributor

@kvark I'm in favor of keeping the iterator interface everywhere.

  1. Unified API interface
  2. ptr -> collecting in portability -> reating a slice -> recollecting that slice in gfx-backendback into a ptr doesn't feel very nice. We can at least reduce the bottleneck here.
  3. Once we have const generics (in some not so near future) it would be a nice alternative (syntax and logic) to use fixed-size arrays for passing parameters.

@kvark
Copy link
Member

kvark commented Jan 31, 2018

Ok, I'm fine with going all-in for iterators. Let's rebase this, address my concerns, and proceed.

@Skepfyr
Copy link
Contributor Author

Skepfyr commented Feb 1, 2018

Review status: 16 of 25 files reviewed at latest revision, 13 unresolved discussions.


examples/hal/quad/main.rs, line 164 at r1 (raw file):

Previously, kvark (Dzmitry Malyshau) wrote…

nit: could probably also do [set_layout]

Sadly no, set_layout is used later and [set_layout] moves it.


src/hal/src/device.rs, line 203 at r1 (raw file):

Previously, kvark (Dzmitry Malyshau) wrote…

strange idea: maybe we can omit this from gfx-backend-empty now that we have default implementations? ;)

I'm not sure, not implementing either will probably do something funky at compile time, but unimplemented! panics at runtime. If you want I can but I feel like there is no harm leaving one there.


src/hal/src/device.rs, line 494 at r1 (raw file):

Previously, kvark (Dzmitry Malyshau) wrote…

I'm a little concerned that his would prevent the expected behavior when the user passes in timeout of 0. We should still be checking all the fences, but this implementation may bail out in the process

I think this should work as expected now.


src/hal/src/device.rs, line 505 at r1 (raw file):

Previously, kvark (Dzmitry Malyshau) wrote…

why do we need to collect here?

fences needs to be iterated over multiple times, the only other option would be to find a way of making the iterator Clone-able.


src/hal/src/device.rs, line 515 at r1 (raw file):

Previously, msiglreith wrote…

I don't think we need issue a thread switch here, drivers will most likely do it internally anyways.

Could this not be a problem for the Mutex simulated fences in Metal, it will now just spin?


Comments from Reviewable

@msiglreith
Copy link
Contributor

Review status: 16 of 25 files reviewed at latest revision, 13 unresolved discussions.


src/hal/src/device.rs, line 515 at r1 (raw file):

Previously, AIOOB (AIOOB) wrote…

Could this not be a problem for the Mutex simulated fences in Metal, it will now just spin?

wait_on_fence handling is a bit out of scope atm, but we should rather use native primitives for synchronization (in particular semaphores here) than trying to create our own with locks, loops, etc. IMO


Comments from Reviewable

@kvark
Copy link
Member

kvark commented Feb 1, 2018

Review status: 16 of 25 files reviewed at latest revision, 12 unresolved discussions.


src/hal/src/device.rs, line 203 at r1 (raw file):

Previously, AIOOB (AIOOB) wrote…

I'm not sure, not implementing either will probably do something funky at compile time, but unimplemented! panics at runtime. If you want I can but I feel like there is no harm leaving one there.

It's easy to try out, right? Just go to the empty backend, comment out a few methods, and see if it builds. I expect it to compile fine but go to the stack overflow when run. And this would be fine to me, since I don't think there is any value in guaranteeing the empty backend to instantly panic: it's only there to compile-test against.


src/hal/src/device.rs, line 494 at r1 (raw file):

Previously, AIOOB (AIOOB) wrote…

I think this should work as expected now.

Well, technically, no. The elapsed() here can already be non-zero.


src/hal/src/device.rs, line 505 at r1 (raw file):

Previously, AIOOB (AIOOB) wrote…

fences needs to be iterated over multiple times, the only other option would be to find a way of making the iterator Clone-able.

Oh I see now.


src/hal/src/device.rs, line 515 at r1 (raw file):

Previously, msiglreith wrote…

wait_on_fence handling is a bit out of scope atm, but we should rather use native primitives for synchronization (in particular semaphores here) than trying to create our own with locks, loops, etc. IMO

In general, either there is a simple working default implementation, or just defer this decision to the backends. In this case, there appear to be enough concerns to consider this a non-trivial implementaiton.


Comments from Reviewable

@Skepfyr
Copy link
Contributor Author

Skepfyr commented Feb 4, 2018

I think that its most things addressed. The only things that I can see that could be added are single variant versions of update_descriptor_sets, bind_graphics_descriptor_sets and bind_compute_descriptor_sets.

@kvark
Copy link
Member

kvark commented Feb 4, 2018

@aioob alright, shipit!
bors r+

bors bot added a commit that referenced this pull request Feb 4, 2018
1665: Replace iterators with slices (Non command buffer) r=kvark a=AIOOB

A more debatable half of #1604, which APIs use slices and which should use iterators is more up for consideration. My implementation for the Device iterators frequently just collects them anyway which to me suggests there isn't much point (apart from some idea of API uniformity).

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/gfx-rs/gfx/1665)
<!-- Reviewable:end -->
@kvark
Copy link
Member

kvark commented Feb 4, 2018

There appears to be general agreement on the topic. We can nail down the details (descriptor sets) in follow ups, if needed.

@bors
Copy link
Contributor

bors bot commented Feb 4, 2018

@bors bors bot merged commit 829843e into gfx-rs:master Feb 4, 2018
@Skepfyr Skepfyr deleted the iterators-opt branch February 4, 2018 13:55
@Skepfyr Skepfyr mentioned this pull request Apr 16, 2018
63 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants