[WebGPU] drawIndirect and drawIndexedIndirect #28389
Comments
I've been implementing exactly that, and you may take an interest in #28103 in lieu of indirect draw, or in some of my GPU-driven experiments with meshlets and culling/visibility tests, particularly https://twitter.com/Cody_J_Bennett/status/1736555185541886407 and https://twitter.com/Cody_J_Bennett/status/1730911419707842973.
Hi @CodyJasonBennett, I have come across your GPU-culling code before, and if I'm understanding it correctly, it only discards the fragment shader by setting the w value to 0 in the vertex shader when something is not visible. This is my issue with current solutions, and the same applies to BatchedMesh. With drawIndirect we can do something like (each is a compute pass):
After all passes are done, drawIndirect can be used with the GPU data to render only the visible meshlets directly; there is no need to go over invisible ones, and no GPU->CPU transfer is needed. Keep in mind that rendering the meshlets is not the issue: I have implemented something akin to an InstancedMesh for meshlets that can draw many instances in one draw call. The visibility culling is the problem; in non-web environments this is easily solvable by using mesh shaders. Off-topic, but I'm a big fan of four, and I'm planning on using it as a base for what I'm trying to do above.
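For readers following along, here is a minimal sketch of that pipeline in raw WebGPU, not three.js code. The pipelines, bind groups, `meshletCount`, and `renderPassDescriptor` are assumed to exist; the point is only that the arguments are produced and consumed entirely on the GPU:

```ts
// Sketch only: assumes `device`, `cullPipeline`, `cullBindGroup`,
// `renderPipeline`, `renderBindGroup`, `meshletCount`, and
// `renderPassDescriptor` already exist.
// drawIndexedIndirect consumes 5 u32s per draw:
// [indexCount, instanceCount, firstIndex, baseVertex, firstInstance]
const indirectBuffer = device.createBuffer({
  size: 5 * 4,
  usage: GPUBufferUsage.INDIRECT | GPUBufferUsage.STORAGE,
});

const encoder = device.createCommandEncoder();

// Compute pass: the culling shader writes instanceCount (or the whole
// argument struct) directly into the indirect buffer.
const computePass = encoder.beginComputePass();
computePass.setPipeline(cullPipeline);
computePass.setBindGroup(0, cullBindGroup);
computePass.dispatchWorkgroups(Math.ceil(meshletCount / 64));
computePass.end();

// Render pass: the draw arguments are read on the GPU, no CPU round trip.
const renderPass = encoder.beginRenderPass(renderPassDescriptor);
renderPass.setPipeline(renderPipeline);
renderPass.setBindGroup(0, renderBindGroup);
renderPass.drawIndexedIndirect(indirectBuffer, 0);
renderPass.end();

device.queue.submit([encoder.finish()]);
```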
FWIW, those experiments were as close as I could get to GPU-driven rendering, which is comparatively trivial in WebGPU. The use of transform feedback performs culling at the instance or batch level, and the vertex shader then short-circuits if the resulting instance buffer is zeroed. This doesn't work so well for virtualized geometry, where you have a fixed number of vertices or a fixed vertex buffer memory budget; you want to move data, and that can only be done with compute or with carefully vectorized CPU code via WASM, plus GPU driver overhead from upload (which introduces latency). That may be the easiest path for you in the near term, since you're already using METIS and Meshoptimizer.

On the topic of indirect drawing in WebGPU, or more specifically the WebGL compatibility side, my knee-jerk thought is to fall back to multi-draw (hence the backlink to my PR), but the data is different: WebGPU expects an interleaved buffer with draw arguments, whereas multi-draw expects separate buffers per argument, and they have to be consumed on the CPU. It may be best to consider this feature WebGPU-only and consider compatibility only when proven feasible (cc @RenaudRohlinger, curious about your opinion here). This is one of the flagship features people migrate to WebGPU specifically for, alongside GPU shared memory, atomics, multisampled textures, etc. There is no WebGL 2 equivalent, even if it can be ported in a strictly worse fashion.
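To make the data-layout mismatch concrete, here is a rough sketch of the two sides (variable names like `counts` and `offsets` are illustrative, not three.js API):

```ts
// WebGPU: one packed, GPU-resident argument struct per draw.
// drawIndirect:        [vertexCount, instanceCount, firstVertex, firstInstance]
// drawIndexedIndirect: [indexCount, instanceCount, firstIndex, baseVertex, firstInstance]
const args = new Uint32Array([indexCount, 1, 0, 0, 0]);
device.queue.writeBuffer(indirectBuffer, 0, args); // or written by a compute pass

// WebGL 2 (WEBGL_multi_draw): separate CPU-side arrays per argument,
// consumed by the driver on the CPU.
const ext = gl.getExtension('WEBGL_multi_draw');
ext.multiDrawElementsWEBGL(
  gl.TRIANGLES,
  counts, 0,        // index count per draw
  gl.UNSIGNED_INT,
  offsets, 0,       // byte offset into the index buffer per draw
  drawCount,
);
```

Because everything WebGPU needs lives in one GPU buffer, the arguments can be produced by a compute pass with no readback, which is exactly what the multi-draw path cannot express.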
Interesting topic! drawIndirect is also part of my roadmap for achieving a new type of 'static' scene that I have in mind for three.js. I have started working on it here, with the support of the CAD software Plasticity. Here is how I envision my roadmap:
Each feature on this roadmap should be cherry-pickable into three.js once shipped.
I'm not sure how that relates to this issue. What's described here is indirect drawing, which would benefit from an interface with compute. That seems like a backlog of issues which affect
I won't hijack this issue any further, but I'm happy to explore this topic in WebGPU and later WebGL 2 (with expected latency). Today, I would lean towards WASM + multi-draw, which could use #28103. This is an area I've incidentally been studying, and it was the motivation behind all the prior work I linked. Awesome progress, and great to see I'm not alone on the web, Bevy aside.
I'm working with AIFanatic's repo. I brought it up to r168 and WebGPU and cleaned it up a lot. I do this very extensively with my ocean2 repo, which I upgraded to r167.1 three days ago, with an error fix so that you can see the wireframe without issues.
You should look into GPU-driven rendering and storage buffers (with mixed precision, since pure float data won't cut it). Mesh shaders can be emulated with compute shaders (with a performance penalty), but I'd think you also want software raster and a visibility buffer. VKGuide has a good intro, but this leans into very specialized engine territory, one you could build on top of an interface for indirect drawing and storage memory, which is what this issue describes. The rest is a separate exercise, probably a bit too involved for a PoC.
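As a rough illustration of the storage-buffer side, here is a WGSL sketch embedded in a TypeScript string. The `Meshlet` layout and the packing are hypothetical; the point is mixing full-precision fields with packed ones instead of storing everything as floats:

```ts
// Hypothetical meshlet layout: f32 where range matters, packed u32 elsewhere.
const cullShader = /* wgsl */ `
  struct Meshlet {
    center : vec3<f32>,      // bounding sphere center, full precision
    radius : f32,
    coneAxisAndCutoff : u32, // 4 x snorm8, unpacked with unpack4x8snorm()
  };

  @group(0) @binding(0) var<storage, read> meshlets : array<Meshlet>;
  @group(0) @binding(1) var<storage, read_write> drawArgs : array<atomic<u32>>;

  @compute @workgroup_size(64)
  fn main(@builtin(global_invocation_id) id : vec3<u32>) {
    if (id.x >= arrayLength(&meshlets)) { return; }
    // ...frustum/cone test here, then atomically bump the instanceCount
    // slot of the indirect arguments, e.g. drawArgs[1].
  }
`;
```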
If I was wrong, I hereby apologize. I'm sorry.
I'm being bad and overloading the topic. These are just some resources for you, since very few people are interested in this area, especially within the constraints of either three.js or WebGPU. It would be great if this could be expressed with three, but there's a lot to it, paired with web/WebGPU limitations.
As @CodyJasonBennett mentioned, to make this work at top performance it would need multi-draw indirect calls, which would have to be implemented at the WebGPU level; currently only single draw-indirect calls are supported. The issue with draw-indirect-only calls is that the number of indices and vertices is fixed (think of instances), so meshes need to be split into chunks with the same number of triangles. This works well for meshes that have a lot of triangles, but for a simple cube it would actually call the vertex shader multiple times unnecessarily.

Emulating this behavior with WebGL can be done, but the benefits are little to none, since it always requires a GPU->CPU->GPU roundtrip. (Think of culling: if there are 100 meshes in the scene but 90 of them are occluded, how can they be "filtered" out and not rendered? It can be done in the vertex shader, but the benefits are not great; I have benchmarked this.) Another approach I have considered is a WebGPU->WebGL2 communication layer. It can be done through canvases without the CPU roundtrip, but it's kind of a pain because of textures etc.

Regardless, with drawIndirect a lot can be done on the GPU, such as culling, dynamic LOD, etc. I have started a fresh project based on this approach at Trident-WEBGPU.
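To make the fixed-count limitation concrete, here is a sketch of the usual workaround (the `MESHLET_INDEX_COUNT` constant is hypothetical): every meshlet is padded to the same index count with degenerate triangles, so a single indirect draw can treat meshlets as instances:

```ts
// Sketch: pad every meshlet to MESHLET_INDEX_COUNT indices (hypothetical
// constant) so one indexed indirect draw fans out over meshlet "instances".
const MESHLET_INDEX_COUNT = 128 * 3; // e.g. up to 128 triangles per meshlet

function padMeshletIndices(indices: Uint32Array): Uint32Array {
  const padded = new Uint32Array(MESHLET_INDEX_COUNT);
  padded.set(indices);
  // Repeat the last vertex to form degenerate (zero-area) triangles.
  // The rasterizer discards them, but the vertex shader still runs on
  // them — the waste described above for small meshes like a cube.
  padded.fill(indices[indices.length - 1], indices.length);
  return padded;
}
```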
I thought more about the topic after I looked more closely at AIFanatic's code and also looked more into drawIndirect/drawIndexedIndirect. For this topic, I'm thinking about a drawNode.
I need to think about this more. @sunag and @RenaudRohlinger, what do you think about the idea of the drawNode?
I'm also interested in implementing the indirectDraw API in the WebGPURenderer. First I will try it at the Renderer level.
Description
Hi, are there any plans to support drawIndirect and drawIndexedIndirect? I have searched the issues and the code base and could not find any references to either.
Solution
Not entirely sure what the best approach would be here, but maybe provide a renderer.renderIndirect method that allows an array/buffer reference to be passed? I guess the WebGL backend would have to fall back to drawElements or an equivalent.
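Purely as an illustration of that shape (none of this exists in three.js today; `renderIndirect` is the suggestion itself):

```ts
// Hypothetical usage of the suggested API — renderIndirect is not real.
// The indirect buffer is owned by the app, e.g. filled by a compute pass.
const indirectBuffer = device.createBuffer({
  size: 5 * 4, // [indexCount, instanceCount, firstIndex, baseVertex, firstInstance]
  usage: GPUBufferUsage.INDIRECT | GPUBufferUsage.STORAGE,
});

// ...a compute pass fills the arguments on the GPU...

renderer.renderIndirect(scene, camera, indirectBuffer, 0 /* byte offset */);

// A WebGL backend has no GPU path for this: it would need a CPU mirror of
// the arguments (or a readback) before calling drawElements / multi-draw.
```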
Alternatives
I have implemented a Nanite equivalent in three.js, and the bottleneck now is the LOD test, since it's done on the CPU. The algorithm is perfect for the GPU, and almost everything could be implemented in WebGL, but it would always require a GPU->CPU->GPU roundtrip to read how many meshlets should be displayed in order to properly call drawElements.
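For what it's worth, that LOD test maps naturally onto a compute pass that compacts visible meshlet IDs on the GPU. A hedged sketch follows; the error metric, threshold, and struct layout are assumptions, loosely following the Nanite-style parent/child error cut rather than any code from this thread:

```ts
// Hypothetical per-meshlet LOD selection (WGSL in a TypeScript string).
const lodShader = /* wgsl */ `
  struct Meshlet {
    selfError   : f32, // screen-space error of this meshlet's LOD
    parentError : f32, // error of its coarser parent
  };

  @group(0) @binding(0) var<storage, read> meshlets : array<Meshlet>;
  @group(0) @binding(1) var<storage, read_write> visibleIds : array<u32>;
  @group(0) @binding(2) var<storage, read_write> visibleCount : atomic<u32>;

  @compute @workgroup_size(64)
  fn main(@builtin(global_invocation_id) id : vec3<u32>) {
    if (id.x >= arrayLength(&meshlets)) { return; }
    let m = meshlets[id.x];
    // Draw a meshlet iff its own error is acceptable but its parent's
    // is not — this selects a single consistent cut through the DAG.
    if (m.selfError <= 1.0 && m.parentError > 1.0) {
      let slot = atomicAdd(&visibleCount, 1u);
      visibleIds[slot] = id.x;
    }
  }
`;
```

`visibleCount` can then feed the instanceCount slot of an indirect argument buffer, so drawElements-style readbacks disappear entirely.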
Additional context
No response