Considering rendy and wgpu for use as nannou's graphics backend #374

Closed
5 tasks done
freesig opened this issue Aug 2, 2019 · 9 comments · Fixed by #452
@freesig (Collaborator) commented Aug 2, 2019

Goal

Find out the current state of Rendy and whether it's ready for use as nannou's graphics backend. If not, establish how much work remains.

Todo

  • Run some examples and see how hard this is to do.
  • Make a list of the requirements we currently meet using vulkano and see whether we can match them in rendy (nannou's vk examples serve as a good list).
  • Try porting some of the vulkan examples to rendy. This should help us get an idea of what is missing.
  • Talk to the people at Amethyst and establish some relationships so we can work together on implementing any remaining features.
  • Decide whether to go ahead with this project based on how much work is left compared to vulkano.
@mitchmindtree (Member)

Someone inquiring about rendy learning resources on reddit.

@FuriouZz

What about wgpu-rs? Too high-level?

@mitchmindtree (Member)

We haven't taken a close look at wgpu just yet, though we are aware of it! A couple of the reasons I'm interested in investigating rendy first:

  1. Rendy's render graph is a high-level abstraction we're very interested in, to the point where we have discussed implementing something similar ourselves in the past (see Render Graphs - Working with Render Passes, Images, Framebuffers and Graphics Pipelines #243).
  2. Web isn't the main priority for rendy, but it is still supported. We want to avoid getting too invested in a web-first platform, as many of nannou's applications involve native installations with multiple windows across multiple displays (e.g. projection mapping with quad warping, triple-screen setups, etc). It may very well be the case that this is possible with wgpu, but we haven't looked closely enough to be sure yet. At least rendy seems generalised enough that it would be feasible for us to address support for this there, whereas it may be more difficult to motivate that kind of support in a render engine designed for the web first.

That said, there's a good chance both of these are also addressed by wgpu! If that's the case I'd love to hear about it - I just haven't personally had the time to investigate just yet.

@kvark commented Sep 19, 2019

I'd be interested to talk about what your requirements are. For a project focused on "creative coding", you seem to be selecting a surprisingly challenging path for the graphics stack (Vulkano, Rendy's render graph, etc). I would assume wgpu-rs to be a good fit for your needs.

Perhaps the name is confusing (the "w" part), but the project is native-first; as it stands now, Web is not even a target (yet). Rendering to multiple windows/swapchains is definitely in scope and should already work.

@mitchmindtree (Member)

Thanks for your interest @kvark!

For a project focused on "creative coding", you seem to be selecting a surprisingly challenging path for the graphics stack (Vulkano, Rendy's render graph, etc)

Hopefully I clarified how we ended up with vulkano in #408!

W.r.t. creative coding and graphics: I guess creative coding is often associated with much higher-level tools like Processing and P5.js, though these kinds of tools only really reflect the higher-level "sketching" side of creative coding, and neither lends itself particularly well to large-scale, graphics-intensive installations or exhibitions, which are often left to creative coding frameworks with lower-level access like OpenFrameworks or Cinder. It's not uncommon for interactive AV installations, projection mapping features, etc to run incredibly GPU-intensive custom graphics pipelines, often spread across large numbers of high-resolution displays (e.g. projectors, screens, LED tiling) on limited hardware. The more control over the hardware we have, the more creative options we have, so to speak. Nannou is aiming for the best of both worlds: to provide high-level, easy-to-digest tools and examples that make sketching and learning fun, while also providing low-level access for those involved in large-scale, performance-intensive installations and a path forward for those looking to take their sketches to the next level.

W.r.t. rendy and its render graph: We have frequently found ourselves running into an issue with vulkano where we want to re-structure our graphics pipeline to add a new renderpass or subpass in order to apply some new effect, or perhaps another output image so that we can record a sequence to file, but both of these require changing a significant amount of code each time, e.g. piping through the correct colour/depth formats, moving the MSAA resolve further down the pipeline, re-specifying shader inputs, updating the previous and following passes if necessary, etc. Things get even trickier when enabling/disabling certain renderpasses at runtime. While vulkano does offer a render pass description abstraction, it only offers a compile-time-checked implementation and not a run-time-checked one, meaning we need to add the implementation ourselves (something that is still a work in progress) or alternatively declare a large number of renderpasses that are almost-but-not-quite identical and switch between them. All of this led us to the desire for some sort of higher-level graph abstraction that would automatically "compile down" into an optimal set of render passes and sub-passes at runtime. You can find some discussion around determining a solution for this at #243, before we came across rendy and its seemingly similar set of goals. We're not yet sure whether rendy ticks all of these boxes, and this is part of why we wish to dig deeper. There's a good chance the actual render graph implemented by rendy has a totally different purpose than what I'm imagining!
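For illustration, the kind of graph abstraction described above can be sketched in plain Rust. All names here (`Pass`, `schedule`) are hypothetical and this is not rendy's API: the idea is simply that each pass declares only the resources it reads and writes, and a topological sort derives the execution order, so adding or removing a pass no longer means re-plumbing the whole pipeline by hand.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

// Hypothetical pass description: only declared reads/writes, as a
// render graph would see them. Not rendy's actual types.
struct Pass {
    name: &'static str,
    reads: Vec<&'static str>,
    writes: Vec<&'static str>,
}

// Derive an execution order: a pass must run after every pass that
// writes a resource it reads (Kahn's topological sort). Returns None
// if the declared dependencies form a cycle.
fn schedule(passes: &[Pass]) -> Option<Vec<&'static str>> {
    // Map each resource name to the index of the pass producing it.
    let mut producer: HashMap<&str, usize> = HashMap::new();
    for (i, p) in passes.iter().enumerate() {
        for &w in &p.writes {
            producer.insert(w, i);
        }
    }
    // Build dependency edges (producer -> consumer) and in-degrees.
    let mut edges: Vec<HashSet<usize>> = vec![HashSet::new(); passes.len()];
    let mut indegree = vec![0usize; passes.len()];
    for (i, p) in passes.iter().enumerate() {
        for &r in &p.reads {
            if let Some(&src) = producer.get(r) {
                if edges[src].insert(i) {
                    indegree[i] += 1;
                }
            }
        }
    }
    // Repeatedly emit passes whose dependencies are all satisfied.
    let mut queue: VecDeque<usize> =
        (0..passes.len()).filter(|&i| indegree[i] == 0).collect();
    let mut order = Vec::new();
    while let Some(i) = queue.pop_front() {
        order.push(passes[i].name);
        for &next in &edges[i] {
            indegree[next] -= 1;
            if indegree[next] == 0 {
                queue.push_back(next);
            }
        }
    }
    (order.len() == passes.len()).then(|| order)
}

fn main() {
    // Adding a new effect pass only means declaring its inputs and
    // outputs; the ordering is recomputed, not hand-maintained.
    let passes = [
        Pass { name: "compose", reads: vec!["scene", "bloom"], writes: vec!["swapchain"] },
        Pass { name: "geometry", reads: vec![], writes: vec!["scene"] },
        Pass { name: "bloom", reads: vec!["scene"], writes: vec!["bloom"] },
    ];
    println!("{:?}", schedule(&passes).unwrap()); // ["geometry", "bloom", "compose"]
}
```

A real render graph would of course also compile the ordered passes down into render passes, sub-passes, and synchronisation, but the declarative declare-inputs-and-outputs shape is the part that removes the re-plumbing burden.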


Rendering to multiple windows/swapchains is definitely in scope and should already work.

That's great to hear! Being able to kick off multi-window setups easily is an important and common case for us; it's nice to know that this hasn't been abstracted away (e.g. in favour of a smoother single-window experience).

Perhaps the name is confusing (the "w" part), but the project is native-first, at least as it stands now that Web is not even a target (yet).

Yes, this is especially surprising considering the name :) The name and what it implies has admittedly been one of the biggest factors making me personally wary of wgpu. For example, I would be wary of switching graphics backends to wgpu, only to one day run into an issue where the API became constrained to browser capabilities. I appreciate you clarifying this though! And reading through this blog post also provides some useful context. I'd be surprised if I am the only one initially confused by the name.

Some Questions

This leads me to wonder a little more on the differences between rendy and wgpu.

  • On the surface-level, it looks like rendy is aimed at being a whole suite of tools that make it easier to use gfx-hal, though I guess in a way wgpu could be described similarly?
  • I get the impression that wgpu is being pushed as a higher-level abstraction. In what ways is wgpu higher-level than rendy? What kinds of low-level choices have been abstracted away?
  • I noticed that rendy mentions it is used within the wgpu project - how is it used? What parts of rendy are exposed if any?
  • I wonder what the motivation for amethyst is in using rendy to roll their own graphics backend rather than using wgpu? What are the features missing from wgpu, and will we run into those in nannou? Or perhaps this is due to amethyst and rendy pre-dating the wgpu project, and maybe they would have considered wgpu had the work begun today? cc @erlend-sh would you mind sharing some thoughts on this? Also feel free to link me elsewhere if either of you have been through this discussion already!

At a glance, the wgpu examples do look really nice! I also love the idea of using a high-level API in order to save ourselves time and move more graphics-y responsibility upstream. You certainly have me considering it more closely - I'd love to hear your thoughts on some of these above questions if you get a chance :)

@mitchmindtree mitchmindtree changed the title Establish what the state of Rendy is Considering rendy and wgpu for use as nannou's graphics backend Sep 20, 2019
@kvark commented Sep 20, 2019

Thank you for the detailed response! I think this discussion will be incomplete without @omni-viral. I want to start by linking to a few reddit comments, in an attempt to answer the "Some Questions" section:

Quote from the first linked comment:

  1. individual API libraries like ash, d3d12-rs, metal-rs - for the lowest hard-core level available to Rust
  2. gfx-hal for lowest portable level
  3. Rendy for helping to solve the rough corners of gfx-hal
  4. wgpu-rs for the lowest safe level
  5. Engines like ggez or Amethyst for the highest level

W.r.t. creative coding and graphics

I was indeed thinking of a higher-level/sketching approach to creative coding. Running GPU-intensive tasks may or may not require advanced API features or the explicit control that low-level GPU APIs provide. For example, rendering on "large numbers of high resolution displays" just means that there are a lot of pixels to process: the GPU will most likely be the bottleneck even if you use OpenGL 1.0 to drive it.

In contrast, the low-level API helps coordinate multiple different workloads, e.g. a game like GTA needs to:

  • render tens of thousands of different objects in the world, using different shaders, inputs, geometries, etc
  • compute physics and AI at the same time
  • stream in resources constantly

If I understand correctly, this is different from "creative coding" needs, which involve fewer pipelines, less streaming, and generally rely more on raw GPU power.

W.r.t. rendy and its render graph:

I think the ability to create an optimal render graph is often overestimated. It mostly helps the case, again, where there is a lot of heterogeneous work thrown at the GPU (i.e. the GTA case); it helps synchronize that work carefully. In a typical graphics app, there are a few render passes, and most of the work happens within them, where there can be no barriers by definition: synchronization is only needed at the pass boundaries, and multiple queues don't magically give you more compute units to work on.

Sub-passes, for instance, shouldn't be relevant to your work for the most part. They currently have no effect on desktop GPUs, so trying to build them optimally is just wasting your time (assuming you run on desktop GPUs?).

Btw, does Rendy allow you to reconfigure the graph at run-time? That seems like a great feat. (cc @omni-viral)

@mitchmindtree (Member)

Thanks a lot @kvark! Those links help to clarify where rendy's role ends, where wgpu's begins, and vice versa. Your last link in particular is quite useful in listing gfx-hal's capabilities compared to wgpu:

  • render sub-passes
  • fine-grained capabilities of texture/buffer formats, discoverable at init time
  • geometry and tessellation shaders
  • multiple queues
  • adapter selection and fine-grained limits
  • texel buffer views and combined image+samplers
  • mutating the same resource in different ways within the same pass
  • clears and blits

Is this selection of features intentionally omitted, and is the plan for them to remain omitted? Or is it more a case of "we haven't come up with a safe abstraction for providing access to these just yet; it's possible we will provide support for them if we or another contributor can come up with a safe and practical API in the future"? I realise the answer likely differs between features, but any further thoughts or details you can provide or link to on this would be useful! It would be useful for us to know where the boundaries lie in terms of what kind of contributions would be accepted should a need arise.

If I understand correctly, this is different from "creative coding" needs, which has less pipelines, less streaming, and is generally more relying on raw GPU power.

Of course, our requirements are a very long shot from what GTA needs! That said, now that you mention those requirements I'd like to clarify a couple of wgpu capabilities:

  • Our company's last commercial interactive installation involved a couple of compute pipelines that would run asynchronously with the graphics pipeline. The role of these compute pipelines was 1. to filter/denoise depth images from a camera so that they would be ready for use in the following frame and 2. to apply some flow-field physics to a 50k+ particle system, both while the current frame was processing so that their data would be ready for rendering the next. Is it possible to do this asynchronous compute/graphics with wgpu? In your list of wgpu limitations you implied wgpu only provides access to a single queue, but I'd imagine there would be at least one queue for graphics and one for compute?
  • In the same project, we could barely fit a fraction of the textures involved in GPU memory at once, so we used a single large slice sampled via a texture2darray (the textures were conveniently of uniform dimensions) for the currently visible animated particle textures and had a system for dynamically loading what was necessary between disk<->ram and ram<->gpu as required. Could we expect to run into limitations with swapping out large batches of textures like this each frame using wgpu, or is this perhaps a pretty common usecase? I'd imagine even basic games must deal with this frequently.
  • Is it possible to take advantage of multiple physical devices at once with wgpu?
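The texture2darray streaming scheme described in the second point above can be sketched as a least-recently-used slot manager over the array's layers. This is a minimal illustrative sketch in plain Rust, not wgpu's API; `SlotManager` and its methods are hypothetical names, and the actual GPU copy is left to the caller.

```rust
use std::collections::HashMap;

// Hypothetical manager for a fixed-size texture2darray: textures are
// swapped in and out of array layers with LRU eviction, mirroring the
// disk<->ram<->gpu streaming described above. Not wgpu's API.
struct SlotManager {
    layers: usize,                 // capacity of the texture array
    resident: HashMap<u64, usize>, // texture id -> array layer
    last_used: HashMap<u64, u64>,  // texture id -> tick of last use
    tick: u64,
}

impl SlotManager {
    fn new(layers: usize) -> Self {
        Self { layers, resident: HashMap::new(), last_used: HashMap::new(), tick: 0 }
    }

    // Returns the layer holding `id` plus whether a CPU->GPU upload is
    // needed this frame (i.e. the texture was not already resident).
    fn acquire(&mut self, id: u64) -> (usize, bool) {
        self.tick += 1;
        self.last_used.insert(id, self.tick);
        if let Some(&layer) = self.resident.get(&id) {
            return (layer, false); // already on the GPU, no upload
        }
        let layer = if self.resident.len() < self.layers {
            self.resident.len() // a free layer is still available
        } else {
            // Evict the resident texture with the oldest last-use tick.
            let (&victim, _) = self
                .resident
                .keys()
                .map(|k| (k, self.last_used[k]))
                .min_by_key(|&(_, t)| t)
                .unwrap();
            self.resident.remove(&victim).unwrap()
        };
        self.resident.insert(id, layer);
        (layer, true) // caller schedules the disk/ram -> gpu copy
    }
}

fn main() {
    let mut slots = SlotManager::new(2);
    assert_eq!(slots.acquire(10), (0, true));  // first use: upload to layer 0
    assert_eq!(slots.acquire(11), (1, true));  // second texture: layer 1
    assert_eq!(slots.acquire(10), (0, false)); // still resident, no upload
    assert_eq!(slots.acquire(12), (1, true));  // evicts 11, the least recent
}
```

A real implementation would additionally batch the resulting uploads into one command submission per frame rather than issuing them one by one.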

I think the ability to create an optimal render graph is often overestimated

Honestly, I think I'm personally more interested in the convenience of having the graph insert all the barriers and fences and take care of some of the renderpass tedium for me than I am in super-optimal performance! That said, it looks like wgpu simplifies working with renderpasses a lot already (compared to vulkano at least). On the topic of performance, however, it intuitively seems like a convenient graph API might be a nice way to safely automate better hardware utilisation anyway, as the API would be aware of the full flow of data rather than relying on the user to manually get all their barrier/fence synchronisation points correct (as was our experience with vulkano: one wrong barrier and you unnecessarily bottleneck everything, heh). When it comes to graphs, we are mostly interested in simpler non-linear cases such as generating multiple shader inputs at once using multiple graphics pipelines, doing some compute while doing some graphics, or outputting to multiple images at once, e.g. writing to the swapchain image while simultaneously writing to another destined for a network buffer or video recording file. There's a good chance what we are looking for in a graph abstraction is higher-level than what rendy provides!
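The "graph inserts the barriers for you" idea above can be sketched in a few lines of plain Rust (illustrative names, not rendy's or wgpu's API, and deliberately simplified: real placement also tracks pipeline stages and access masks): walk the passes in execution order and emit a barrier exactly at each read-after-write hazard, instead of hand-placing synchronisation points.

```rust
use std::collections::HashSet;

// Given passes in execution order with declared reads/writes, return
// (pass index, resource) pairs meaning "insert a barrier before this
// pass for this resource". A hypothetical sketch, not a real API.
fn barriers(
    passes: &[(&'static str, Vec<&'static str>, Vec<&'static str>)],
) -> Vec<(usize, &'static str)> {
    let mut pending: HashSet<&str> = HashSet::new(); // written, not yet synced
    let mut out = Vec::new();
    for (i, (_name, reads, writes)) in passes.iter().enumerate() {
        for &r in reads {
            // Read-after-write of the same resource: one hazard, one barrier.
            if pending.remove(r) {
                out.push((i, r));
            }
        }
        for &w in writes {
            pending.insert(w);
        }
    }
    out
}

fn main() {
    let passes = [
        ("geometry", vec![], vec!["scene"]),
        ("bloom", vec!["scene"], vec!["bloom"]),
        ("compose", vec!["scene", "bloom"], vec!["swapchain"]),
    ];
    // "scene" needs syncing once before "bloom", "bloom" before "compose".
    println!("{:?}", barriers(&passes)); // [(1, "scene"), (2, "bloom")]
}
```

The point of the sketch is the failure mode it removes: because barriers are derived from declared reads and writes, there is no hand-placed barrier to get wrong, which is exactly the "one wrong barrier bottlenecks everything" experience described above.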

Sub-passes, for instance, shouldn't be relevant to your work for the most part. They aren't currently having any effect on desktop GPUs. So trying to build them optimally is just wasting your time (assuming you run on desktop GPUs?).

Wow that is a surprise to read! You are correct, we have worked with desktop GPUs almost exclusively so far. Mobile is on our radar as we are getting more requests for mobile AR experiences recently, though I don't have enough experience with mobile graphics to know what we'll need there!

Thanks again for talking this out - it's very much appreciated!

@kvark commented Sep 22, 2019

@mitchmindtree

Is this selection of features intentionally omitted, and is the plan for them to remain omitted? Or is it more a case of "we haven't come up with a safe abstraction for providing access to these just yet, it's possible we will provide support for them if we or another contributor can come up with a safe and practical API in the future"?

One of the goals of WebGPU that differentiates it from the other libraries and APIs is "strong" portability, in the sense that it not only runs on all targeted platforms, but the performance expectations also match. Therefore, if we don't see a way to consistently implement a feature on at least 2 of the 3 target platforms, such that it works and using it shows a difference, we don't include it in the core API. Take sub-passes, for example: while you can technically implement them on Metal and DX12 by doing separate passes, they only make a difference on Vulkan mobile GPUs today, so there can't be an expectation that using sub-passes makes you magically faster. Take buffer capabilities as another example: if we expose all the queries about what is and isn't supported, it's easy to write an application that works on the developer's platform but then breaks on others, because they possibly forgot to query, or simply don't have another code path in place. Finally, things like multiple queues are desired for the most part, but we haven't yet figured out a plan for how to expose them portably, as in without race conditions affecting the portability of user programs.

Our company's last commercial interactive installation involved a couple of compute pipelines that would run asynchronously with the graphics pipeline.

Async compute pretty much requires multiple queues, which we don't have (yet) in the API.

Could we expect to run into limitations with swapping out large batches of textures like this each frame using wgpu, or is this perhaps a pretty common usecase?

CPU-GPU interaction is generally a bit less efficient in wgpu, because we have to expose it portably. Swapping out large batches of textures is definitely not a common usecase :)

Is it possible to take advantage of multiple physical devices at once with wgpu?

Yes, you can work with many logical and physical devices, but currently we offer no way for them to communicate. So that would only help you if the work is completely independent.


Given your detailed description of the work, it seems fairly clear to me that Rendy would be a better way to go. It will allow the use of multiple queues, more control over CPU-GPU transfers, and yet try to automate some of the annoying parts of Vulkan, such as barrier placement and memory allocation.

@zakarumych
@kvark I mostly agree with your definition of differences between rendy and wgpu.
Except maybe that a render graph is indeed a nice abstraction for creative coding, because it splits the rendering job into separate modules that don't need to know much about each other, only the inputs and outputs of those they connect to. Recording GPU commands without a render graph (or another abstraction at the same or a higher level) is like

fn main() {
  // writing the whole application here
}
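The modularity point above can be sketched as a trait: each module declares its inputs and outputs and records its own work, knowing nothing about the rest of the graph. `Node` and `BloomPass` here are hypothetical names for illustration, not rendy's actual traits.

```rust
// Each rendering module only declares its edges and records its own
// commands; ordering and synchronisation belong to the graph, not to
// the node. Hypothetical sketch, not rendy's API.
trait Node {
    fn inputs(&self) -> Vec<&'static str>;
    fn outputs(&self) -> Vec<&'static str>;
    // `frame` stands in for whatever command-recording context the
    // real graph would hand each node.
    fn record(&mut self, frame: &mut String);
}

struct BloomPass;

impl Node for BloomPass {
    fn inputs(&self) -> Vec<&'static str> { vec!["scene"] }
    fn outputs(&self) -> Vec<&'static str> { vec!["bloom"] }
    fn record(&mut self, frame: &mut String) {
        // A real node would record draw/dispatch commands here.
        frame.push_str("bloom;");
    }
}

fn main() {
    // The graph walks its nodes in a derived order; each node records
    // into the shared frame without knowing about its neighbours.
    let mut nodes: Vec<Box<dyn Node>> = vec![Box::new(BloomPass)];
    let mut frame = String::new();
    for n in nodes.iter_mut() {
        n.record(&mut frame);
    }
    println!("{}", frame);
}
```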

Btw, does Rendy allow you to reconfigure the graph at run-time? That seems like a great feat. (cc @omni-viral)

We figured out how to do this recently. @Frizi proposed a great API derived from Halcyon, but better :)

Is it possible to take advantage of multiple physical devices at once with wgpu?

Not yet possible with rendy's Factory abstraction, which is used by most of the higher-level tools like Graph.
But there are no fundamental issues here; no one has thought it would be useful yet, and the implementation should be straightforward.
However, communicating between devices requires some platform-specific Vulkan API that is not exposed in gfx-hal.

mitchmindtree added a commit to mitchmindtree/nannou that referenced this issue Feb 16, 2020
This PR is an overhaul of the graphics backend used throughout nannou.

See nannou-org#446, nannou-org#374, nannou-org#408 for related issues and motivation.