Considering rendy and wgpu for use as nannou's graphics backend #374

Closed
5 tasks done
freesig opened this issue Aug 2, 2019 · 9 comments · Fixed by #452
@freesig (Collaborator) commented Aug 2, 2019

Goal

Find out the current state of Rendy and whether it's ready for use as nannou's graphics backend. If not, establish how much work remains.

Todo

  • Run some examples and see how hard this is to do.
  • Make a list of the requirements we currently meet using vulkano and see whether we can match them in rendy (nannou's vk examples serve as a good list).
  • Try porting some of the vulkan examples to rendy. This should help us get an idea of what is missing.
  • Talk to the people at Amethyst and establish some relationships so we can work together on implementing any remaining features.
  • Decide whether to go ahead with this project based on how much work is left compared to vulkano.
@mitchmindtree (Member)

Someone inquiring about rendy learning resources on reddit.

@FuriouZz

What about wgpu-rs? Too high-level?

@mitchmindtree (Member)

We haven't taken a close look at wgpu just yet, though we are aware of it! A couple of the reasons I'm interested in investigating rendy first:

  1. Rendy's render graph is a high-level abstraction we're very interested in, to the point where we have discussed implementing something similar ourselves in the past (see Render Graphs - Working with Render Passes, Images, Framebuffers and Graphics Pipelines #243).
  2. Web isn't the main priority for rendy, but it is still supported. We want to avoid getting too invested in a web-first platform, as many of nannou's applications involve native installations with multiple windows across multiple displays (e.g. projection mapping with quad warping, triple-screen setups, etc). It may very well be the case that this is possible with wgpu, but we haven't looked closely enough to be sure yet. At least rendy seems generalised enough that it would be feasible for us to address support for this there, whereas it may be more difficult to motivate that kind of support in a render engine designed for the web first.

That said, there's a good chance both of these are also addressed by wgpu! If that's the case I'd love to hear about it - I just haven't personally had the time to investigate just yet.

@kvark commented Sep 19, 2019

I'd be interested to talk about what your requirements are. For a project focused on "creative coding", you seem to be selecting a surprisingly challenging path for the graphics stack (Vulkano, Rendy's render graph, etc). I would assume wgpu-rs to be a good fit for your needs.

Perhaps the name is confusing (the "w" part), but the project is native-first; as it stands now, Web is not even a target (yet). Rendering to multiple windows/swapchains is definitely in scope and should already work.

@mitchmindtree (Member)

Thanks for your interest @kvark!

For a project focused on "creative coding", you seem to be selecting a surprisingly challenging path for the graphics stack (Vulkano, Rendy's render graph, etc)

Hopefully I clarified how we ended up with vulkano in #408!

W.r.t. creative coding and graphics: I guess creative coding is often associated with much higher-level tools like Processing and P5.js, though these kinds of tools only really reflect the higher-level "sketching" side of creative coding, and neither lends itself particularly well to large-scale, graphics-intensive installations or exhibitions, which are often left to creative coding frameworks with lower-level access like OpenFrameworks or Cinder. It's not uncommon for interactive AV installations, projection mapping features, etc to run incredibly GPU-intensive custom graphics pipelines, often spread across large numbers of high-resolution displays (e.g. projectors, screens, LED tiling) on limited hardware. The more control over the hardware we have, the more creative options we have, so to speak. Nannou is aiming for the best of both worlds: to provide high-level, easy-to-digest tools and examples that make sketching and learning fun, while also providing low-level access for those involved in large-scale, performance-intensive installations and a path forward for those looking to take their sketches to the next level.

W.r.t. rendy and its render graph: We have frequently found ourselves running into an issue with vulkano where we want to re-structure our graphics pipeline to add a new renderpass or subpass in order to apply some new effect, or perhaps another output image so that we can record a sequence to file, but both of these require changing a significant amount of code each time, e.g. piping through the correct colour/depth formats, moving the MSAA resolve further down the pipeline, re-specifying shader inputs, updating the previous and following passes if necessary, etc. Things get even trickier when enabling/disabling certain renderpasses at runtime. While vulkano does offer a render pass description abstraction, it only offers a compile-time-checked implementation and not a run-time-checked one, meaning we need to add the implementation ourselves (something that is still a work in progress) or alternatively declare a large number of renderpasses that are almost-but-not-quite identical and switch between them. All of this led us to the desire for some sort of higher-level graph abstraction that would automatically "compile down" into an optimal set of render passes and sub-passes at runtime. You can find some discussion around determining a solution for this at #243, before we came across rendy and its seemingly similar set of goals. We're not yet sure whether rendy ticks all of these boxes, and this is part of why we wish to dig deeper. There's a good chance the actual render graph implemented by rendy has a totally different purpose than what I'm imagining!
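For illustration, the kind of graph abstraction described above can be sketched in plain Rust. All names here (`Pass`, `schedule`) are hypothetical and this is not rendy's API: the idea is simply that each pass declares only the resources it reads and writes, and a topological sort derives the execution order, so adding or removing a pass no longer means re-plumbing the whole pipeline by hand.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

// Hypothetical pass description: only declared reads/writes, as a
// render graph would see them. Not rendy's actual types.
struct Pass {
    name: &'static str,
    reads: Vec<&'static str>,
    writes: Vec<&'static str>,
}

// Derive an execution order: a pass must run after every pass that
// writes a resource it reads (Kahn's topological sort). Returns None
// if the declared dependencies form a cycle.
fn schedule(passes: &[Pass]) -> Option<Vec<&'static str>> {
    // Map each resource name to the index of the pass producing it.
    let mut producer: HashMap<&str, usize> = HashMap::new();
    for (i, p) in passes.iter().enumerate() {
        for &w in &p.writes {
            producer.insert(w, i);
        }
    }
    // Build dependency edges (producer -> consumer) and in-degrees.
    let mut edges: Vec<HashSet<usize>> = vec![HashSet::new(); passes.len()];
    let mut indegree = vec![0usize; passes.len()];
    for (i, p) in passes.iter().enumerate() {
        for &r in &p.reads {
            if let Some(&src) = producer.get(r) {
                if edges[src].insert(i) {
                    indegree[i] += 1;
                }
            }
        }
    }
    // Repeatedly emit passes whose dependencies are all satisfied.
    let mut queue: VecDeque<usize> =
        (0..passes.len()).filter(|&i| indegree[i] == 0).collect();
    let mut order = Vec::new();
    while let Some(i) = queue.pop_front() {
        order.push(passes[i].name);
        for &next in &edges[i] {
            indegree[next] -= 1;
            if indegree[next] == 0 {
                queue.push_back(next);
            }
        }
    }
    (order.len() == passes.len()).then(|| order)
}

fn main() {
    // Adding a new effect pass only means declaring its inputs and
    // outputs; the ordering is recomputed, not hand-maintained.
    let passes = [
        Pass { name: "compose", reads: vec!["scene", "bloom"], writes: vec!["swapchain"] },
        Pass { name: "geometry", reads: vec![], writes: vec!["scene"] },
        Pass { name: "bloom", reads: vec!["scene"], writes: vec!["bloom"] },
    ];
    println!("{:?}", schedule(&passes).unwrap()); // ["geometry", "bloom", "compose"]
}
```

A real render graph would of course also compile the ordered passes down into render passes, sub-passes, and synchronisation, but the declarative declare-inputs-and-outputs shape is the part that removes the re-plumbing burden.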


Rendering to multiple windows/swapchains is definitely in scope and should already work.

That's great to hear! Being able to kick off multi-window setups easily is an important and common case for us; it's nice to know that this hasn't been abstracted away (e.g. in favour of a smoother single-window experience).

Perhaps the name is confusing (the "w" part), but the project is native-first, at least as it stands now that Web is not even a target (yet).

Yes, this is especially surprising considering the name :) The name and what it implies has admittedly been one of the biggest factors making me personally wary of wgpu. For example, I would be wary of switching graphics backends to wgpu, only to one day run into an issue where the API became constrained to browser capabilities. I appreciate you clarifying this though! And reading through this blog post also provides some useful context. I'd be surprised if I am the only one initially confused by the name.

Some Questions

This leads me to wonder a little more on the differences between rendy and wgpu.

  • On the surface-level, it looks like rendy is aimed at being a whole suite of tools that make it easier to use gfx-hal, though I guess in a way wgpu could be described similarly?
  • I get the impression that wgpu is being pushed as a higher-level abstraction. In what ways is wgpu higher-level than rendy? What kinds of low-level choices have been abstracted away?
  • I noticed that rendy mentions it is used within the wgpu project - how is it used? What parts of rendy are exposed if any?
  • I wonder what the motivation for amethyst is in using rendy to roll their own graphics backend rather than using wgpu? What are the features missing from wgpu, and will we run into those in nannou? Or perhaps this is due to amethyst and rendy pre-dating the wgpu project, and maybe they would have considered wgpu had the work begun today? cc @erlend-sh would you mind sharing some thoughts on this? Also feel free to link me elsewhere if either of you have been through this discussion already!

At a glance, the wgpu examples do look really nice! I also love the idea of using a high-level API in order to save ourselves time and move more graphics-y responsibility upstream. You certainly have me considering it more closely - I'd love to hear your thoughts on some of these above questions if you get a chance :)

@mitchmindtree mitchmindtree changed the title Establish what the state of Rendy is Considering rendy and wgpu for use as nannou's graphics backend Sep 20, 2019
@kvark commented Sep 20, 2019

Thank you for the detailed response! I think this discussion will be incomplete without @omni-viral. I want to start by linking to a few reddit comments, in an attempt to answer the "Some Questions" section:

Quote from the first linked comment:

  1. individual API libraries like ash, d3d12-rs, metal-rs - for the lowest hard-core level available to Rust
  2. gfx-hal for lowest portable level
  3. Rendy for helping to solve the rough corners of gfx-hal
  4. wgpu-rs for the lowest safe level
  5. Engines like ggez or Amethyst for the highest level

W.r.t. creative coding and graphics

I was indeed thinking of a higher-level/sketching approach to creative coding. Running GPU-intensive tasks may or may not require advanced API features or the explicit control that low-level GPU APIs provide. For example, rendering on "large numbers of high resolution displays" just means that there are a lot of pixels to process: the GPU will most likely be the bottleneck even if you use OpenGL 1.0 to drive it.

In contrast, the low-level API helps coordinate multiple different workloads, e.g. a game like GTA needs to:

  • render tens of thousands of different objects in the world, using different shaders, inputs, geometries, etc
  • compute physics and AI at the same time
  • stream in resources constantly

If I understand correctly, this is different from "creative coding" needs, which involve fewer pipelines, less streaming, and generally rely more on raw GPU power.

W.r.t. rendy and its render graph:

I think the ability to create an optimal render graph is often overestimated. It mostly helps the case, again, where there is a lot of heterogeneous work thrown at the GPU (i.e. the GTA case); it helps synchronize that work carefully. In a typical graphics app, there are a few render passes, and most of the work happens within them, where there can be no barriers by definition: synchronization is only needed at the pass boundaries, and multiple queues don't magically give you more compute units to work on.

Sub-passes, for instance, shouldn't be relevant to your work for the most part. They currently have no effect on desktop GPUs, so trying to build them optimally is just wasting your time (assuming you run on desktop GPUs?).

Btw, does Rendy allow you to reconfigure the graph at run-time? That seems like a great feat. (cc @omni-viral)

@mitchmindtree (Member)

Thanks a lot @kvark! Those links help to clarify where rendy's role ends, where wgpu's begins, and vice versa. Your last link in particular is quite useful in listing gfx-hal's capabilities compared to wgpu:

  • render sub-passes
  • fine-grained capabilities of texture/buffer formats, discoverable at init time
  • geometry and tessellation shaders
  • multiple queues
  • adapter selection and fine-grained limits
  • texel buffer views and combined image+samplers
  • mutating the same resource in different ways within the same pass
  • clears and blits

Is this selection of features intentionally omitted, and is the plan for them to remain omitted? Or is it more a case of "we haven't come up with a safe abstraction for providing access to these just yet; it's possible we will provide support for them if we or another contributor can come up with a safe and practical API in the future"? I realise the answer likely differs between features, but any further thoughts or details you can provide or link to on this would be useful! It would be useful for us to know where the boundaries lie in terms of what kind of contributions would be accepted should a need arise.

If I understand correctly, this is different from "creative coding" needs, which has less pipelines, less streaming, and is generally more relying on raw GPU power.

Of course, our requirements are a very long shot from what GTA needs! That said, now that you mention those requirements I'd like to clarify a couple of wgpu capabilities:

  • Our company's last commercial interactive installation involved a couple of compute pipelines that would run asynchronously with the graphics pipeline. The role of these compute pipelines was 1. to filter/denoise depth images from a camera so that they would be ready for use in the following frame and 2. to apply some flow-field physics to a 50k+ particle system, both while the current frame was processing so that their data would be ready for rendering the next. Is it possible to do this asynchronous compute/graphics with wgpu? In your list of wgpu limitations you implied wgpu only provides access to a single queue, but I'd imagine there would be at least one queue for graphics and one for compute?
  • In the same project, we could barely fit a fraction of the textures involved in GPU memory at once, so we used a single large slice sampled via a texture2darray (the textures were conveniently of uniform dimensions) for the currently visible animated particle textures and had a system for dynamically loading what was necessary between disk<->ram and ram<->gpu as required. Could we expect to run into limitations with swapping out large batches of textures like this each frame using wgpu, or is this perhaps a pretty common usecase? I'd imagine even basic games must deal with this frequently.
  • Is it possible to take advantage of multiple physical devices at once with wgpu?
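The texture2darray streaming scheme described in the second point above can be sketched as a least-recently-used slot manager over the array's layers. This is a minimal illustrative sketch in plain Rust, not wgpu's API; `SlotManager` and its methods are hypothetical names, and the actual GPU copy is left to the caller.

```rust
use std::collections::HashMap;

// Hypothetical manager for a fixed-size texture2darray: textures are
// swapped in and out of array layers with LRU eviction, mirroring the
// disk<->ram<->gpu streaming described above. Not wgpu's API.
struct SlotManager {
    layers: usize,                 // capacity of the texture array
    resident: HashMap<u64, usize>, // texture id -> array layer
    last_used: HashMap<u64, u64>,  // texture id -> tick of last use
    tick: u64,
}

impl SlotManager {
    fn new(layers: usize) -> Self {
        Self { layers, resident: HashMap::new(), last_used: HashMap::new(), tick: 0 }
    }

    // Returns the layer holding `id` plus whether a CPU->GPU upload is
    // needed this frame (i.e. the texture was not already resident).
    fn acquire(&mut self, id: u64) -> (usize, bool) {
        self.tick += 1;
        self.last_used.insert(id, self.tick);
        if let Some(&layer) = self.resident.get(&id) {
            return (layer, false); // already on the GPU, no upload
        }
        let layer = if self.resident.len() < self.layers {
            self.resident.len() // a free layer is still available
        } else {
            // Evict the resident texture with the oldest last-use tick.
            let (&victim, _) = self
                .resident
                .keys()
                .map(|k| (k, self.last_used[k]))
                .min_by_key(|&(_, t)| t)
                .unwrap();
            self.resident.remove(&victim).unwrap()
        };
        self.resident.insert(id, layer);
        (layer, true) // caller schedules the disk/ram -> gpu copy
    }
}

fn main() {
    let mut slots = SlotManager::new(2);
    assert_eq!(slots.acquire(10), (0, true));  // first use: upload to layer 0
    assert_eq!(slots.acquire(11), (1, true));  // second texture: layer 1
    assert_eq!(slots.acquire(10), (0, false)); // still resident, no upload
    assert_eq!(slots.acquire(12), (1, true));  // evicts 11, the least recent
}
```

A real implementation would additionally batch the resulting uploads into one command submission per frame rather than issuing them one by one.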

I think the ability to create an optimal render graph is often overestimated

Honestly, I think I'm personally more interested in the convenience of having the graph insert all the barriers and fences and take care of some of the renderpass tedium for me than I am in super-optimal performance! That said, it looks like wgpu simplifies working with renderpasses a lot already (compared to vulkano at least). On the topic of performance, however, it intuitively seems like a convenient graph API might be a nice way to safely automate better hardware utilisation anyway, as the API would be aware of the full flow of data rather than relying on the user to manually get all their barrier/fence synchronisation points correct (as was our experience with vulkano: one wrong barrier and you unnecessarily bottleneck everything, heh). When it comes to graphs, we are mostly interested in simpler non-linear cases such as generating multiple shader inputs at once using multiple graphics pipelines, doing some compute while doing some graphics, or outputting to multiple images at once, e.g. writing to the swapchain image while simultaneously writing to another destined for a network buffer or video recording file. There's a good chance what we are looking for in a graph abstraction is higher-level than what rendy provides!
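The "graph inserts the barriers for you" idea above can be sketched in a few lines of plain Rust (illustrative names, not rendy's or wgpu's API, and deliberately simplified: real placement also tracks pipeline stages and access masks): walk the passes in execution order and emit a barrier exactly at each read-after-write hazard, instead of hand-placing synchronisation points.

```rust
use std::collections::HashSet;

// Given passes in execution order with declared reads/writes, return
// (pass index, resource) pairs meaning "insert a barrier before this
// pass for this resource". A hypothetical sketch, not a real API.
fn barriers(
    passes: &[(&'static str, Vec<&'static str>, Vec<&'static str>)],
) -> Vec<(usize, &'static str)> {
    let mut pending: HashSet<&str> = HashSet::new(); // written, not yet synced
    let mut out = Vec::new();
    for (i, (_name, reads, writes)) in passes.iter().enumerate() {
        for &r in reads {
            // Read-after-write of the same resource: one hazard, one barrier.
            if pending.remove(r) {
                out.push((i, r));
            }
        }
        for &w in writes {
            pending.insert(w);
        }
    }
    out
}

fn main() {
    let passes = [
        ("geometry", vec![], vec!["scene"]),
        ("bloom", vec!["scene"], vec!["bloom"]),
        ("compose", vec!["scene", "bloom"], vec!["swapchain"]),
    ];
    // "scene" needs syncing once before "bloom", "bloom" before "compose".
    println!("{:?}", barriers(&passes)); // [(1, "scene"), (2, "bloom")]
}
```

The point of the sketch is the failure mode it removes: because barriers are derived from declared reads and writes, there is no hand-placed barrier to get wrong, which is exactly the "one wrong barrier bottlenecks everything" experience described above.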

Sub-passes, for instance, shouldn't be relevant to your work for the most part. They aren't currently having any effect on desktop GPUs. So trying to build them optimally is just wasting your time (assuming you run on desktop GPUs?).

Wow that is a surprise to read! You are correct, we have worked with desktop GPUs almost exclusively so far. Mobile is on our radar as we are getting more requests for mobile AR experiences recently, though I don't have enough experience with mobile graphics to know what we'll need there!

Thanks again for talking this out - it's very much appreciated!

@kvark commented Sep 22, 2019

@mitchmindtree

Is this selection of features intentionally omitted, and is the plan for them to remain omitted? Or is it more a case of "we haven't come up with a safe abstraction for providing access to these just yet, it's possible we will provide support for them if we or another contributor can come up with a safe and practical API in the future"?

One of the goals of WebGPU that differentiates it from the other libraries and APIs is "strong" portability, in the sense that it not only runs on all targeted platforms, but the performance expectations also match. Therefore, if we don't see a way to consistently implement a feature on at least 2 of the 3 target platforms, such that it works and using it shows a difference, we don't include it in the core API. Take sub-passes, for example: while you can technically implement them on Metal and DX12 by doing separate passes, they only make a difference on Vulkan mobile GPUs today, so there can't be an expectation that using sub-passes makes you magically faster. Take buffer capabilities as another example: if we expose all the queries about what is and isn't supported, it's easy to write an application that works on the developer's platform but then breaks on others, because they possibly forgot to query, or simply don't have another code path in place. Finally, things like multiple queues are desired for the most part, but we haven't yet figured out a plan for how to expose them portably, as in without race conditions affecting the portability of user programs.

Our company's last commercial interactive installation involved a couple of compute pipelines that would run asynchronously with the graphics pipeline.

Async compute pretty much requires multiple queues, which we don't have (yet) in the API.

Could we expect to run into limitations with swapping out large batches of textures like this each frame using wgpu, or is this perhaps a pretty common usecase?

CPU-GPU interaction is generally a bit less efficient in wgpu, because we have to expose it portably. Swapping out large batches of textures is definitely not a common usecase :)

Is it possible to take advantage of multiple physical devices at once with wgpu?

Yes, you can work with many logical and physical devices, but currently we offer no way for them to communicate. So that would only help you if the work is completely independent.


Given your detailed description of the work, it seems fairly clear to me that Rendy would be a better way to go. It will allow the use of multiple queues, more control over CPU-GPU transfers, and yet try to automate some of the annoying parts of Vulkan, such as barrier placement and memory allocation.

@zakarumych
@kvark I mostly agree with your definition of differences between rendy and wgpu.
Except maybe that a render graph is indeed a nice abstraction for creative coding, because it splits the rendering job into separate modules that don't need to know much about each other, only the inputs and outputs of those they connect to. Recording GPU commands without a render graph (or another abstraction at the same or a higher level) is like

fn main() {
  // writing the whole application here
}
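The modularity point above can be sketched as a trait: each module declares its inputs and outputs and records its own work, knowing nothing about the rest of the graph. `Node` and `BloomPass` here are hypothetical names for illustration, not rendy's actual traits.

```rust
// Each rendering module only declares its edges and records its own
// commands; ordering and synchronisation belong to the graph, not to
// the node. Hypothetical sketch, not rendy's API.
trait Node {
    fn inputs(&self) -> Vec<&'static str>;
    fn outputs(&self) -> Vec<&'static str>;
    // `frame` stands in for whatever command-recording context the
    // real graph would hand each node.
    fn record(&mut self, frame: &mut String);
}

struct BloomPass;

impl Node for BloomPass {
    fn inputs(&self) -> Vec<&'static str> { vec!["scene"] }
    fn outputs(&self) -> Vec<&'static str> { vec!["bloom"] }
    fn record(&mut self, frame: &mut String) {
        // A real node would record draw/dispatch commands here.
        frame.push_str("bloom;");
    }
}

fn main() {
    // The graph walks its nodes in a derived order; each node records
    // into the shared frame without knowing about its neighbours.
    let mut nodes: Vec<Box<dyn Node>> = vec![Box::new(BloomPass)];
    let mut frame = String::new();
    for n in nodes.iter_mut() {
        n.record(&mut frame);
    }
    println!("{}", frame);
}
```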

Btw, does Rendy allow you to reconfigure the graph at run-time? That seems like a great feat. (cc @omni-viral)

We figured out how to do this recently. @Frizi proposed a great API derived from Halcyon, but better :)

Is it possible to take advantage of multiple physical devices at once with wgpu?

Not yet possible with rendy's Factory abstraction, which is used by most of the higher-level tools like Graph.
But there are no fundamental issues here; no one has thought it would be useful yet, and the implementation should be straightforward.
However, communicating between devices requires some platform-specific Vulkan API that is not exposed in gfx-hal.

mitchmindtree added a commit to mitchmindtree/nannou that referenced this issue Feb 16, 2020
This PR is an overhaul of the graphics backend used throughout nannou.

See nannou-org#446, nannou-org#374, nannou-org#408 for related issues and motivation.