My Takeaways from WebGPU July 2021 Meetup

Aug 24, 2021

The last WebGL / WebGPU meetup was packed with exciting content. You can find the full 90-minute recording here. I put together my takeaways for anyone who wants a quick glimpse of the latest updates, focusing particularly on WebGPU.

You’ll find YouTube timestamps for each talk below if you’re interested in watching the full segment (which I highly recommend!).

Advice for porting a large WebGL project to WebGPU

Jasper St. Pierre — Yacht Club Games.
Timestamp: 6:02–21:25
Slides

Jasper is the creator of the famous noclip.website, a custom web renderer for levels from many famous video games, like Mario Kart and Half-Life.

He recently ported this engine to WebGPU. This acted as a really good testbed for the spec because his engine implements so many different games that all use a wide range of rendering techniques.

WebGPU requires the use of uniform buffers, which is one big difference from WebGL. You can no longer define and update individual uniform values. Instead, you put all your uniforms into a buffer and send it all at once, which makes more efficient use of modern GPU hardware.
Jasper recommends porting WebGL 1 -> WebGL 2 as an intermediate step, since WebGL 2 supports uniform buffers and the binding code is a little simpler.
Do not update uniform buffers between draw calls in WebGPU. This may be a common pattern in WebGL 2, but you want to keep GPU memory untouched as long as possible so the GPU can execute draw calls in parallel. Instead, he created a single large buffer containing the data for all draw calls and bound a different part of it before each draw call.
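
To make the "one large buffer" idea concrete, here is a minimal sketch of how per-draw uniform data could be packed ahead of time. It is not Jasper's actual code: the names `packUniforms`, `alignUp` and `UNIFORM_ALIGNMENT` are made up for illustration, and the 256-byte alignment is an assumption based on WebGPU's default `minUniformBufferOffsetAlignment` limit.

```typescript
// Assumption: 256 is WebGPU's default minUniformBufferOffsetAlignment limit.
// A real app should read device.limits.minUniformBufferOffsetAlignment instead.
const UNIFORM_ALIGNMENT = 256;

/** Rounds a byte size up to the next multiple of `alignment`. */
function alignUp(size: number, alignment: number): number {
  return Math.ceil(size / alignment) * alignment;
}

interface PackedUniforms {
  data: Float32Array; // contents of the single large uniform buffer
  offsets: number[];  // byte offset of each draw call's slice
}

/** Packs one block of float uniforms per draw call into a single array. */
function packUniforms(perDraw: Float32Array[]): PackedUniforms {
  const offsets: number[] = [];
  let cursor = 0;
  for (const block of perDraw) {
    offsets.push(cursor);
    cursor += alignUp(block.byteLength, UNIFORM_ALIGNMENT);
  }
  const data = new Float32Array(cursor / 4);
  perDraw.forEach((block, i) => data.set(block, offsets[i] / 4));
  return { data, offsets };
}

// Hypothetical per-draw data: a 4x4 matrix (16 floats) and a color (4 floats).
const packed = packUniforms([
  new Float32Array(16).fill(1),
  new Float32Array(4).fill(2),
]);
console.log(packed.offsets, packed.data.length);
```

In a real renderer the packed array would presumably be uploaded once per frame (for example with `device.queue.writeBuffer`), and each offset passed as a dynamic offset to `setBindGroup` right before the matching draw call, so the buffer contents never change between draws.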

Jasper kept his shaders in GLSL and converts them to WGSL (the WebGPU Shading Language) at runtime with https://github.com/gfx-rs/naga compiled to WebAssembly.
Viewport and clip-space conventions are different in WebGPU. WebGPU puts 0,0 at the top left of the framebuffer (as opposed to the bottom left in WebGL), and its clip-space depth range is 0 to 1 (as opposed to -1 to 1 in WebGL). This was easy to account for.
Consider a “draw call objects” design for your renderer. This one is more general advice: instead of each object initiating draw calls directly, it pushes an object into a list containing the info needed for the draw call. This makes it easier to handle multi-pass rendering, sort draws when transparency requires it, and collect uniform data up front.
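
As a rough illustration of the "draw call objects" advice, the sketch below has scene objects submit plain data records to a list, which the renderer can then filter by pass and sort before issuing any GPU work. Every name in it (`DrawCallObject`, `DrawList`, the fields) is hypothetical and not taken from noclip.website.

```typescript
// All names here are hypothetical; this is not noclip.website's actual API.
interface DrawCallObject {
  pass: "opaque" | "transparent";
  depth: number;         // distance from the camera, used for sorting
  pipelineId: string;    // which shader/pipeline state to use
  uniformOffset: number; // where this draw's uniforms live in a shared buffer
  vertexCount: number;
}

class DrawList {
  private items: DrawCallObject[] = [];

  /** Scene objects call this instead of issuing a GPU draw call directly. */
  submit(item: DrawCallObject): void {
    this.items.push(item);
  }

  /** Draws for one pass: opaque front to back, transparent back to front. */
  forPass(pass: DrawCallObject["pass"]): DrawCallObject[] {
    const selected = this.items.filter((item) => item.pass === pass);
    const direction = pass === "transparent" ? -1 : 1;
    return selected.sort((a, b) => direction * (a.depth - b.depth));
  }
}

const list = new DrawList();
list.submit({ pass: "transparent", depth: 2, pipelineId: "glass", uniformOffset: 0, vertexCount: 6 });
list.submit({ pass: "opaque", depth: 5, pipelineId: "wall", uniformOffset: 256, vertexCount: 36 });
list.submit({ pass: "transparent", depth: 9, pipelineId: "water", uniformOffset: 512, vertexCount: 6 });

// The farthest transparent surface comes first so nearer ones blend over it.
const transparentOrder = list.forPass("transparent").map((d) => d.pipelineId);
console.log(transparentOrder);
```

Because the list is just data, the same records can be replayed for several passes, and all uniform data is known before the first draw call is encoded.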

noclip.website repo: https://github.com/magcius/noclip.website

The graphics framework behind it: https://github.com/magcius/gfxrlz

glTF Sample Viewer

Moritz Becher — UX3D
Timestamp: 21:35–35:30
Slides

glTF Sample Viewer (https://github.khronos.org/glTF-Sample-Viewer-Release/) is an open source web app that's really useful for (1) verifying that your glTF implementation is correct, (2) finding a clean sample implementation of any part of the glTF spec, and (3) easily previewing glTF models by drag & drop.

Source code: https://github.com/KhronosGroup/glTF-Sample-Viewer

This is a “native glTF viewer” in that it uploads glTF resources/buffers directly to the GPU, as opposed to parsing them into another intermediate format.
Always updated with the latest spec & extensions. For example, here you can find an implementation of the new volume transmission extension, which is useful since the spec doesn’t dictate a specific implementation.

PBR & KTX extensions for glTF

Sandra Voelker — Target
Timestamp: 35:40–54:20
Slides

It was really interesting to see in this talk how many of the PBR (Physically Based Rendering) extensions for glTF are being pushed by companies like Target & Wayfair for eCommerce purposes.

Some of those new extensions this year are:

KHR_materials_sheen — for representing the back-scattering of velvet-like materials (cloth & fabric generally).
KHR_materials_specular — for controlling the strength and color of the specular reflection on non-metallic materials.
KHR_materials_transmission & KHR_materials_volume — for translucent objects with non-uniform thickness like a glass jar.
KHR_texture_basisu — for texture compression with Basis Universal supercompression.

Since these extensions are new, not all 3D authoring programs support exporting them yet. Some tools Sandra recommended are:

BabylonJS exporter for 3ds Max and Maya. This is a plugin that lets you export a glTF that uses these new PBR extensions.
Sandra used Blender for exporting the vase with transparent glass. See Blender glTF extensions doc.
Sandra used the new Adobe Substance 3D to export cloth materials with the sheen extension.
Sandra used the glTF VS Code plugin for adding the clear coat extension. This allows you to edit glTF material properties in real time by editing the JSON file directly.

There are many useful links at the end of Sandra’s slides. In particular, I wanted to call out this “artist’s guide” for creating KTX textures, which is really helpful for understanding how to get textures that use vastly less GPU memory and still look great: https://github.com/KhronosGroup/3D-Formats-Guidelines/blob/main/KTXArtistGuide.md.

Gestaltor (https://gestaltor.io/) is also an awesome glTF editor made by the same people that made the glTF viewer in the previous talk.

Using multi-draw to speed up drawing many small objects

Philip Taylor — Zea
Timestamp: 54:40–1:12:20
Slides

This talk was about using a new WebGL extension (WEBGL_multi_draw) to efficiently draw millions of small objects, which are common in 3D design files.

Normally you might need many draw calls to render all these different objects because they have different textures/shaders/material properties. For scenes where the number of draw calls is the bottleneck, Philip described a technique to bind all the required data ahead of time and group these hundreds of thousands of draw calls into far fewer “multi draw” calls.

“If instancing is drawing the same geometry many times, multi draw is drawing different geometries many times” is how Philip described it. So it’s getting the same performance benefit of instancing with more flexibility.
He referenced this blog post as a good source to learn about it: The road to 1 million draws
You essentially pack all your geometry into one huge vertex buffer, which you then pass to the multi-draw call along with offsets that define the geometry of each separate object.
You have a “drawId” which can be used in the shader to figure out which object is being drawn in the multi-draw call.
It is not possible to switch/bind different uniforms for different objects, since they are all grouped into one multi-draw call.
So they pack everything into textures, including model matrices. The drawId is used to index into these textures and get the model matrix, material properties, etc.
The disadvantage of this is that all these properties have to be stored at the same precision, which wastes memory. This is also a complex architecture, and it makes changing materials at runtime very difficult.
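
The packing step described above can be sketched as follows. This is an illustration rather than Zea's implementation: `buildBatch` and `FLOATS_PER_VERTEX` are hypothetical names, and the sketch assumes non-indexed meshes stored as flat xyz position arrays.

```typescript
// Hypothetical helper: each mesh is a flat list of xyz positions.
const FLOATS_PER_VERTEX = 3;

interface MultiDrawBatch {
  vertices: Float32Array; // every mesh, back to back, in one vertex buffer
  firsts: Int32Array;     // index of the first vertex of each mesh
  counts: Int32Array;     // vertex count of each mesh
}

function buildBatch(meshes: Float32Array[]): MultiDrawBatch {
  const totalFloats = meshes.reduce((sum, mesh) => sum + mesh.length, 0);
  const vertices = new Float32Array(totalFloats);
  const firsts = new Int32Array(meshes.length);
  const counts = new Int32Array(meshes.length);

  let floatCursor = 0;
  meshes.forEach((mesh, drawId) => {
    vertices.set(mesh, floatCursor);
    firsts[drawId] = floatCursor / FLOATS_PER_VERTEX;
    counts[drawId] = mesh.length / FLOATS_PER_VERTEX;
    floatCursor += mesh.length;
  });
  return { vertices, firsts, counts };
}

const triangle = new Float32Array(9); // 3 vertices
const quad = new Float32Array(18);    // 6 vertices (two triangles)
const batch = buildBatch([triangle, quad, triangle]);

// With the extension available, the whole batch becomes a single call:
// ext.multiDrawArraysWEBGL(gl.TRIANGLES, batch.firsts, 0, batch.counts, 0, batch.counts.length);
console.log(Array.from(batch.firsts), Array.from(batch.counts));
```

The position of each mesh in the `firsts`/`counts` arrays is what the shader sees as the draw ID (`gl_DrawID` when the extension is enabled), which is why per-object data has to be looked up by that index.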

It was really cool to see at the end that there was a huge performance boost even when the multi-draw extension wasn’t supported. The fallback is to issue individual draw calls, but not having to re-bind textures/uniforms between draw calls was a big boost. They showed a design model with 17k parts, 2.5 million triangles, and 17k draw calls running at 36 fps on an iPad in Safari.
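
A minimal sketch of that fallback logic is shown below. The `DrawBackend` interface is a made-up stand-in for the WebGL context so the example runs outside a browser, and passing the draw ID through a `setDrawId` callback (for example a single uniform write) is an assumption; the talk summary does not say how the ID is supplied when the extension is missing.

```typescript
// Minimal stand-ins for the WebGL objects, so the sketch runs outside a browser.
interface DrawBackend {
  // Present only when WEBGL_multi_draw is supported.
  multiDraw?: (firsts: Int32Array, counts: Int32Array) => void;
  // Behaves like gl.drawArrays for a fixed primitive mode.
  draw: (first: number, count: number) => void;
  // Hypothetical: tells the shader which object is being drawn.
  setDrawId: (drawId: number) => void;
}

/** Draws a pre-packed batch and returns how many API draw calls were issued. */
function drawBatch(backend: DrawBackend, firsts: Int32Array, counts: Int32Array): number {
  if (backend.multiDraw) {
    backend.multiDraw(firsts, counts);
    return 1;
  }
  // Fallback: one call per object, but no texture or buffer re-binding in between.
  for (let drawId = 0; drawId < counts.length; drawId++) {
    backend.setDrawId(drawId);
    backend.draw(firsts[drawId], counts[drawId]);
  }
  return counts.length;
}

const calls: string[] = [];
const fallbackBackend: DrawBackend = {
  setDrawId: (id) => { calls.push(`id ${id}`); },
  draw: (first, count) => { calls.push(`draw ${first} ${count}`); },
};
const issued = drawBatch(fallbackBackend, new Int32Array([0, 3]), new Int32Array([3, 6]));
console.log(issued, calls);
```

Either way, the expensive state changes happen once per batch rather than once per object, which fits the speedup reported even without the extension.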

Speeding up Unity’s WebGL export with batching techniques

Brendan Duncan — Unity
Timestamp: 1:13:00–1:30:11
Slides

Brendan shared how they were able to significantly speed up Unity’s WebGL export when drawing many different objects with unique materials. Normally you can only batch objects that share the same material/shader.

Brendan presented a technique to batch objects that have different materials by binding all data ahead of time and configuring each shader to use the right data (Unity calls this the “SRP Batcher”).

They initially couldn’t use this technique as-is in WebGL because it required two features not supported in WebGL: uniform layout locations and buffer binding points. These are what allow the shaders to reference the right data when it is all sent to the GPU ahead of time.
Specifically, the issue is that WebGL does not expose integer uniform locations (it hands out opaque location objects instead). This is done as a security/sandboxing measure.
They worked around this by creating an open source Emscripten plugin: https://github.com/emscripten-core/emscripten/pull/13887
This plugin preprocesses the shaders. You can write OpenGL shaders using this feature (so they can keep the shaders as-is for Unity desktop/other platforms), and the plugin will create a dictionary of all the uniform binding points and generate GLSL code that works in WebGL with the correct bindings.
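
To give a feel for what such a preprocessing step involves, here is a deliberately simplified sketch. It is not the Emscripten code from the pull request above: `stripUniformLocations` is a hypothetical function, it only handles plain `layout(location = N) uniform <type> <name>;` declarations, and a regular expression stands in for real shader parsing.

```typescript
// Illustrative only: a tiny regex-based pass, not the actual Emscripten code.
interface PreprocessedShader {
  source: string;                 // GLSL with explicit uniform locations removed
  locations: Map<number, string>; // explicit location -> uniform name
}

function stripUniformLocations(glsl: string): PreprocessedShader {
  const locations = new Map<number, string>();
  const pattern = /layout\s*\(\s*location\s*=\s*(\d+)\s*\)\s*uniform\s+(\w+)\s+(\w+)\s*;/g;
  const source = glsl.replace(
    pattern,
    (_match: string, location: string, type: string, name: string) => {
      locations.set(Number(location), name);
      return `uniform ${type} ${name};`;
    },
  );
  return { source, locations };
}

// Hypothetical desktop-style shader using explicit uniform locations.
const desktopShader = `
layout(location = 0) uniform mat4 u_modelViewProjection;
layout(location = 3) uniform vec4 u_tint;
void main() {}
`;
const result = stripUniformLocations(desktopShader);
console.log(result.source, result.locations);
```

At runtime, a dictionary like this lets engine code keep using its integer locations while the WebGL side resolves each one by name with `gl.getUniformLocation`.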

The result is a 2x improvement in the worst case scenario, for a scene with 1,600 objects, each with a different material, 4 real-time lights and 1 directional shadowmap.

I hope you found this useful! You can find me on Twitter (https://twitter.com/Omar4ur). I’d love to hear any thoughts.

You can sign up to be notified of the future WebGL/WebGPU meetups here: https://www.khronos.org/news/subscribe/. You can also participate in the WebGL developers mailing list which is an active community of people sharing awesome projects, exchanging feedback, and asking questions.