My Takeaways from WebGPU July 2021 Meetup

Omar Shehata
7 min read · Aug 24, 2021

The last WebGL / WebGPU meetup was packed with a lot of really exciting content. You can find the full 90 minute recording here. I decided to put together my takeaways here for others who want a quick glimpse of the latest updates, focusing particularly on WebGPU.

You’ll find YouTube timestamps for each talk below if you’re interested in watching the full segment (which I highly recommend!).

Advice for porting a large WebGL project to WebGPU

Jasper St. Pierre — Yacht Club Games.
Timestamp: 6:02–21:25

Jasper is the creator of the famous noclip.website, a custom web rendering implementation of many famous video game levels, like Mario Kart & Half-Life.

He recently ported this engine to WebGPU. This acted as a really good testbed for the spec because his engine implements so many different games that all use a wide range of rendering techniques.

  • WebGPU requires the use of uniform buffers, which is one big difference with WebGL. You can’t define and update individual uniform values anymore. You put all your uniforms into a buffer and send that all at once, which is a more efficient use of modern GPU hardware.
  • Jasper recommends porting WebGL 1 -> WebGL 2 as an intermediate step, since WebGL 2 supports uniform buffers and the code for binding them is a little simpler.
  • Do not update uniform buffers between draw calls in WebGPU. This may be a common pattern in WebGL 2, but you want to keep GPU memory untouched as long as possible so the GPU can execute draw calls in parallel. Instead, he created one single large buffer containing data for all draw calls, and bound different parts of it before different draw calls.
  • Jasper kept his shaders in GLSL and converts them to WGSL at runtime using WebAssembly. WGSL stands for WebGPU Shading Language.
  • Viewport and clip-space conventions are different in WebGPU. WebGPU has 0,0 in the top left (as opposed to bottom left), and clip-space depth in WebGPU runs from 0 to 1 (as opposed to -1 to 1 in WebGL). This was easy to account for.
  • Consider a “draw call objects” design for your renderer. This one is more general advice: Instead of each object initiating draw calls directly, it pushes an object into a list containing info needed for the draw call. This makes it easier to handle multi pass rendering, sorting if needed for transparency, and collecting uniform data up front.
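
To make the "one large buffer" and "draw call objects" ideas concrete, here is a minimal sketch of how the two might fit together. It is not Jasper's actual code: the names `DrawCallRecord` and `DrawCallList` are made up for illustration, and the 256-byte alignment is an assumption based on WebGPU's default `minUniformBufferOffsetAlignment` limit. Objects push records instead of drawing, and one `build()` step sorts them and packs every draw's uniforms into a single array.

```typescript
// Hypothetical names throughout; a sketch, not the noclip implementation.
// Assumes WebGPU's default minUniformBufferOffsetAlignment of 256 bytes.
const UNIFORM_ALIGNMENT_BYTES = 256;

interface DrawCallRecord {
  sortKey: number;        // e.g. pass index or depth, used to order draws
  uniforms: Float32Array; // per-draw uniform data (matrices, colors, ...)
  vertexCount: number;
}

interface PackedDrawCall {
  record: DrawCallRecord;
  uniformByteOffset: number; // where this draw's uniforms start in the big buffer
}

class DrawCallList {
  private records: DrawCallRecord[] = [];

  // Objects call this instead of issuing a draw call directly.
  push(record: DrawCallRecord): void {
    this.records.push(record);
  }

  // Sort the recorded draws, then pack all uniforms into one array.
  build(): { data: Float32Array; draws: PackedDrawCall[] } {
    const sorted = this.records.slice().sort((a, b) => a.sortKey - b.sortKey);

    // First pass: give each draw an aligned byte offset.
    let byteLength = 0;
    const draws: PackedDrawCall[] = sorted.map((record) => {
      const uniformByteOffset = byteLength;
      const size = record.uniforms.byteLength;
      byteLength +=
        Math.ceil(size / UNIFORM_ALIGNMENT_BYTES) * UNIFORM_ALIGNMENT_BYTES;
      return { record, uniformByteOffset };
    });

    // Second pass: copy each draw's uniforms to its slot (4 bytes per float).
    const data = new Float32Array(byteLength / 4);
    for (const draw of draws) {
      data.set(draw.record.uniforms, draw.uniformByteOffset / 4);
    }
    return { data, draws };
  }
}

// Usage: two objects record draws out of order.
const list = new DrawCallList();
list.push({ sortKey: 2, uniforms: new Float32Array([1, 2, 3, 4]), vertexCount: 3 });
list.push({ sortKey: 1, uniforms: new Float32Array([7, 7, 7, 7]), vertexCount: 6 });
const frame = list.build();
```

In a real renderer, `frame.data` would presumably be uploaded once per frame (for example with `device.queue.writeBuffer`), and each `uniformByteOffset` passed as a dynamic offset to `setBindGroup` before the matching draw, so the buffer itself is never rewritten between draw calls.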

No.clip website repo:

The graphics framework behind it:

glTF Sample Viewer

Moritz Becher — UX3D
Timestamp: 21:35–35:30


glTF Sample Viewer is an open source web app that’s really useful for (1) verifying that your glTF implementation is correct, (2) finding a clean sample implementation of any part of the glTF spec, and (3) easily previewing glTF models by drag & drop.

Source code:

PBR & KTX extensions for glTF

Sandra Voelker — Target
Timestamp: 35:40–54:20

It was really interesting to see in this talk how many of the PBR (Physically Based Rendering) extensions for glTF are being pushed by companies like Target & Wayfair for eCommerce purposes.

Some of those new extensions this year are:

Since these extensions are new, not all 3D authoring programs support exporting them yet. Some tools Sandra recommended are:

There are many useful links at the end of Sandra’s slides. I wanted to call out in particular this “artist’s guide” for creating KTX materials, which is really helpful for understanding how to get textures that use vastly less GPU memory and still look great:

Gestaltor is also an awesome glTF editor made by the same people who made the glTF viewer in the previous talk.

Using multi-draw to speed up drawing many small objects

Philip Taylor — Zea
Timestamp: 54:40–1:12:20

This talk was about using a new WebGL extension (WEBGL_multi_draw) to efficiently draw millions of small objects, which are common in 3D design files.

Normally you might need many draw calls to render all these different objects because they have different textures/shaders/material properties. For scenes where the number of draw calls is the bottleneck, Philip described a technique to bind all the required data ahead of time and group these hundreds of thousands of draw calls into far fewer “multi draw” calls.

  • “If instancing is drawing the same geometry many times, multi draw is drawing different geometries many times” is how Philip described it. So it’s getting the same performance benefit of instancing with more flexibility.
  • He referenced this blog post as a good source to learn about it: The road to 1 million draws
  • You essentially pack all your geometry into one huge vertex buffer which you then pass to the multi-draw call with offsets that define geometry for separate objects
  • You have a “drawId” which can be used in the shader to figure out which object is being drawn in the multi-draw call
  • It is not possible to switch/bind different uniforms across different objects, since they are all grouped into one multi-draw call
  • So they bind everything into textures, including model matrices. The drawId is used to index into these textures and get the model matrix, material properties, etc.
  • The disadvantage of this is that all these properties have to be stored at the same precision, which wastes memory. It is also a far from trivial architecture, and it makes it very difficult to change materials at runtime.
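
Here is a rough sketch of the packing step described above, under my own assumptions rather than Zea's actual engine code. `Mesh`, `MultiDrawBatch`, and `buildMultiDrawBatch` are hypothetical names. The sketch assumes non-indexed triangle lists with xyz positions and stores each model matrix as 4 RGBA float texels. It concatenates all geometry into one vertex array and produces the per-draw first/count arrays that a multi-draw call takes.

```typescript
// Hypothetical helper names; only multiDrawArraysWEBGL (mentioned in the
// comments at the end) is a real API, from the WEBGL_multi_draw extension.
interface Mesh {
  positions: Float32Array;   // xyz per vertex, non-indexed
  modelMatrix: Float32Array; // 16 floats
}

interface MultiDrawBatch {
  vertices: Float32Array;     // every mesh concatenated into one vertex buffer
  firsts: Int32Array;         // first vertex of each mesh inside `vertices`
  counts: Int32Array;         // vertex count of each mesh
  matrixTexels: Float32Array; // float texture data: 4 RGBA texels per draw
}

function buildMultiDrawBatch(meshes: Mesh[]): MultiDrawBatch {
  let totalFloats = 0;
  for (const mesh of meshes) totalFloats += mesh.positions.length;

  const vertices = new Float32Array(totalFloats);
  const firsts = new Int32Array(meshes.length);
  const counts = new Int32Array(meshes.length);
  const matrixTexels = new Float32Array(meshes.length * 16);

  let floatOffset = 0;
  meshes.forEach((mesh, drawId) => {
    vertices.set(mesh.positions, floatOffset);
    firsts[drawId] = floatOffset / 3;           // 3 floats per vertex
    counts[drawId] = mesh.positions.length / 3;
    matrixTexels.set(mesh.modelMatrix, drawId * 16);
    floatOffset += mesh.positions.length;
  });
  return { vertices, firsts, counts, matrixTexels };
}

// The shader would do the equivalent lookup: the matrix for a given draw ID
// starts at this texel and spans 4 texels.
function matrixTexelStart(drawId: number): number {
  return drawId * 4;
}

// Usage: one triangle (3 vertices) and one quad as two triangles (6 vertices).
const identity = new Float32Array([1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1]);
const batch = buildMultiDrawBatch([
  { positions: new Float32Array(9), modelMatrix: identity },
  { positions: new Float32Array(18), modelMatrix: identity },
]);

// With the extension available, the whole batch would be one call:
//   ext.multiDrawArraysWEBGL(gl.TRIANGLES, batch.firsts, 0, batch.counts, 0,
//                            batch.counts.length);
// Without it, a loop of gl.drawArrays(gl.TRIANGLES, firsts[i], counts[i])
// over the same arrays would be the fallback.
```

The point of the layout is that nothing needs to be re-bound between objects: the per-object data lives in `matrixTexels`, and the draw ID is the only thing that changes from one object to the next.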

It was really cool to see at the end that there was a huge performance boost even when the multi-draw extension wasn’t supported. The fallback is to issue individual draw calls, but not having to re-bind textures/uniforms between draw calls was a big boost. They showed a design model with 17k parts, 2.5 million triangles, and 17k draw calls running at 36 fps on an iPad in Safari.

Speeding up Unity’s WebGL export with batching techniques

Brendan Duncan — Unity
Timestamp: 1:13:00–1:30:11

Brendan shared how they were able to significantly speed up Unity’s WebGL export when drawing many different objects with unique materials. Normally you can only batch objects that share the same material/shader.

Brendan presented a technique to batch objects that have different materials by binding all data ahead of time and configuring each shader to use the right data (Unity calls this the “SRP Batcher”).

  • They initially couldn’t use this technique as-is in WebGL because it required 2 features not supported in WebGL: uniform layout locations and buffer binding points. This is what allows the shaders to reference the right data when it is all sent to the GPU ahead of time.
  • Specifically, the issue is that WebGL does not use integer uniform locations. This is done as a security/sandboxing measure.
  • They worked around this by creating an open source Emscripten plugin:
  • This plugin preprocesses the shaders. You can write OpenGL shaders using this feature (so they can keep the shaders as-is for Unity desktop/other platforms), and the plugin will create a dictionary of all the uniform binding points and generate GLSL code that works in WebGL with the correct bindings.
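
To give a feel for what this kind of preprocessing involves, here is a toy sketch. It is not the actual plugin, and `stripUniformLocations` is a name I made up. It assumes the simplest possible declaration form, `layout(location = N) uniform <type> <name>;`. The sketch removes the layout qualifier so the shader compiles in WebGL, and records the integer location for each uniform name in a dictionary.

```typescript
// A toy preprocessor for illustration only; the real plugin handles far more
// GLSL than this single regular expression does.
interface PreprocessedShader {
  source: string;                           // GLSL with layout qualifiers removed
  uniformLocations: Record<string, number>; // uniform name -> requested location
}

function stripUniformLocations(glsl: string): PreprocessedShader {
  const uniformLocations: Record<string, number> = {};
  const pattern =
    /layout\s*\(\s*location\s*=\s*(\d+)\s*\)\s*uniform\s+(\w+)\s+(\w+)\s*;/g;

  const source = glsl.replace(
    pattern,
    (_match: string, location: string, type: string, name: string) => {
      uniformLocations[name] = Number(location);
      return `uniform ${type} ${name};`;
    }
  );
  return { source, uniformLocations };
}

// Usage: one uniform with an explicit location, one without.
const input =
  "layout(location = 3) uniform vec4 u_color;\n" +
  "uniform mat4 u_mvp;\n";
const result = stripUniformLocations(input);

// After linking, the dictionary could be used to fill an integer-indexed
// table: table[3] = gl.getUniformLocation(program, "u_color"), so the rest
// of the engine can keep addressing uniforms by integer.
```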

The result is a 2x improvement in the worst case scenario, for a scene with 1,600 objects, each with a different material, 4 real-time lights and 1 directional shadowmap.

I hope you found this useful! You can find me on Twitter; I’d love to hear any thoughts.

You can sign up to be notified of future WebGL/WebGPU meetups. You can also participate in the WebGL developers mailing list, which is an active community of people sharing awesome projects, exchanging feedback, and asking questions.


