The browser has always been a sandbox. That is its great strength and, for those of us who push it beyond rendering documents, its most stubborn constraint. Browsers isolate us from the host system, and in return we get security, portability, and a remarkably consistent runtime across billions of devices. The price is that the more ambitious the workload – machine learning inference, video editing, CAD, scientific simulation – the more we have to understand the sandbox we live in. WebAssembly is the clearest example of that trade-off today, and it has quietly stopped feeling like an experiment and started feeling like infrastructure.

This article is, on the surface, about WebAssembly. Underneath, it is about something more basic: if you want to use a powerful tool well, you have to understand the environment it runs in.

What “performance” actually means

Most conversations about WebAssembly begin with the word performance, usually used as if it were self-explanatory. It is not. Performance is inversely proportional to the time your program takes to run, which raises the real question: what is the run time of your program?

Consider a small JavaScript function – say, a perfectly valid diff function of a few lines. Your CPU does not understand any of it. What the CPU understands is machine code, and assembly is the textual form of those binary instructions. That tiny diff function compiles down to several hundred machine instructions, and those instructions do not execute neatly one after another. They take different amounts of time, they may run out of order, and the whole choreography depends on the architecture of the chip. “How fast is my code?” is, in practice, a question about a pipeline you cannot see.

Turning source code into those instructions is the job of a compiler. The input is the source; the output is the target. Between them sits a pipeline of intermediate representations – IRs – that the compiler walks and rewrites. The parts that deal with the source language (parsing, building an AST, early analysis) are called the frontend; the parts that emit and optimize for the target are called the backend. The split is what lets you bolt a C++ frontend onto a WebAssembly backend and get a C++-to-Wasm compiler – or swap the frontend for Rust and get a Rust-to-Wasm compiler for free.

Inside those passes, the compiler does most of the work that makes modern software fast: function inlining, common subexpression elimination, constant propagation, code motion out of loops, loop unrolling, register allocation (a famously NP-complete problem), branch layout, and target-specific instruction selection. A trivial C program that computes a factorial-like product can collapse, under LLVM’s optimiser, into a single constant in the emitted IR. The loop simply disappears.
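To make one of those passes concrete, here is a minimal sketch of constant folding over a toy expression tree, written in TypeScript. The `Expr` type and the `fold`, `num` and `mul` helpers are hypothetical names invented for this illustration; a real optimiser such as LLVM's works on a far richer IR and combines this with many other passes.

```typescript
// A toy expression IR: numeric literals, named variables, and binary operations.
type Expr =
  | { kind: "num"; value: number }
  | { kind: "var"; name: string }
  | { kind: "bin"; op: "+" | "*"; left: Expr; right: Expr };

// Constant folding: evaluate any sub-tree whose operands are all literals.
function fold(e: Expr): Expr {
  if (e.kind !== "bin") return e;
  const left = fold(e.left);
  const right = fold(e.right);
  if (left.kind === "num" && right.kind === "num") {
    const value = e.op === "+" ? left.value + right.value : left.value * right.value;
    return { kind: "num", value };
  }
  // At least one operand is unknown at compile time, so the operation stays.
  return { kind: "bin", op: e.op, left, right };
}

// Small constructors to keep the example tree readable.
const num = (value: number): Expr => ({ kind: "num", value });
const mul = (left: Expr, right: Expr): Expr => ({ kind: "bin", op: "*", left, right });

// ((1 * 2) * 3) * 4: the shape a fully unrolled factorial-like loop might take.
const product = mul(mul(mul(num(1), num(2)), num(3)), num(4));
const folded = fold(product);
console.log(folded); // { kind: 'num', value: 24 }

// A tree that mentions a variable cannot be collapsed completely.
const partial = fold(mul(num(2), { kind: "var", name: "x" }));
console.log(partial.kind); // bin
```

The three multiplications are gone from the output, which is the same effect, in miniature, as the loop disappearing from the emitted IR.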

A representation designed for the web

If you were going to design an intermediate representation specifically for the web, what would it look like?

It would need to be cheap to turn into machine code, because the user is waiting while that happens. It would need to be independent of any particular CPU architecture – a virtual Instruction Set Architecture, in the same family as the JVM. It would need to be compact, because it travels over the network, which means a binary encoding rather than text. And it would need to be cheap to parse, ideally in a single pass, with enough structure that a compiler can start work before the download even finishes.

Put those constraints together and you get a stack-based virtual machine with a binary format and structured control flow. You get, in other words, WebAssembly.

A Wasm module is exactly what a web developer would hope for: it has imports and exports, a linear memory (a virtual ArrayBuffer that is later mapped to real memory), functions made of instructions, and a table for indirect references such as function pointers. Its control flow is structured – no arbitrary jumps, only blocks and labels – which is what makes single-pass parsing and compilation possible. To add two parameters you write local.get 0, local.get 1, i32.add: the two local.get instructions push the parameters onto the stack, and i32.add names no operands at all because it takes them from the stack. The result is dense, predictable, and friendly to a compiler in a hurry.
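The add example can be tried without any toolchain. The sketch below hand-assembles the binary encoding of a module that exports a single `add` function and instantiates it with the standard WebAssembly JavaScript API. The names `addModuleBytes`, `addInstance` and `wasmAdd` are chosen for this illustration, and the synchronous constructors are used only to keep the snippet self-contained.

```typescript
// Hand-assembled bytes for a module exporting add(a: i32, b: i32) -> i32.
const addModuleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // magic: "\0asm"
  0x01, 0x00, 0x00, 0x00, // binary format version 1
  // type section: one function type, (i32, i32) -> i32
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,
  // function section: one function, using type 0
  0x03, 0x02, 0x01, 0x00,
  // export section: export function 0 under the name "add"
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,
  // code section: one body, no extra locals
  0x0a, 0x09, 0x01, 0x07, 0x00,
  0x20, 0x00, // local.get 0
  0x20, 0x01, // local.get 1
  0x6a,       // i32.add
  0x0b,       // end
]);

const addInstance = new WebAssembly.Instance(new WebAssembly.Module(addModuleBytes));
const wasmAdd = addInstance.exports.add as (a: number, b: number) => number;

console.log(addModuleBytes.length); // 41
console.log(wasmAdd(2, 3)); // 5
```

The whole module, headers and export name included, fits in 41 bytes, and the function body itself is six of them.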

JavaScript and WebAssembly in V8

To see why this matters, compare the lifecycle of a JavaScript file and a Wasm module inside V8, the engine that powers Chrome and the Chromium family.

JavaScript arrives, gets parsed (on a background thread, which is the source of a lot of clever pipelining), and lands in Ignition, V8’s interpreter. Ignition does two important things: it generates bytecode for a register-based VM, and it collects feedback about the types your code actually uses, because JavaScript is dynamically typed and the engine internally is not. Cold functions get cheap, unoptimised code from the Sparkplug baseline compiler. Hot functions – those that have been called enough times, often inside a loop – get sent to TurboFan, which rebuilds them through its Sea of Nodes IR into something genuinely fast.

This is wonderful until the assumptions break. Call an optimised function with a string where you used to pass a number and the engine deoptimises: the fast code is thrown away and execution falls back to the slow path. The peak speed of JavaScript is excellent. The predictability of that speed is not.
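The pattern that triggers this is easy to write by accident. The sketch below shows its shape only: the function `sum` and the loop count are hypothetical, and whether or when the engine optimises or deoptimises is an internal decision that cannot be observed from the code itself.

```typescript
// A function whose argument types the engine can only learn by watching it run.
function sum(a: any, b: any): any {
  return a + b;
}

// Warm-up: every call sees numbers, so the collected type feedback
// describes a "number + number" operation.
let total = 0;
for (let i = 0; i < 100000; i++) {
  total = sum(total, 1);
}

// The same function now sees strings. The result is still correct, but code
// specialised for numbers cannot be used for this call.
const greeting = sum("Web", "Assembly");

console.log(total, greeting); // 100000 WebAssembly
```

Nothing in the output reveals the switch, which is exactly why the cost is hard to predict from reading the source.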

WebAssembly’s story in V8 is calmer. A typical Wasm pipeline starts well before the browser: C++ source is fed to Clang, which emits LLVM IR; the LLVM WebAssembly backend lowers that to a `.wasm` module; and a tool like Binaryen optionally squeezes it further. Emscripten orchestrates the whole flow, and it is fair to say that Emscripten is the reason WebAssembly arrived in browsers in usable shape at all – it predates Wasm by years, originally targeting asm.js.

Inside V8, the module hits two compilers. Liftoff is a baseline compiler that streams through the bytes and emits machine code quickly, leaning on the optimisations already baked into the Wasm format. TurboFan then re-compiles selected functions in the background for higher quality code. Crucially, there are no deoptimisations: once a Wasm function is fast, it stays fast. You also get *streaming compilation* – parsing and compiling begin before the download finishes – and parallel compilation across multiple threads, because the module format was designed for it. For modules larger than 128 KB, Chromium even caches the compiled machine code, not just the source bytes.

The integration with JavaScript is deliberately boring, which is a compliment. You call WebAssembly.instantiateStreaming, hand the module a WebAssembly.Memory you control the size of, and pass in an import object whose values are ordinary JavaScript functions. From inside Wasm you can call those imports – console.log, fetch handlers, anything – and both sides can read and write the shared memory. The boundary is narrow and explicit, which is exactly what you want when you are reasoning about performance.
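Here is a minimal sketch of that boundary. It hand-assembles a module that imports a `log` function and a memory from an `env` namespace and exports a `run` function. Those names, and the value 42, are assumptions made for this example. In a browser the same import object would be passed to `WebAssembly.instantiateStreaming(fetch(url), imports)`; the synchronous constructors are used here only so the snippet runs without a server.

```typescript
// Hand-assembled module: imports env.log and env.memory, exports run().
// run() stores 42 at byte offset 0 of the imported memory, then calls log(42).
const moduleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic and version
  // type section: type 0 = (i32) -> (), type 1 = () -> ()
  0x01, 0x08, 0x02, 0x60, 0x01, 0x7f, 0x00, 0x60, 0x00, 0x00,
  // import section: two imports
  0x02, 0x19, 0x02,
  // "env"."log": function of type 0
  0x03, 0x65, 0x6e, 0x76, 0x03, 0x6c, 0x6f, 0x67, 0x00, 0x00,
  // "env"."memory": memory with a minimum of 1 page
  0x03, 0x65, 0x6e, 0x76, 0x06, 0x6d, 0x65, 0x6d, 0x6f, 0x72, 0x79, 0x02, 0x00, 0x01,
  // function section: one function of type 1 (index 1, after the imported log)
  0x03, 0x02, 0x01, 0x01,
  // export section: export function 1 as "run"
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01,
  // code section: one body, no extra locals
  0x0a, 0x0f, 0x01, 0x0d, 0x00,
  0x41, 0x00,       // i32.const 0   (address)
  0x41, 0x2a,       // i32.const 42  (value)
  0x36, 0x02, 0x00, // i32.store, 4-byte alignment, offset 0
  0x41, 0x2a,       // i32.const 42
  0x10, 0x00,       // call 0        (the imported log)
  0x0b,             // end
]);

// The memory is created and owned by JavaScript: one 64 KiB page.
const memory = new WebAssembly.Memory({ initial: 1 });
const logged: number[] = [];

// Import values are ordinary JavaScript objects and functions.
const imports = {
  env: {
    memory,
    log: (value: number) => { logged.push(value); },
  },
};

const instance = new WebAssembly.Instance(new WebAssembly.Module(moduleBytes), imports);
(instance.exports.run as () => void)();

// Wasm memory is little-endian, so read the word back explicitly as such.
const firstWord = new DataView(memory.buffer).getUint32(0, true);
console.log(logged, firstWord); // [ 42 ] 42
```

Both directions of the boundary are visible: Wasm called a JavaScript function, and JavaScript read a value that Wasm wrote into the shared memory.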

Where the platform stands today

For most of WebAssembly’s life, the honest summary has been “MVP, with a long roadmap.” That summary is no longer quite right.

Fixed-width SIMD – 128-bit vector instructions that let you apply one operation to many lanes of data at once – is the marquee feature for workloads like image processing, audio, and ML inference. It shipped in Chrome and Firefox and has become the de facto baseline in those engines. Safari is the conspicuous holdout in stable releases, but the feature is now implemented and riding in Safari Technology Preview 161, so broad cross-browser availability is close.
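Until availability is universal, a page that ships a SIMD build usually needs a runtime check so it can fall back to a scalar build. One way to sketch such a check is to ask the engine to validate a tiny module that uses a SIMD instruction. The probe bytes and the name `hasWasmSimd` below are written for this illustration; `WebAssembly.validate` is the standard API.

```typescript
// Hand-assembled probe module whose only function returns a v128:
//   i32.const 0; i8x16.splat; end
const simdProbe = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic and version
  0x01, 0x05, 0x01, 0x60, 0x00, 0x01, 0x7b,       // type section: () -> v128
  0x03, 0x02, 0x01, 0x00,                         // function section: one function, type 0
  0x0a, 0x08, 0x01, 0x06, 0x00,                   // code section: one body, no locals
  0x41, 0x00,                                     // i32.const 0
  0xfd, 0x0f,                                     // i8x16.splat
  0x0b,                                           // end
]);

// validate() returns false instead of throwing when the engine rejects the
// bytes, so an engine without SIMD support simply reports false here.
function hasWasmSimd(): boolean {
  return WebAssembly.validate(simdProbe);
}

console.log(`SIMD available: ${hasWasmSimd()}`);
```

The result depends on the engine running the code, so the sketch prints it rather than assuming a value.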

Exception handling is genuinely cross-browser. Chrome and Safari shipped it earlier; Firefox completed the set in May. That matters because the JavaScript-based workaround was both slower and larger, and it was a real obstacle for languages like C++ and C# whose runtimes lean heavily on exceptions. Toolchains responded quickly: .NET 7 added support for both exception handling and SIMD, and the Uno Platform did the same, going so far as to ship experimental WebAssembly threading.

Tail calls, useful for compiler-generated code and for languages that rely on deep recursion, sat behind a flag in Chrome for a long time because the proposal could not reach phase 4 without a second implementation. Safari has now supplied that second implementation; tail calls are in Safari Technology Preview 161 as well, clearing the path to standardisation.

Garbage collection is perhaps the most pleasant surprise. The GC proposal had been parked at phase 1 since 2017. It advanced to phase 2 in February and to phase 3 in October. Every major browser team is building support, and compiler maintainers for several managed languages are already preparing to target it. The pay-off is significant: today, languages that need a GC either ship one inside the module or compile their entire runtime to Wasm, both of which inflate download size and cold-start time.

WebAssembly 2.0. A 2.0 revision of the specification is now underway. It is less a “new WebAssembly” than a packaging exercise: it bundles the features that have been standardised since the 1.0 MVP – SIMD, bulk memory operations, reference types, multi-value returns, sign-extension operators, and others – into a single version that runtimes can claim conformance to.

Multiple memories. Allowing a module to hold more than one linear memory would let it cleanly separate internal state from buffers exposed to the outside world, with real security benefits. It is a feature I would like to see move, but no browser team appears to be actively implementing it at the moment.

Threads remain partially shipped: the underlying primitives (shared memory, atomics) have been in Chrome and Firefox for some time, but only in contexts that meet cross-origin isolation requirements. They are still experimental enough that you should treat them as an optimisation, not a foundation.
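Treating threads as an optimisation means checking for them at runtime and keeping a single-threaded path. The sketch below is one possible check, under the assumption that the host either exposes the browser's `crossOriginIsolated` flag or, like Node.js, does not define it at all. The function name `canShareWasmMemory` is hypothetical.

```typescript
function canShareWasmMemory(): boolean {
  // Browsers expose `crossOriginIsolated`; other hosts may not define it.
  const isolated = (globalThis as unknown as { crossOriginIsolated?: boolean })
    .crossOriginIsolated;
  if (isolated === false) return false;

  try {
    // A shared memory must declare a maximum size.
    const descriptor = { initial: 1, maximum: 1, shared: true };
    const memory = new WebAssembly.Memory(descriptor);
    // A shared Wasm memory is backed by a SharedArrayBuffer.
    return Object.prototype.toString.call(memory.buffer) === "[object SharedArrayBuffer]";
  } catch {
    // The engine rejected the shared descriptor.
    return false;
  }
}

console.log(`shared Wasm memory usable: ${canShareWasmMemory()}`);
```

A caller would pick the threaded build only when this returns true and fall back to the single-threaded build otherwise.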

WebAssembly outside the browser

The most consequential recent development, in my view, did not happen in a browser at all.

containerd, the container runtime that sits underneath Docker and Kubernetes, gained the ability to load a Wasm shim that hands modules off to a WebAssembly runtime. The practical effect is that Wasm modules can now run alongside ordinary Linux containers on the same orchestration substrate. Docker Desktop ships an integration that lets you docker run a .wasm image; Microsoft has a preview of AKS with WASI node pools.

The appeal is straightforward. Wasm modules have no ambient access to the host: they see only what their WASI imports grant them, which is a meaningful improvement over the default container threat model. The artefacts are small, often a fraction of an equivalent OCI image, and they start in milliseconds rather than seconds – properties that matter enormously for serverless and edge workloads. And because the images live in ordinary OCI registries, the existing supply chain (Docker Hub, private registries, signing, scanning) carries over.

A related and very clever piece of work came from the Bytecode Alliance, who compiled SpiderMonkey – Firefox’s JavaScript engine – to WASI and then pre-initialised it. By executing the user’s JavaScript at build time and snapshotting the resulting bytecode straight into the module’s data section, they reported cold-start times around 0.36 milliseconds, roughly 13× faster than starting the engine from scratch. The implication is that JavaScript itself becomes embeddable in any WASI host, at speeds that make per-request isolation realistic.

Should you put WebAssembly in your project?

Yes – but think first.

If you have existing non-JavaScript code that you want to bring to the browser, WebAssembly is the right answer and has been for a while. The toolchains are mature, the integration with JavaScript is well understood, and the performance is predictable in a way the JavaScript engine cannot match.

If you are looking at WebAssembly because the benchmarks are exciting and the conference talks are persuasive, please resist. Hype-driven adoption is a poor reason to take on a second toolchain, a second runtime, and a new debugging story. Measure first. Find the part of your program that is actually a bottleneck. Optimise it in plain JavaScript until you have clearly hit the ceiling of what the engine can do. Only then is WebAssembly the right tool, and when it is, you will know exactly which function to port and why.
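Measuring does not require special tooling to get started. The following is a minimal timing sketch, assuming a host that provides the standard `performance.now()` clock. The helper `medianMs`, the sample workload, and the run count are all hypothetical; a profiler will tell you more once you know where to look.

```typescript
// Time `fn` over several runs and report the median, which is less sensitive
// to one-off pauses (JIT warm-up, garbage collection) than a single sample.
function medianMs(fn: () => void, runs = 15): number {
  const samples: number[] = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    fn();
    samples.push(performance.now() - start);
  }
  samples.sort((a, b) => a - b);
  return samples[Math.floor(samples.length / 2)];
}

// Hypothetical hot spot: summing squares over a large typed array.
const data = new Float64Array(1000000).map((_, i) => i % 100);
let sink = 0; // keep the result alive so the work cannot be discarded

const ms = medianMs(() => {
  let acc = 0;
  for (let i = 0; i < data.length; i++) acc += data[i] * data[i];
  sink = acc;
});

console.log(`median: ${ms.toFixed(3)} ms over 15 runs`);
```

Run the same harness against the candidate function before and after each JavaScript-level optimisation, and only reach for a port once the numbers stop moving.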

The MVP era is ending. Exception handling is everywhere, SIMD is nearly everywhere, garbage collection is in sight, tail calls are on their way to standardisation, and the runtime is escaping the browser into containers and edge platforms. WebAssembly rewards the developers who take the time to understand the sandbox it runs in – and increasingly, the hosts beyond it.

Author

Gleb Khmyznikov

I’m Gleb Khmyznikov, a software engineer at Microsoft working on PWABuilder, a Progressive Web Apps expert and speaker, an open-source developer, and the author of the widely used pwa-install library. I’m passionate about solving complex problems and building practical tools and libraries for the web.

Microsoft Software Development Engineer II, Belgrade, Serbia

Gleb Khmyznikov 27 October 2022
