Let's discuss sandbox isolation

shayonj

Comments (44)

simonw
I disagree with this section about WebAssembly:
> But the practical limitation is language support. You cannot run arbitrary Python scripts in WASM today without compiling the Python interpreter itself to WASM along with all its C extensions. For sandboxing arbitrary code in arbitrary languages, WASM is not yet viable.
There are several versions of the Python interpreter that are compiled to WASM already - Pyodide has one, and WASM is a "Tier 2" supported target for CPython: https://peps.python.org/pep-0011/#tier-2 - unofficial builds here: https://github.com/brettcannon/cpython-wasi-build/releases
Likewise I've experimented with running various JavaScript interpreters compiled to WASM; the most popular of those is probably QuickJS. Here's one of my many demos: https://tools.simonwillison.net/quickjs (I have one for MicroQuickJS too: https://tools.simonwillison.net/microquickjs )
So don't rule out WASM as a target for running non-compiled languages, it can work pretty well!
pash
OK, let’s survey how everybody is sandboxing their AI coding agents in early 2026.
What I’ve seen suggests the most common answers are (a) “containers” and (b) “YOLO!” (maybe adding, “Please play nice, agent.”).
One approach that I’m about to try is Sandvault [0] (macOS only), which uses the good old Unix user system together with some added precautions. Basically, give an agent its own unprivileged user account and interact with it via sudo, SSH, and shared directories.
0. https://github.com/webcoyote/sandvault
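A minimal sketch of the "own unprivileged user account plus sudo" idea, assuming a dedicated account already exists. The account name `agent` and the helper `build_agent_command` are hypothetical, and this is not how Sandvault itself is implemented; it only shows the shape of the command line, using standard `sudo` flags.

```python
def build_agent_command(agent_user, argv):
    """Return an argv list that runs `argv` as `agent_user` via sudo.

    `agent_user` is assumed to be a dedicated, unprivileged account
    (hypothetical name, created separately by the operator).
    """
    if not argv:
        raise ValueError("argv must not be empty")
    if not agent_user or agent_user == "root":
        raise ValueError("agent_user must be a dedicated unprivileged account")
    # -n: fail instead of prompting for a password
    # -H: set HOME to the target user's home directory
    # -u: run the command as the given user
    # --: end of sudo's own options, so argv is never parsed as sudo flags
    return ["sudo", "-n", "-H", "-u", agent_user, "--", *argv]
```

The returned list can be handed to `subprocess.run` as-is. Passing an argv list rather than a shell string keeps the agent's command from being reinterpreted by a shell running as the calling user.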
lox
It's amazing how many different implementations of sandboxes have popped up in the past few weeks.
I'm CTO at Buildkite, have been noodling on one with a view to have an environment that can run CI workloads and agentic ones: https://github.com/buildkite/cleanroom
coppsilgold
The difference between gVisor and a microVM isn't very large.
gVisor can even use KVM.
What gVisor doesn't have is the big Linux kernel; it attempts to roll a subset of it on its own in Go. And while doing so it allows for more convenient (from the host side) resource management.
Imagine taking the Linux kernel and starting to modify it to have a guest VM mode (memory management merged with the host, sockets passed through, file systems coupled closer etc). As you progress along that axis you will eventually end up as a gVisor clone.
Ultimately what all these approaches attempt to do is to narrow the interface between the jailed process and the host kernel, because the default interface is vast. Makes you wonder if we will ever have a kernel with a narrow interface by default, a RISC-like syscall movement for kernels.
0xbadcafebee
A VM is table stakes for isolation. Nothing OS-level is going to prevent breaking out: the attack surface is too big and none of the common OSes are hardened enough. But also missing here is the firewall, which you need to prevent both data exfil and remote code execution from prompt injection. And the final part that's missing is segregating all credentials from the agent's execution environment, which I don't think there's any existing solution for yet. Likely this will be either MCPs, or transparent proxies with policy engines that execute requests from tool calls.
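A rough sketch of what "a proxy with a policy engine that holds the credentials" could look like. Everything here is an assumption for illustration: the `Rule` and `CredentialProxy` names, the rule shape (method, host, path prefix), and the bearer-token header. The point is only that the secret lives outside the agent's environment and is attached after a policy check.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class Rule:
    """One allowed request shape (hypothetical policy format)."""
    method: str
    host: str
    path_prefix: str


class CredentialProxy:
    """Holds secrets outside the sandbox and attaches them only to allowed requests."""

    def __init__(self, rules, tokens):
        self._rules = list(rules)
        self._tokens = dict(tokens)  # host -> secret; never handed to the agent

    def authorize(self, method, url):
        """Return auth headers for an allowed request, or None to deny it."""
        parts = urlsplit(url)
        if parts.scheme != "https":
            return None  # refuse to send credentials over plaintext
        host = parts.hostname or ""
        for rule in self._rules:
            if (
                rule.method == method.upper()
                and rule.host == host
                and parts.path.startswith(rule.path_prefix)
            ):
                token = self._tokens.get(host)
                if token is None:
                    return None
                return {"Authorization": f"Bearer {token}"}
        return None  # default deny
```

A real proxy would also need to normalize paths, log decisions, and perform the upstream request itself so the header never travels back into the sandbox; this sketch only covers the allow/deny decision.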
burntcaramel
WebAssembly is particularly attractive for agentic coding because prompting it to write Zig or C is no harder than prompting it to write JavaScript. So you can get the authoring speed of a scripting language via LLMs but the performance close to native via wasm.
This is the approach I’m using for my open source project qip that lets you pipeline wasm modules together to process text, images & data: https://github.com/royalicing/qip
qip modules follow a really simple contract: there’s some input provided to the WebAssembly module, and there’s some output it produces. They can’t access fs/net/time. You can pipe in from your other CLIs though, e.g. from curl.
I have example modules for markdown-to-html, bmp-to-ico (great for favicons), ical events, a basic svg rasterizer, and a static site builder. You compose them together and then can run them on the command line, in the browser, or in the provided dev server. Because the module contract is so simple they’ll work on native too.
grouchypumpkin
QubesOS was built to give sandboxes kernel isolation via a hypervisor.
It’s not surprising that most people don’t know about it, because QubesOS as a daily driver can be painful. But with some improvements, I think it’s the right way to do it.
niobe
The entire kernel on every arch is 40 million lines, but the kernel running on your desktop is probably less than 2 million of those lines.
bluelightning2k
Good write up. I was hoping to see V8 isolates (Cloudflare Workers) as part of the comparison, as I've always found that interesting.
mcfig
I appreciate the details in this, but I also notice it is very machine-focused. When a user wants to sandbox an AI agent, they don’t just want their local .ssh keys protected. They also want to be able to control access to a lot of off-machine resources - e.g. allowing the agent to read github issues and sometimes also make some kinds of changes.
m132
> The trade-off versus gVisor is that microVMs have higher per-instance overhead but stronger, hardware-enforced isolation.
Having worked on kernel and hypervisor code, I really don't see much of a difference in terms of isolation. Could you elaborate on this?
orangea
The first half of the article says "namespaces, cgroups, and seccomp aren't 'security boundaries' because if the kernel had a bug it could be used to escape from a sandbox". Then in the second half it says "use gvisor and do all this other stuff to avoid these problems." This presentation feels kind of dishonest to me because the article avoids acknowledging the obvious question: "well what if gvisor has a bug then?" I mean, sure, another layer of sandboxing that is simpler than the other layers probably increases security, but let's not pretend like these are fundamentally different approaches.
int0x29
It's worth pointing out another boundary: speculative execution. If sensitive data is in process memory with a WASM VM it can be read even if the VM doesn't expose it. This is also true of multiple WASM VMs running for different parties. For WASM isolation to work the VM needs to be in a separate process.
CuriouslyC
Sandbox isolation is only slightly important; you don't need to make it fancy, just a plain old VM. The really important thing is how you control the capabilities you give the agent to act on your behalf.
noperator
> compute isolation means nothing if the sandbox can freely phone home.
Here's a project I've been working on to address the network risk. Uses nftables firewall allowing outbound traffic only to an explicit pinned domain allowlist (continuously refreshes DNS resolutions in the background).
https://github.com/noperator/cagent
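To illustrate the "pinned domain allowlist with refreshed DNS resolutions" idea in isolation, here is a small sketch. It is not cagent's code and does not touch nftables: the `PinnedAllowlist` class and `resolve_ipv4` helper are hypothetical, and the sketch only models which destination IPs would be considered allowed after each refresh.

```python
import socket


def resolve_ipv4(domain):
    """Resolve a domain to its current set of IPv4 addresses."""
    infos = socket.getaddrinfo(
        domain, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    return {info[4][0] for info in infos}


class PinnedAllowlist:
    """Tracks the IPs that an explicit list of domains currently resolves to."""

    def __init__(self, domains, resolver=resolve_ipv4):
        self._domains = tuple(domains)
        self._resolver = resolver  # injectable so it can be tested offline
        self._allowed_ips = set()

    def refresh(self):
        """Re-resolve every domain and replace the allowed set in one step."""
        fresh = set()
        for domain in self._domains:
            try:
                fresh |= set(self._resolver(domain))
            except OSError:
                # Resolution failed: contribute nothing for this domain (fail closed).
                continue
        self._allowed_ips = fresh

    def is_allowed(self, ip):
        """Default deny: only IPs seen in the latest refresh are allowed."""
        return ip in self._allowed_ips
```

In a real setup `refresh()` would run on a timer and push the resulting set into the firewall, which is what actually enforces the policy; the class above is only the bookkeeping.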
bigcat12345678
Unikernel/libos is relevant
andrewmcwatters
Sharing my 5 cents on the matter: in another world, gaming, where embedding scripting languages is done for modding, I hope to see WASM take off as a way for modern modders to get into game development.
I've seen smaller developers experimenting with this, but haven't heard of larger orgs doing it, possibly because UGC took the place of modders as well, and I come from an older world where what developers of my time 20 years ago would have had their hands on was an actual SDK that wasn't a part of a long microtransaction pipeline.
In my org's case, where we built an entire game engine off Lua, and previously had done Lua integration in the Source Engine, I would have loved to have had sandboxing from the start rather than trying to think about security after the fact.
To the article's point: even if you were to add sandboxing today in those environments, I suspect you'd be faster than some of the fastest embedded scripting languages because they're just that slow.