FUSE is All You Need – Giving agents access to anything via filesystems

<- Back

FUSE is All You Need – Giving agents access to anything via filesystems

jakobem

Comments (49)

heavyset_go
IMO, for real file systems, just give a view via cgroups/namespaces.Implementing a database abstraction as a file system for an LLM feels like an extra layer of indirection for indirection's sake: just have the LLM write some views/queries/stored procs and give it sane access permissions.LLMs are smart enough to use databases, email, etc without needing a FUSE layer to do so, and permissions/views/etc will keep it from doing or seeing stuff it shouldn't. You'll be keeping access and permissions where they belong, and not in a FUSE layer, and you won't have to maintain a weird abstraction that's annoying/hampered with licensing issues if you want to deploy it cross platform.Also, your simplified FUSE abstraction will not map accurately to the state of the world unless you're really comprehensive with your implementation, and at that point, you might as well be interacting directly in order to handle that state accurately.
_pdp_
We have also attempted to implement exactly this but it turned out to be really bad architecture.The file system as an abstraction is actually not that good at all beyond the basic use-cases. Imagine you need to find an email. If you grep (via fuse) you will end up opening lots of files which will result in fetches to some API and it will be slow. You can optimise this and caching works after first fetch but the method is slow. The alternative is to leverage the existing API which will be million times faster. Now you could also create some kind of special file via fuse that acts like a search but it is weird and I don't think the models will do well with something so obscure.We went as much as implementing this idea in rust to really test it out and ultimately it was ditched because, well it sucks.
tombert
I've been getting into FUSE a bit lately, as I stole an idea that a friend had of how to add CoW features to an existing non-CoW filesystem, so I've been on/off been hacking on a FUSE driver for ext4 to do that.To learn FUSE, however, I started just making everything into filesystems that I could mount. I wrote a FUSE driver for Cassandra, I wrote a FUSE driver for CouchDB, I wrote a FUSE driver for a thing that just wrote JSON files with Base64 encoding.None of these performed very well and I'm sort of embarrassed at how terrible the code is hence why I haven't published them (and they were also just learning projects), but I did find FUSE to be extremely fun and easy to write against. I encourage everyone to play with it.
fleshmonad
Welcome back Plan9
rescrv
For my LLM/agents framework, I went with a virtual filesystem abstraction and implemented helpers for mounting and permissioning the virtual file systems: https://github.com/rescrv/claudius/blob/main/src/agent.rs#L1...
eru
See my own https://github.com/matthiasgoergens/git-snap-fs which lets you expose all branches and all tags and all commits and all everything in a git repository as a static directory tree. No need to git checkout anything: everything is already checked out.
disdi89
Do you know how does it impact the RAG usecase? Do I not need those vector databases anymore if I instead use this FUSE layer?
mickael-kerjean
> My prediction is that one of the many sandbox providers will come up with a nice API on top of this that lets you do something like ... No worrying about FUSE, the sandbox, where things are executed, etc. This will be a huge differentiator and make virtual filesystems easily accessible to everyone.I've done exactly that with Filestash [1] using its virtual filesystem plugin [2], which exposes arbitrary systems as a filesystem. It turns out the filesystem abstraction works extremely well even for systems that are not filesystems at all. There are connector for literally every possible storage (SFTP, S3, GDrive, Dropbox, FTP, Sharepoint, GCP, Azure Cloud, IPFS....), but also things like MySQL and Postgres (where the first level folder represent the list of databases, the second level is tables that belong to a database, and each row is represented as a form file generated from the schema), LDAP (where tree nodes are represented as folders and leaf are form files), ....The whole filesystem is available to agents via MCP [3] and has been published to the OpenAI marketplace since around Christmas, currently pending review.ref:[1]: https://github.com/mickael-kerjean/filestash[2]: https://www.filestash.app/docs/guide/virtual-filesystem.html[3]: https://www.filestash.app/docs/guide/mcp-gateway.html https://github.com/mickael-kerjean/filestash/tree/master/ser...
ohnoesjmr
Why not just MCP? Feels like easier to implement and doesn't need a filesystem/root/admin perms?
everlier
I've implemented agentic framework exactly like this for my current employer.It opens up absolutely bonkers capabilities.
Eikon
For ZeroFS [0], I went an alternate route with NFS/9P. I am surprised that it’s not more common as this approach has various advantages [1] while being much more workable than fuse.[0] https://github.com/Barre/ZeroFS[1] https://github.com/Barre/ZeroFS?tab=readme-ov-file#why-nfs-a...
moonlet
I am so sick of the ‘sandboxed’ AI-infra meme. A container is not a sandbox. A chroot is not a sandbox. A VM is also not a sandbox. A filesystem is also also not a sandbox. You can sandbox an application, you can run an application in a secure context, but this is not a secure context the author is describing, firstly, and secondly they haven’t described any techniques for sandboxing unless that part of the page didn’t load for me somehow.