Comments (162)
- bitexploder: I started setting up my workflows using Temporal. It deploys as a relatively lightweight local app. For an isolated local installation it uses SQLite. It makes the process of dealing with API retries and organizing workflows and tasks really simple. I recommend giving it a try. It is, philosophically, exactly what this article is suggesting, but it adds an incredibly rich and flexible interface for agents to work with. Additionally, the web UI makes it very easy to inspect workflows, review agent execution, etc. Temporal also encodes much higher reliability into your system, almost for free. Distributed and reliable systems are hard, don't reinvent the wheel IMO.

  If you find yourself wanting things like an easy way to introspect your SQLite database, figure out what is happening in the workflow, compose individual tasks, make workflows trivially callable, etc., give Temporal a look.

  Alongside this, I have mostly moved away from files for agents. Markdown and JSON are great, but also feel like traps when building out smaller local apps. LLMs are great at SQLite and you can render anything you want out of it (Markdown, JSON, etc.). It saves a lot of tokens when an agent can just query a specific row instead of having to fire up jq or grep through Markdown. You get a nice, portable, self-contained data management system that encourages agents to be more disciplined about how they structure their data than a bunch of files. It also continues to scale into MySQL/Postgres: if your little local projects start to outgrow SQLite or become more formal, you already have schema and discipline around data.
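To make the "query a specific row, render anything out of it" point concrete, here is a minimal sketch using Python's standard `sqlite3` module. The `notes` table, its columns, and the helper names are assumptions made up for illustration; they do not come from the comment or from Temporal.

```python
import json
import sqlite3

# Hypothetical schema: one row per agent note (names are illustrative only).
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us access columns by name
conn.execute(
    "CREATE TABLE notes (id INTEGER PRIMARY KEY, topic TEXT NOT NULL, body TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO notes (topic, body) VALUES (?, ?)",
    [
        ("retries", "Back off exponentially on HTTP 429."),
        ("schema", "Keep one table per entity."),
    ],
)
conn.commit()


def get_note(conn, topic):
    """Fetch a single note by topic instead of scanning a whole file."""
    row = conn.execute(
        "SELECT id, topic, body FROM notes WHERE topic = ?", (topic,)
    ).fetchone()
    return dict(row) if row else None


def render_markdown(conn):
    """Render every note as a Markdown section, ordered by insertion."""
    rows = conn.execute("SELECT topic, body FROM notes ORDER BY id")
    return "\n".join(f"## {r['topic']}\n\n{r['body']}\n" for r in rows)


note = get_note(conn, "retries")
note_json = json.dumps(note)      # JSON view of one row
markdown = render_markdown(conn)  # Markdown view of the whole table
```

The same table backs both views, so an agent only pays for the rows it asks for.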
- levkk: I don't understand this obsession with SQLite for real, production apps. SQLite is an embedded database, completely unsuitable for managing concurrency. This is what database _servers_ are for, e.g., Postgres, MySQL, etc. Their entire job is to allow you to modify data from multiple processes, on different machines, at the same time.

  This is a foundational principle of computer science. It seems to me that the "SQLite for everything" crowd is a little bit inexperienced.
- fathermarz: Excellent write-up, and it inspired me for our next IA design run. After reading Fly's Litestream work, it makes me think this is a solid option.
- flying_sheep: Cloudflare Durable Objects are implemented with SQLite (or some variant of it).
- m2f2: There's a wide gap between files and multi-partition databases. Running databases in a container is not for me, sorry, whenever real production stuff is on the table.

  Personally, lots of ETL can just be taken care of locally without involving enterprise databases. In such cases, DuckDB is 5x-10x better than SQLite and orders of magnitude simpler/faster than spinning up a dedicated Postgres database.

  For general scripting, there's no contest between a 20-line awk script and a much cleaner, more robust, more maintainable equivalent SQL script based on DuckDB.

  I just hope MotherDuck doesn't need to pump and dump for an IPO; it would be sad to lose that tool to the usual corporate greed.
- Thaxll: I started using SQLite for a home project after years of reading about it, and coming from Postgres I was shocked at the poor type system. It is really inferior; not sure why it gets so much praise.

  https://sqlite.org/datatype3.html
  https://www.postgresql.org/docs/current/datatype.html

  Working with date/time feels like using a 30-year-old database: nothing is enforced at insert. Really, someone needs to explain why so many people like it.
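A small sketch of the behavior described here, and of one way to tighten it with `CHECK` constraints. The table and column names are assumptions for illustration. SQLite 3.37+ also offers `STRICT` tables for type enforcement, though they add no dedicated date type; this sketch sticks to `CHECK` so it does not depend on a recent library version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Default behavior: column types are only "affinities", so mismatched values are stored as-is.
conn.execute("CREATE TABLE loose (n INTEGER, d TEXT)")
conn.execute("INSERT INTO loose VALUES ('abc', 'not a date')")
loose_types = conn.execute("SELECT typeof(n), typeof(d) FROM loose").fetchone()

# Opt-in enforcement: reject non-integers, and reject anything date() cannot
# round-trip as a canonical YYYY-MM-DD string.
conn.execute(
    """
    CREATE TABLE checked (
        n INTEGER CHECK (typeof(n) = 'integer'),
        d TEXT    CHECK (d IS date(d))
    )
    """
)
conn.execute("INSERT INTO checked VALUES (1, '2024-01-15')")  # accepted

try:
    conn.execute("INSERT INTO checked VALUES ('abc', 'not a date')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # the CHECK constraints refused the row
```

This does not give SQLite a Postgres-style type system; it only shows that enforcement at insert is something the schema author has to ask for.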
- PUSH_AX: I went from using the various big-player Postgres clusters to SQLite. We have MAU in the seven figures, all backed by SQLite Durable Objects. We have to think differently about the access patterns, but the benefits have been worth it.
- shukantpal: SQLite is surprisingly performant for single-node applications, even compared to Postgres. Postgres consumes a lot more memory and requires IO to hop through IPC, whereas you can keep everything in-process in SQLite with a shared connection pool.

  I've been testing different storage engines for my agent harness, and I can get up to 7.5k concurrent sessions on a single vCPU with SQLite, whereas Postgres crashes or runs out of connections [0].

  [0] https://github.com/impalasys/talon/pull/23#issuecomment-4577...
- stephenlf: Can’t wait to see the next iteration of this idea with “Logs are all you need for durable workflows.”
- teravor: if you have an application that needs to maintain state in a non-critical section, or if you discover that using SQL is actually a good idea for some tasks (even in critical sections), SQLite is not only a good choice but it will save you a lot of time coming up with a brittle custom solution.

  maintain an in-memory SQLite db and work it with SQL commands, and if you also want to preserve state across application restarts you can routinely save to disk or load from it: <https://www.sqlite.org/backup.html#example_1_loading_and_sav...>

  this also happens to be the most convenient file format (a.k.a. application format) I ever worked with.
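The save/load pattern described above can be sketched with `sqlite3.Connection.backup` from the Python standard library (Python 3.7+), which wraps SQLite's online backup API. The `kv` table, the file name, and the helper names are assumptions for illustration.

```python
import os
import sqlite3
import tempfile


def save_to_disk(mem_conn, path):
    """Copy the whole in-memory database into a file on disk."""
    disk = sqlite3.connect(path)
    try:
        mem_conn.backup(disk)
    finally:
        disk.close()


def load_from_disk(path):
    """Create a fresh in-memory database populated from the file."""
    mem = sqlite3.connect(":memory:")
    disk = sqlite3.connect(path)
    try:
        disk.backup(mem)
    finally:
        disk.close()
    return mem


# Working state lives in memory while the application runs.
state = sqlite3.connect(":memory:")
state.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT)")
state.execute("INSERT INTO kv VALUES ('last_run', 'ok')")
state.commit()

# Persist it, then simulate a restart by loading into a new connection.
path = os.path.join(tempfile.mkdtemp(), "state.db")
save_to_disk(state, path)
restored = load_from_disk(path)
value = restored.execute("SELECT value FROM kv WHERE key = 'last_run'").fetchone()[0]
```

Anything written after the last `save_to_disk` call is lost on a crash, so how often to save is a durability decision for the application.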
- prmph:
  > Postgres ... is the right choice when you need higher availability, broader shared scalability, or other deployment properties that are better served by a network database. It is also the better fit when asynchronous replication to object storage is not the durability model you want... Many workflow systems do not need that on day one and should not start with more infrastructure than their state actually demands.

  I see this kind of YAGNI thinking a lot, but in my view it must be balanced against the effort you'd put into resolving any edge cases and adapting the current architecture to your use case.

  Imagine you deploy SQLite and, though it is fine by itself, you keep running into unforeseen challenges with the use to which you are putting it. You'd need to sink valuable time and effort into addressing those. Then, when you have outgrown it, you'd need to spend additional valuable time doing the same with Postgres.

  This is why, when it comes to architecture, I increasingly find myself over-engineering a bit. Assuming there is a good chance you might need to upgrade your architecture in the not-too-distant future, that approach is actually quite efficient. I find that I am able to uncover a lot of potential gotchas, which feeds back into what the simplified current architecture should be, and helps me understand the roadmap I'm facing very well. I also avoid wasting too much time going too deep in directions that make sense now, but need a lot of plumbing to get right, when I can see that I'd likely have to throw it all out in a few years. Going from A -> B -> C -> D, where each step is the optimal good-enough-for-now architecture but requires a lot of work to stabilize and iron out the kinks, is much less efficient than exploring D well enough to know whether you should build A, B, or C now.

  Basically, some over-engineering, if done right, is not wasted. It cuts right to the heart of what you are dealing with, efficiently, and allows you to make (maybe) simpler but informed choices as to how best to allocate your development resources now.
- golem14: Litestream releases 5.9 and newer have a bug that causes instances to sync an insane amount of data. A DB with <10K of data in it and practically no writes/reads causes something like 10GB of daily replication traffic. For my toy project that got needlessly expensive.
- yokoprime: If you're just doing workflows from a single node, I guess it can be OK as long as there's a single writer. But when scaling across multiple servers, it clearly is not all you need.
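The single-writer constraint can be seen directly with two connections to the same file. This is a minimal sketch using Python's `sqlite3`; the `steps` table and connection names are illustrative assumptions, and `timeout=0` is chosen only so the second writer fails immediately instead of waiting.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "workflow.db")

# isolation_level=None puts the connections in autocommit mode, so the
# BEGIN/COMMIT statements below are the only transaction boundaries.
writer_a = sqlite3.connect(path, isolation_level=None, timeout=0)
writer_b = sqlite3.connect(path, isolation_level=None, timeout=0)
writer_a.execute("CREATE TABLE steps (name TEXT PRIMARY KEY, status TEXT)")

# Writer A takes the write lock and holds it.
writer_a.execute("BEGIN IMMEDIATE")
writer_a.execute("INSERT INTO steps VALUES ('fetch', 'done')")

# Writer B cannot start its own write transaction while A holds the lock.
try:
    writer_b.execute("BEGIN IMMEDIATE")
    second_writer_blocked = False
except sqlite3.OperationalError:  # "database is locked"
    second_writer_blocked = True

writer_a.execute("COMMIT")

# Once A commits, B can write.
writer_b.execute("BEGIN IMMEDIATE")
writer_b.execute("INSERT INTO steps VALUES ('parse', 'done')")
writer_b.execute("COMMIT")

count = writer_a.execute("SELECT COUNT(*) FROM steps").fetchone()[0]
```

On one machine a non-zero busy timeout makes writers queue rather than fail; across machines there is no shared lock to queue on, which is the limitation being pointed out.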
- kubik369: Meta comment: This is a domain under my country's TLD (Slovakia), and it is one of the handful of words that form a word together with the TLD in my language and (coincidentally) also in English. Every now and then, I check a retrograde dictionary for domains that have this property, and the root of this particular domain had a Roundcube email server on it (can be checked on archive.org). After further checking, the local company actually named themselves Obeli s.r.o. (s.r.o. is Ltd), presumably so that they could use a domain that is a real word when said together with the TLD. (EDIT:) Forgot to write the thing I wanted to mention in the first place: it appears the domain must have lapsed and/or the author bought it from the company that was using it.

  Another fascinating fact: our country's TLD was stolen Ocean's 11 style (I am not kidding). After Czechoslovakia split into the Czech Republic and the Slovak Republic, the newly created Slovak .sk TLD was under the care of people from the local university. The university also had some offices that it was leasing out. Someone leased this office space (EDIT: this is important, as it means they had the same physical address) and created a company that had the same name as the NGO that was taking care of the domain, so e.g. the NGO was named "My Company o.z." and the perpetrator created a "My Company s.r.o." (our country's version of the American Ltd). This person then wrote to ICANN to change the address to the "My Company s.r.o.", presumably under the pretense that this was just an administrative error, and from this point they had functionally taken custody of the TLD. I was not able to find out how they did it technically, but I presume they persuaded ICANN to then point to their servers instead of the real ones. After this happened, it seems that no one noticed for some time. When they noticed, they tried taking it back, but they weren't able to. For some inexplicable reason, the government during that time (Šuster era, early 2000s) gave the new company a contract that was functionally uncancellable from the government side. Later governments made this even more uncancellable, and in 2017 the then Minister of IT (and as of today, president!) Pellegrini made the contract literally uncancellable. As a result, we have one of the most expensive domains around (18 EUR/year, rising each year for no good reason). (EDIT:) The company running our country's TLD is now a foreign entity that the whole thing has been sold to (multiple owners over time), and we as a country have no control over it, if I understand correctly.

  I might have gotten some details wrong, as I am writing this from my memory of researching it a couple of years back, but you get the idea: crazy stuff. Here is an article in Czech [0] that tells the story a bit better, but you have to translate it.

  [0] https://www.root.cz/clanky/pribeh-domeny-sk-aneb-kradez-za-b...

  // EDIT: I have found that the article actually links to the movement to return the TLD [1]. It also has a story tab [2], so they have something much more precise than the paraphrasing I wrote.

  [1] https://www.nasadomena.sk/
  [2] https://www.nasadomena.sk/historia/
- Xcelerate: Haha, I just started doing this on my own. Found it helps the agents preserve state better. I typically ask them to design a DAG first based on a set of specifications and then execute it (each step stores something in a SQLite DB). Iteration is pretty simple then because I just ask for a tweak to one or two steps of the DAG, and then to re-run.

  Funny how people are independently converging on similar patterns of "what works" here. Still feels like we're in the wild west with all these ad-hoc patterns of agent orchestration that people are coming up with.
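One way the "DAG whose steps store their output in SQLite" pattern could look is sketched below. This is an assumption-laden illustration, not the commenter's setup: the `results` table, the `run_dag` helper, and the three toy steps are all hypothetical. It uses `graphlib.TopologicalSorter` from the standard library (Python 3.9+), where the DAG maps each step to the set of steps it depends on.

```python
import sqlite3
from graphlib import TopologicalSorter


def run_dag(conn, dag, steps):
    """Run steps in dependency order, skipping any step whose output is already stored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results (step TEXT PRIMARY KEY, output TEXT)"
    )
    executed = []
    for name in TopologicalSorter(dag).static_order():
        done = conn.execute(
            "SELECT 1 FROM results WHERE step = ?", (name,)
        ).fetchone()
        if done:
            continue  # result survived from an earlier run
        # Load the stored outputs of this step's dependencies.
        inputs = {
            dep: conn.execute(
                "SELECT output FROM results WHERE step = ?", (dep,)
            ).fetchone()[0]
            for dep in dag.get(name, ())
        }
        output = steps[name](inputs)
        with conn:  # commit each step's result as soon as it finishes
            conn.execute("INSERT INTO results VALUES (?, ?)", (name, output))
        executed.append(name)
    return executed


# Hypothetical three-step pipeline.
dag = {"fetch": set(), "parse": {"fetch"}, "report": {"parse"}}
steps = {
    "fetch": lambda inputs: "raw data",
    "parse": lambda inputs: inputs["fetch"].upper(),
    "report": lambda inputs: f"report on {inputs['parse']}",
}

conn = sqlite3.connect(":memory:")
first_run = run_dag(conn, dag, steps)

# "Tweak one step and re-run": change the step, drop its stored result, run again.
steps["report"] = lambda inputs: f"short report on {inputs['parse']}"
with conn:
    conn.execute("DELETE FROM results WHERE step = 'report'")
second_run = run_dag(conn, dag, steps)
```

Only the invalidated step runs the second time. A fuller version would also invalidate everything downstream of a changed step.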
- mburaksayici: Agreed on the point. I needed a NoSQL version for similar uses, so I used TinyDB: https://mburaksayici.com/blog/2024/09/21/easy-to-use-nosql-p...
- gunnarmorling: Related piece I wrote some time ago: https://www.morling.dev/blog/building-durable-execution-engi...
- 3dedb728-3f77: Is this just an AWS ad?
- skybrian: Instead of "just use Litestream," I'd like to see a review of different object stores one could use and which ones work well with Litestream. Is there a nice object store I could run in another Linux VM? As a hobbyist, which services providing an S3-like API make the most sense?
- sgloutnikov: It's close enough that DBOS does support SQLite [0]. The default for prototyping is SQLite, but sure, you could run it in production if you wanted.

  Obligatory list of workflow engines and libraries, because it's such a common need that a lot of people have rolled their own [1].

  [0] https://docs.dbos.dev/python/tutorials/database-connection
  [1] https://github.com/meirwah/awesome-workflow-engines
- orliesaurus: Surprised no one has mentioned Turbopuffer yet [1], which natively supports dense vector similarity and BM25 keyword indexes out of the box.

  [1] https://turbopuffer.com/
- localhoster: Idk if this article was vibe-written or the author just "got adjusted", but it clearly is, and it's unreadable. Man, this is becoming annoying.
- 0x59: Big, complex data model with ambiguous query patterns? Postgres.

  Small, well-defined data model with known query patterns? Bespoke model.

  There probably is a place for SQLite, but my project space so far hasn't aligned well with it.
- lvl155: And all you need is pen and paper to do calculations.
- netik: Until you scale past one machine…
- nodesocket: The biggest annoyance about SQLite for me is that there is no ability to:

      ALTER TABLE users MODIFY COLUMN…
      ALTER TABLE users ALTER COLUMN…
      ALTER TABLE users ADD CONSTRAINT…

  You have to create a new temporary table with the correct schema, copy the data into this new table, drop the old table, and then rename the temporary table.
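The create/copy/drop/rename workaround described above can be sketched as follows with Python's `sqlite3`. The `users` columns and the added `NOT NULL UNIQUE` constraint are assumptions for illustration. A real migration would also need to recreate any indexes, triggers, and views that referenced the old table.

```python
import sqlite3

# isolation_level=None: autocommit mode, so we control the transaction explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")

# Goal: add NOT NULL and UNIQUE to users.email, which ALTER TABLE cannot do.
conn.execute("PRAGMA foreign_keys = OFF")  # avoid FK side effects while the table is swapped
conn.execute("BEGIN")
conn.execute(
    "CREATE TABLE users_new (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE)"
)
conn.execute("INSERT INTO users_new (id, email) SELECT id, email FROM users")
conn.execute("DROP TABLE users")
conn.execute("ALTER TABLE users_new RENAME TO users")
conn.execute("COMMIT")
conn.execute("PRAGMA foreign_keys = ON")

# The rebuilt table keeps the old rows and enforces the new constraint.
emails = [row[0] for row in conn.execute("SELECT email FROM users")]
try:
    conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

Doing the swap inside one transaction means readers never see a state in which `users` is missing.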
- bze12: Isn’t this very similar to Cloudflare Durable Objects & Workflows?
- ChrisArchitect: Related:

  Building durable workflows on Postgres

  https://news.ycombinator.com/item?id=48313530
- EGreg: Files is all you need.

  https://xkcd.com/378/
- orf:
  > The caveat is that Litestream replication is asynchronous. A restore can miss the newest local writes if the SQLite volume disappears before they are copied. That is fine for many AI and experimentation workflows

  In short: SQLite is not all you need, unless you’re just experimenting and don’t actually care about durability, in which case you also need Litestream + object storage.

  Right.