Comments (27)
- luizfelberti: A bit dated in the sense that for Linux you'd probably use io_uring nowadays, but otherwise it's a timeless design. Still, I'm conflicted on whether separating stages per thread (accept on one thread and the client loop in another) is a good idea. It sounds like the gains would be minimal or non-existent even in ideal circumstances, and on some workloads where there aren't a lot of clients or much connection churn it would waste an entire core on a low-volume event. I'm open to contrarian opinions on this though, maybe I'm not seeing something...
- lmz: Seems similar to the SEDA architecture https://en.wikipedia.org/wiki/Staged_event-driven_architectu...
- kogus: Slightly tangential, but why is the first diagram duplicated at .1 opacity?
- ratrocket: Discussed in 2016: https://news.ycombinator.com/item?id=10872209 (53 comments)
- bee_rider:
  > One thread per core, pinned (affinity) to separate CPUs, each with their own epoll/kqueue fd
  > Each major state transition (accept, reader) is handled by a separate thread, and transitioning one client from one state to another involves passing the file descriptor to the epoll/kqueue fd of the other thread.
  So this seems like a little pipeline that all of the requests go through, right? For somebody who doesn't do server stuff, is there a general idea of how many stages a typical server might be able to implement? And does it create a load-balancing problem? I'd expect some stages to be quite cheap…
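  The handoff the quoted text describes boils down to moving a file descriptor from one thread's epoll interest list to another's. A minimal Linux-only sketch of that mechanic, using a pipe to stand in for an accepted client socket (the two epoll instances stand in for the accept thread's and the reader thread's event loops; a real server would hand the fd over a queue and let the reader thread do the ADD):

  ```c
  #include <stdio.h>
  #include <unistd.h>
  #include <sys/epoll.h>

  int main(void) {
      /* Two epoll instances: one per "stage" thread. */
      int ep_accept = epoll_create1(0);
      int ep_reader = epoll_create1(0);

      /* A pipe stands in for a freshly accepted client socket. */
      int p[2];
      if (pipe(p) < 0) { perror("pipe"); return 1; }

      struct epoll_event ev = { .events = EPOLLIN, .data.fd = p[0] };

      /* Stage 1: the accept loop watches the new connection. */
      epoll_ctl(ep_accept, EPOLL_CTL_ADD, p[0], &ev);

      /* State transition: remove the fd from the accept loop's
       * interest list and register it with the reader loop's. */
      epoll_ctl(ep_accept, EPOLL_CTL_DEL, p[0], NULL);
      epoll_ctl(ep_reader, EPOLL_CTL_ADD, p[0], &ev);

      /* The client sends data; only the reader loop sees it now. */
      write(p[1], "hi", 2);

      struct epoll_event out;
      int n = epoll_wait(ep_reader, &out, 1, 1000);
      printf("reader events: %d\n", n);
      return 0;
  }
  ```

  Note the fd itself never leaves the process; only which epoll set watches it changes, which is what makes the handoff cheap in syscall terms (and, per the comment below, expensive in cache terms).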
- password4321: Always interesting to review the latest TechEmpower web framework benchmarks, though it's been a year: https://www.techempower.com/benchmarks/#section=data-r23&tes...
- rot13maxii havent seen an sdf1.org url in a looooong time. lovely to see its still around
- fao_: This is more or less what Erlang does, and part of why Erlang is so easy to scale.
- epicprogrammer: It's an interesting throwback to SEDA, but physically passing file descriptors between cores as a connection changes state is usually a performance killer on modern hardware. While it sounds elegant on a whiteboard to have a dedicated 'accept' core and a 'read' core, you end up trading a slightly simpler state machine for massive L1/L2 cache thrashing: every time you hand off a connection, you invalidate the buffers and TCP state you just built up. There's a reason the industry largely settled on shared-nothing architectures like NGINX's, where a single pinned worker handles the entire lifecycle of a request and all that data stays strictly local to one CPU's cache. When you're trying to scale, respecting data locality almost always beats pipeline cleanliness.
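  The shared-nothing alternative this comment alludes to is commonly built on Linux's `SO_REUSEPORT`: each worker binds its own listening socket to the same port, and the kernel spreads incoming connections across them, so a connection's whole lifecycle stays on one core with no fd handoff. A minimal sketch (assuming Linux 3.9+; the first worker binds an ephemeral loopback port, the second binds the very same port):

  ```c
  #include <stdio.h>
  #include <stdint.h>
  #include <string.h>
  #include <unistd.h>
  #include <arpa/inet.h>
  #include <sys/socket.h>

  /* Open a listening socket on 127.0.0.1:port with SO_REUSEPORT set,
   * so several workers can bind the same port and let the kernel
   * load-balance incoming connections across them. */
  static int listener(uint16_t port) {
      int fd = socket(AF_INET, SOCK_STREAM, 0);
      int one = 1;
      setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof one);
      struct sockaddr_in a;
      memset(&a, 0, sizeof a);
      a.sin_family = AF_INET;
      a.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
      a.sin_port = htons(port);
      if (bind(fd, (struct sockaddr *)&a, sizeof a) < 0) return -1;
      listen(fd, 16);
      return fd;
  }

  int main(void) {
      /* First worker binds an ephemeral port... */
      int w1 = listener(0);
      struct sockaddr_in a;
      socklen_t len = sizeof a;
      getsockname(w1, (struct sockaddr *)&a, &len);
      uint16_t port = ntohs(a.sin_port);

      /* ...and a second worker binds the same port. With SO_REUSEPORT
       * on both sockets the second bind succeeds, and each worker
       * accepts its own share of connections on its own core. */
      int w2 = listener(port);
      printf("second bind: %s\n", w2 >= 0 ? "ok" : "failed");
      return 0;
  }
  ```

  In a real server each worker would be a pinned thread or process running its own accept-plus-read event loop over its private socket, which is the data-locality win the comment describes.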