Comments (235)

  • ninkendo
    Related: I’ve always found it crazy that my LLM has access to such terrible tools compared to mine.
    It’s left with grepping for function signatures, sending diffs for patching, and running `cat` to read all the code at once.
    I, however, run an IDE and can run a simple refactoring tool to add a parameter to a function, I can “follow symbol” to see where something is defined, I can click and get all usages of a function shown at a glance, etc.
    Is anyone working on making it so LLMs get better tools for actually writing/refactoring code? Or is there some “bitter lesson”-like thing that says effort is always better spent just increasing the context size and slurping up all the code at once?
  • Jaysobel
  • rashidae
    > As a mirror to real-world agent design: the limiting factor for general-purpose agents is the legibility of their environments, and the strength of their interfaces. For this reason, we prefer to think of agents as automating diligence, rather than intelligence, for operational challenges.
  • hk__2
    > The only other notable setback was an accidental use of the word "revert" which Codex took literally, and ran git revert on a file where 1-2 hours of progress had been accumulating.
  • lukebechtel
    > We don't know any C++ at all, and we vibe-coded the entire project over a few weeks. The core pieces of the build are…what a world!
  • pocketarc
    I love the interview at the end of the video. The kubectl-inspired CLI, and the feedback for improvements from Claude, as well as the alerts/segmentation feedback.
    You could take those, make the tools better, and repeat the experience, and I'd love to see how much better the run would go.
    I keep thinking about that when it comes to things like this - the Pokemon thing as well. The quality of the tooling around the AI is only going to become more and more impactful as time goes on. The more you can deterministically figure out on behalf of the AI to provide it with accurate ways of seeing and doing things, the better.
    Ditto for humans, of course, that's the great thing about optimizing for AI. It's really just "if a human was using this, what would they need"? Think about it: The whole thing with the paths not being properly connected, a human would have to sit down and really think about it, draw/sketch the layout to visualize and understand what coordinates to do things in. And if you couldn't do that, you too would probably struggle for a while. But if the tool provided you with enough context to understand that a path wasn't connected properly and why, you'd be fine.
  • fnordpiglet
    Interesting article but it doesn’t actually discuss how well it performs at playing the game. There is in fact a 1.5 hour YouTube video but it woulda been nice for a bit of an outcome postmortem. It’s like “here’s the methods and set up section of a research paper but for the conclusion you need to watch this movie and make your own judgements!”
  • nipponese
    > kept the context above the ~60% remaining level where coding models perform at their absolute best
    Maybe this is obvious to Claude users, but how do you know your remaining context level? Is there a UI for this?
  • margorczynski
    I think something like Civilization would be better because:
    1) The map is a grid
    2) Turn based
  • karanveer
    The beauty of this game was that it was developed in assembly code and, on top of that, mostly by one person.
    I've been trying to locate the dev of this game for a long time, so I can thank them for an amazing experience.
    If anyone knows their social or anything, please do share, including OP.
    Also, nice work on CC in this. May actually be interested in Claude Code now.
  • TaupeRanger
    I corroborate that spatial reasoning is a challenge still. In this case, it's the complexity of the game world, but anyone who has used Codex/Claude with complex UIs in CSS or a native UI library will recognize the shortcomings fairly quickly.
  • haunter
    This is what I want but for PoE/PoE2 builds. I always get a headache just looking at the passive tree https://poe.ninja/poe2/passive-skill-tree
  • phreeza
    Claude Code in dwarf fortress would be wild
  • maxall4
    > In this article we'll tell you why we decided to put Claude Code into RollerCoaster Tycoon, and what lessons it taught us about B2B SaaS.
    What is this? A LinkedIn post?
  • khoury
    Can't wait for someone to let Claude control a runescape character from scratch
  • equinumerous
    This is a cool idea. I wanted to do something like this by adding a Lua API to OpenRCT2 that allows you to manipulate and inspect the game world. Then, you could either provide an LLM agent the ability to write and run scripts in the game, or program a more classic AI using the Lua API. This AI would probably perform much better than an LLM - but an interesting experiment nonetheless to see how a language model can fare in a task it was not trained to do.
  • vermilingua
    I want to get off MR ALTMANS WILD RIDE.
  • kinduff
    Several times now I've seen ASCII being used initially for these kinds of problems. I think it's because it's counter-intuitive, in the sense that for us humans ASCII is text but we tend to forget spatial awareness.
    I find this a very interesting aspect of us humans interacting with AIs.
  • mentos
    The opening paragraph I thought was the agent prompt haha
    > The park rating is climbing. Your flagship coaster is printing money. Guests are happy, for now. But you know what's coming: the inevitable cascade of breakdowns, the trash piling up by the exits, the queue times spiraling out of control.
  • ddtaylor
    Does this website do anything besides host the article with an animated background?
  • petcat
    Question: There is still a competitive AoE2 community. Will that be destroyed by AI?
  • js4ever
    Most interesting phrase: "Keeping all four agents busy took a lot of mental bandwidth."
  • neom
    Wonder how it would do with Myst.
  • sriram_sun
    > "Where Claude excels:"
    Am I reading a Claude generated summary here?
  • skybrian
    Would a way to take screenshots help? It seems to work for browser testing.
  • rnmmrnm
    this is cute but i imagined prompting the ai for a loop-di-loop roller coaster. If this could build complex rides it would be a game changer.
  • azhenley
    Edit: HN's auto-resubmit in action, ignore.
  • colesantiago
    > We don't know any C++ at all, and we vibe-coded the entire project over a few weeks.
    And these are the same people that put countless engineers through gauntlets of bizarre interview questions and exotic puzzles to hire engineers.
    But when it comes to C++ just vibe it obviously.
  • fuzzy_lumpkins
    so the janitors will finally stay on their assigned footpaths?
  • deadbabe
    While this seems cool at first, it does not demonstrate superiority over a true custom built AI for rollercoaster tycoon.
    It is a curiosity, good for headlines, but the takeaway is if you really need an actual good AI, you are still better off not using an LLM powered solution.
  • joshcsimmons
    Interesting this is on the ramp.com domain? I'm surprised in this tech market they can pay devs to hack on Rollercoaster Tycoon. Maybe there's some crossover I'm missing but seems like a sweet gig honestly.
  • HelloUsername
    *OpenRCT2
  • sodafountan
    This was an interesting application of AI, but I don't really think this is what LLMs excel at. Correct me if I'm wrong.
    It was interesting that the poster vibe-coded (I'm assuming) the CTL from scratch; Claude was probably pretty good at doing that, and that task could likely have been completed in an afternoon.
    Pairing the CTL with the CLI makes sense, as that's the only way to gain feedback from the game. Claude can't easily do spatial recognition (yet).
    A project like this would entirely depend on the game being open source. I've seen some very impressive applications of AI online with closed-source games and entire algorithms dedicated to visual reasoning.
    I'm still trying to figure out how this guy: https://www.youtube.com/watch?v=Doec5gxhT_U was able to have AI learn to play Mario Kart nearly perfectly. I find his work to be very impressive.
    I guess because RCT2 is more data-driven than visually challenging, this solution works well, but having an LLM try to play a racing game sounds like it would be disastrous.
  • nacozarina
    next up: Crusader Kings III
  • huflungdung
    [dead]
  • Kapura
    "i vibe coded a thing to play video games for me"
    i enjoy playing video games my own self. separately, i enjoy writing code for video games. i don't need ai for either of these things.