<- Back
Comments (43)
- sho_hnIt does all start to feel like we'd get fairly close to being able to convincingly emulate a lot of human or at least animal behavior on top of the existing generative stack, by using brain-like orchestration patterns ... if only inference was fast enough to do much more of it.The gauge-reading example here is great, but in reality of course having the system synthesize that Python script, run the CV tasks, come back with the answer etc. is currently quite slow.Once things go much faster, you can also start to use image generation to have models extrapolate possible futures from photos they take, and then describe them back to themselves and make decisions based on that, loops like this. I think the assumption is that our brains do similar things unconsciously, before we integrate into our conscious conception of mind.I'm really curious what things we could build if we had 100x or 1000x inference throughput.
- colinatorThis seems perfect to hook up to my 'LLMs can control robots over MCP' system. The idea is that LLMs are great at writing code, so let's lean in to that. I'll give it a try! I just got a bigger robot, we'll see how it does...https://colinator.github.io/Ariel/post1.html
- harrallGoogle and Boston Dynamics (of Spot, Atlas fame) formed a partnership a while back and they’ve been working on building models together.Hyundai now owns Boston Dynamics and is pushing to get the robots into their factories.
- vibe42A parcel of land.A few robot legs and arms, big battery, off-the-shelf GPU. Solar panels.Prompt: "Take care of all this land within its limits and grow some veggies."
- skybrianPointing a camera at a pressure gauge and recording a graph is something that I would have found useful and have thought about writing. Does software like that exist that’s available to consumers?
- w10-1Would this approach destroy critical investments in physics- or modeling-based reasoning?I'm all for the task reasoning and the multi-view recognition, based on relevant points. I'm very uncomfortable with the loose world "understanding".The fault model I see is that e.g., this "visual understanding" will get things mostly right: enough to build and even deliver products. However, these are only probabilistic guarantees based on training sets, and those are unlikely to survive contact with a complex interactive world, particularly since robots are often repurposed as tasks change.So it's a kind of moral-product-hazard: it delivers initial results but delays risk to later, so product developers will have incentives to build and leave users holding the bag. (Indeed: users are responsible for integration risks anyway.)It hacks our assumptions: we think that you can take an MVP and productize it, but in this case, you'll never backfit the model to conform to the physics in a reliable way. I doubt there's any way to harness Gemini to depend on a physics model, so we'll end up with mostly-working sunk investments out in the market - slop robots so cheap that tight ones can't survive.
- gallerdudeI’ve been thinking about AI robotics lately… if internally at labs they have a GPT-2, GPT-3 “equivalent” for robotics, you can’t really release that. If a robot unloading your dishwasher breaks one of your dishes once, this is a massive failure.So there might be awesome progress behind the scenes, just not ready for the general public.
- mt_Is there a open source mini robot kit that allows me to play-around with agentic robots?
- vessenesNice. I couldn't find the part that I'm most interested in though, latency. This beats their frontier vision model for some identification tasks -- for a robotics model, I'm interested in hz. Since this is an "Embodied Reasoning" model, I'm assuming it's fairly slow - it's designed to match with on-robot faster cycle models.Anyway, cool.
- steveharing1Soon Open Source will fill the gap here as well
- anonundefined
- vipipiccf[dead]
- jeffbeeShowing the murder dog reading a gauge using $$$ worth of model time is kinda not an amazing demo. We already know how to read gauges with machine vision. We also know how to order digital gauges out of industrial catalogs for under $50.