Reflections on AI at the End of 2025

<- Back

Reflections on AI at the End of 2025

danielfalbo

Comments (245)

etra0
LLMs have certainly become extremely useful for Software Engineers, they're very convincing (and pleasers, too) and I'm still unsure about the future of our day-to-day job.But one thing that has scared me the most, is the trust of LLMs output to the general society. I believe that for software engineers it's really easy to see if it's being useful or not -- We can just run the code and see if the output is what we expected, if not, iterate it, and continue. There's still a professional looking to what it produces.On the contrary, for more day-to-day usage of the general pubic, is getting really scary. I've had multiple members of my family using AI to ask for medical advice, life advice, and stuff were I still see hallucinations daily, but at the same time they're so convincing that it's hard for them not to trust them.I still have seen fake quotes, fake investigations, fake news being spreaded by LLMs that have affected decisions (maybe, not as crucials yet but time will tell) and that's a danger that most software engineers just gross over.Accountability is a big asterisk that everyone seems to ignore
bachmeier
> Programmers resistance to AI assisted programming has lowered considerably. Even if LLMs make mistakes, the ability of LLMs to deliver useful code and hints improved to the point most skeptics started to use LLMs anyway: now the return on the investment is acceptable for many more folks.I'm not a fan of this phrasing. Use of the terms "resistance" and "skeptics" implies they were wrong. It's important we don't engage in revisionist history that allows people in the future to say "Look at the irrational fear programmers had of AI, which turned out to be wrong!" The change occurred because LLMs are useful for programming in 2025 and the earliest versions weren't for most programmers. It was the technology that changed.
ofirpress
> There are certain tasks, like improving a given program for speed, for instance, where in theory the model can continue to make progress with a very clear reward signal for a very long time.Yup, this will absolutely be a big driver of gains in AI for coding in the near future. We actually built a benchmark based on this exact principle: https://algotune.io/
dhpe
I have programmed 30K+ hours. Do LLMs make bad code: yes all the time (at the moment zero clue about good architecture). Are they still useful: yes, extremely so. The secret sauce is that you'd know exactly what to do without them.
mrdependable
These comments are a bit scary. It feels like LLMs managed to exploit some fault in the human psyche. I think the biggest danger of this technology is that people are not mentally equipped to handle it.
mwkaufma
A list of unverifiable claims, stated authoritatively. The lady doth protest too much.
danielfalbo
> There are certain tasks, like improving a given program for speed, for instance, where in theory the model can continue to make progress with a very clear reward signal for a very long time.This makes me think: I wonder if Goodhart's law[1] may apply here. I wonder if, for instance, optimizing for speed may produce code that is faster but harder to understand and extend. Should we care or would it be ok for AI to produce code that passes all tests and is faster? Would the AI become good at creating explanations for humans as a side effect?And if Goodhard's law doesn't apply, why is it? Is it because we're only doing RLVR fine-tuning on the last layers of the network so all the generality of the pre-training is not lost? And if this is the case, could this be a limitation in not being able to be creative enough to come up with move 37?[1] https://wikipedia.org/wiki/Goodhart's_law
roughly
> A few well known AI scientists believe that what happened with Transformers can happen again, and better, following different paths, and started to create teams, companies to investigate alternatives to Transformers and models with explicit symbolic representations or world models.I’m actually curious about this and would love pointers to the folks working in this area. My impression from working with LLMs is there’s definitely a “there” there with regards to intelligence - I find the work showing symbolic representation in the structure of the networks compelling - but the overall behavior of the model seems to lack a certain je ne sais quoi that makes me dubious that they can “cross the divide,” as it were. I’d love to hear from more people that, well, sais quoi, or at least have theories.
jimmydoe
> * The fundamental challenge in AI for the next 20 years is avoiding extinction.sorry, I say it's folding the laundry. with an aging population, that's the most, if not only, useful thing.
torlok
This is a bunch of "I believe" and "I think" with no sources by a random internet person.
abricq
> * Programmers resistance to AI assisted programming has lowered considerably. Even if LLMs make mistakes, the ability of LLMs to deliver useful code and hints improved to the point most skeptics started to use LLMs anyway: now the return on the investment is acceptable for many more folks.Could not agree more. I myself started 2025 being very skeptical, and finished it very convinced about the usefulness of LLMs for programming. I have also seen multiple colleagues and friends go through the same change of appreciation.I noticed that for certain task, our productivity can be multiplied by 2 to 4. So hence comes my doubts: are we going to be too many developers / software engineers ? What will happen for the rests of us ?I assume that other fields (other than software-related) should also benefits from the same productivity boosts. I wonder if our society is ready to accept that people should work less. I think the more likely continuation is that companies will either hire less, or fire more, instead of accepting to pay the same for less hours of human-work.
russfink
Practical question: when getting the AI to teach you something, eg how attention can be focused in LLMs, how do you know it’s teaching you correct theory? Can I use a metric of internal consistency, repeatedly querying it and other models with a summary of my understanding? What do you all do?
seu
> And I've vibe coded entire ephemeral apps just to find a single bug because why not - code is suddenly free, ephemeral, malleable, discardable after single use. Vibe coding will terraform software and alter job descriptions.I'm not super up-to-date on all that's happening in AI-land, but in this quote I can find something that most techno-enthusiast seem to have decided to ignore: no, code is not free. There are immense resources (energy, water, materials) that go into these data centers in order to produce this "free" code. And the material consequences are terribly damaging to thousands of people. With the further construction of data centers to feed this free video coding style, we're further destroying parts of the world. Well done, AGI loverboys.
piker
> There are certain tasks, like improving a given program for speed, for instance, where in theory the model can continue to make progress with a very clear reward signal for a very long time.Super skeptical of this claim. Yes, if I have some toy poorly optimized python example or maybe a sorting algorithm in ASM, but this won’t work in any non-trivial case. My intuition is that the LLM will spin its wheels at a local minimum the performance of which is overdetermined by millions of black-box optimizations in the interpreter or compiler signal from which is not fed back to the LLM.
pton_xd
> For years, despite functional evidence and scientific hints accumulating, certain AI researchers continued to claim LLMs were stochastic parrots: probabilistic machines that would: 1. NOT have any representation about the meaning of the prompt. 2. NOT have any representation about what they were going to say. In 2025 finally almost everybody stopped saying so.It's interesting that Terrence Tao just released his own blog post stating that they're best viewed as stochastic generators. True he's not an AI researcher, but it does sound like he's using AI frequently with some success."viewing the current generation of such tools primarily as a stochastic generator of sometimes clever - and often useful - thoughts and outputs may be a more productive perspective when trying to use them to solve difficult problems" [0].[0] https://mathstodon.xyz/@tao/115722360006034040
fleebee
> The fundamental challenge in AI for the next 20 years is avoiding extinction.That's a weird thing to end on. Surely it's worth more than one sentence if you're serious about it? As it stands, it feels a bit like the fearmongering Big Tech CEOs use to drive up the AI stocks.If AI is really that powerful and I should care about it, I'd rather hear about it without the scare tactics.
register
Where to understand more about how chain of thoughs really affects LLMs performance? I read the seminal paper but all it says is that it's basically another prompt engineering tecnique that improves accuracy.
Fraterkes
It’s interesting that half the comments here are talking about the extinction line when, now that we’re nearly entering 2026, I feel the 2027 predictions have been shown to be pretty wrong so far.
lolz404
This article does little to support its claims but was a good primer to dive into some topics.They are cool new tools use them where you can but there is a ton of research still left to do. Just lols at the hubris silicon valley will make something so smart it extincts humankind. It'll happen from the lack of water and heated planet first :)The stocastic parrot argument is still debated but more nuanced than before. Although the original author still stands by the statement. Evidence of internal planning per model. Anthropic Attribution Graphs Research with some rhyming did support it but gemma didn't.The idea of "understanding" is still up for debate as well. Sure, when models are directly trained on data there is representation. Othello-GPT Studies was one way to support but that was during training so some interal representation was created. Out of distribution task will collapse to confabulation. Apple's GSM-Symbolic Research seems to support that.Chain of thought is a helpful tool but is untrustworthy at best. Anthropic themselves have showed this https://www.anthropic.com/research/reasoning-models-dont-say...
anon
undefined
agumonkey
There's videos about Diffusion LLMs too, apparently getting rid of the linear token generation. But I'm no ML engineer.
Aiisnotabubble
What also happens and it's irrelevant of AGI: global RLAround the world people ask an LLM and get a response.Just grouping and analysing these questions and solving them once centrally and then making the solution available again is huge.Linearly solving the most asked questions and then the next one then the next will make, whatever system is behind it, smarter every day.
a_bonobo
>* For years, despite functional evidence and scientific hints accumulating, certain AI researchers continued to claim LLMs were stochastic parrots: probabilistic machines that would: 1. NOT have any representation about the meaning of the prompt. 2. NOT have any representation about what they were going to say. In 2025 finally almost everybody stopped saying so.Man, Antirez and I walk in very different circles! I still feel like LLMs fall over backwards once you give them an 'unusual' or 'rare' task that isn't likely to be presented in the training data.
erichocean
> 1. NOT have any representation about the meaning of the prompt.This one is bizarre, if true (I'm not convinced it is).The entire purpose of the attention mechanism in the transformer architecture is to build this representation, in many layers (conceptually: in many layers of abstraction).> 2. NOT have any representation about what they were going to say.The only place for this to go is in the model weights. More parameters means "more places to remember things", so clearly that's at least a representation.Again: who was pushing this belief? Presumably not researchers, these are fundamental properties of the transformer architecture. To the best of my knowledge, they are not disputed.> I believe [...] it is not impossible they get us to AGI even without fundamentally new paradigms appearing.Same, at least for the OpenAI AGI definition: "An AI system that is at least as intelligent as a normal human, and is able to do any economically valuable work."
rckt
> Even if LLMs make mistakes, the ability of LLMs to deliver useful code and hints improved to the point most skeptics started to use LLMs anywayHere we go again. Statements with the single source in the head of the speaker. And it’s also not true. The llms still produce bad/irrelevant code at such rate that you can spend more time prompting than doing things yourself.I’m tired of this overestimation of llms.
ur-whale
Not sure I understand the last sentence:> The fundamental challenge in AI for the next 20 years is avoiding extinction.
ctoth
> The fundamental challenge in AI for the next 20 years is avoiding extinction.So nice to see people who think about this seriously converge on this. Yes. Creating something smarter than you was always going to be a sketchy prospect.All of the folks insisting it just couldn't happen or ... well, there have just been so many objections. The goalposts have walked from one side of the field to the other, and then left the stadium, went on a trip to Europe, got lost in a beautiful little village in Norway, and decided to move there.All this time though, the prospect of instantiating a something smarter than you (and yes, it will be smarter than you even if it's at human level because of electronic speeds...) This whole idea is just cursed and we should not do the thing.
alexgotoi
> * The fundamental challenge in AI for the next 20 years is avoiding extinction.This reminded me of the Don’t look up movie where they basically gambled with the humans extinction.
gaigalas
This post is a bait for enthusiasts. I like it.> Chain of thought is now a fundamental way to improve LLM output.That kinda proves _that LLMs back then were pretty much stochastic parrots indeed_, and the skeptics were right at the time. Today, enthusiasts agree with what they previously said: without CoT, the AI feels underwhelming, repetitive and dumb and it's obvious that something more was needed.Just search past discussions about it, people were saying the problem would be solved with "larger models" (just repeating marketing stuff) and were oblivious to the possibility of other kinds of innovations.> The fundamental challenge in AI for the next 20 years is avoiding extinction.That is a low level sick burn on whoever believes AI will be economically viable short-term. And I have to agree.
bgwalter
Regarding the stochastic parrots:It is easy to see that LLMs exclusively parrot by asking them about current political topics [1], because they cannot plagiarize settled history from Wikipedia and Britannica.But of course there also is the equivalence between LLMs and Markov chains. As far as I can see, it does not rely on absurd equivalences like encoding all possible output states in an infinite Markov chain:https://arxiv.org/abs/2410.02724Then there is stochastic parrot research:https://arxiv.org/abs/2502.08946"The stochastic parrot phenomenon is present in LLMs, as they fail on our grid task but can describe and recognize the same concepts well in natural language."As said above, this is obvious to anyone who has interacted with LLMs. Most researchers know what is expected of them if they want to get funding and will not research the obvious too deeply.[1] They have Internet access of course.
bgwalter
They are very advanced stochastic parrots that allow AI invested authors to suddenly write in perfect English.If Antirez has never gotten an LLM to perform an absolutely embarrassing mistake, he must be very lucky or we should stop listening to him.Programmers' resistance has not weakened. Since the ORCL drop of 40% anti-LLM opinions are censored and downvoted here. Many people have given up, and we always get articles from the same LLM influencers.
lowsong
I'm impressed that such a short post can be so categorically incorrect.> For years, despite functional evidence and scientific hints accumulating, certain AI researchers continued to claim LLMs were stochastic parrots> In 2025 finally almost everybody stopped saying so.There is still no evidence that LLMs are anything beyond "stochastic parrots". There is no proof of any "understanding". This is seeing faces in clouds.> I believe improvements to RL applied to LLMs will be the next big thing in AI.With what proof or evidence? Gut feeling?> Programmers resistance to AI assisted programming has lowered considerably.Evidence is the opposite, most developers do not trust it. https://survey.stackoverflow.co/2025/ai#2-accuracy-of-ai-too...> It is likely that AGI can be reached independently with many radically different architectures.There continues to be no evidence beyond "hope" that AGI is even possible, yet alone that Transformer models are the path there.> The fundamental challenge in AI for the next 20 years is avoiding extinction.Again, nothing more than a gut feeling. Much like all the other AI hype posts this is nothing more than "well LLMs sure are impressive, people say they're not, but I think they're wrong and we will make a machine god any day now".
HellDunkel
[flagged]
feverzsj
Seems they also want some AI money[0]. Guess, I'll keep using Valkey.[0] https://redis.io/redis-for-ai/
anon
undefined