Comments (924)
- trostaftSpeaking as a postdoc in math, I must say that this is rather exciting. This is outside of my field, but the companion remarks document is quite digestible. It appears as though the proof here is fairly inspired by results in the literature, but the tweaks are non-trivial. Or, at least to me, they appear substantial enough that I would consider the entire publication novel and exciting. Many of my colleagues and I have been experimenting with LLMs in our research process. I've had pretty great success, though fairly rarely do they solve my entire research question outright like this. Usually, I end up with a back and forth process of refinements and questions on my end until eventually the idea becomes apparent. Not unlike my traditional research refinement process, just better. Of course, I don't have access to the model they're using =). Nevertheless, one thing that struck me in this writeup was the lack of attribution in the quoted final response from the model. In a field like math, where most research is posted publicly and is available, attribution of prior results is both social credit and how we find/build abstractions and concentrate attention. The human-edited paper naturally contains this. I dug through the chain-of-thought publication and did actually find (a few of) them. If people working on these LLMs are reading, it's very important to me that these are contained in the actual model output. One more note: the comments on articles like these on HN and otherwise are usually pretty negative / downcast. There's great reason for that, what with how these companies market themselves and how proponents of the technology conduct themselves on social media. Moreover, I personally cannot feel anything other than disgust seeing these models displace talented creatives whose work they're trained on (often to the detriment of quality). But, for scientists, I find that these tools address the problem of the exploding complexity barrier at the frontier. Every day, it grows harder and harder to contain a mental map of recent relevant progress by simple virtue of the amount being produced. I cannot help but be very optimistic about the ambition mathematicians of this era will be able to scale to. There still remain lots of problems in current era tools and their usage though.
- cpard"The proof brings unexpected, sophisticated ideas from algebraic number theory to bear on an elementary geometric question." The more I read about these achievements, the more I get a feeling that a lot of the power of these models comes from having prior knowledge of every possible field and having zero problems transferring to new domains. To me the potential beauty of this is that these tools might help us break through the increasing super-specialization that humans in science have to go through today, which on the one hand is important but on the other hand limits a person in terms of the tooling and inspiration they have access to.
- mooreatI think one interesting thing to point out is that the proof (disproof) was done by finding a counterexample to Erdős' original conjecture. I agree with one of the mathematicians' responses in the linked PDF that this is somewhat less interesting than proving the actual conjecture was true. In my eyes, proving the conjecture true requires a bit more theory crafting. You have to explain why the conjecture is correct by grounding it in a larger theory, while with the counterexample the model just has to perform a more advanced form of search to find the correct construction. Obviously this search is impressive, not naive, and requires many steps along the way to prove connections to the counterexample, but instead of developing new deep mathematics the model is still just connecting existing ideas. Not to discount this monumental achievement. I think we're really getting somewhere! To me, and this is just vibes based, I think the models aren't far from being able to theory craft in such a way that they could prove more complicated conjectures that require developing new mathematics. I think that's just a matter of having them able to work on longer and longer time horizons.
- vatsachakAs I have stated before, AI will win a Fields Medal before it can manage a McDonald's. A difficult part was constructing a chess board on which to play math (Lean). Now it's just pattern recognition and computation. LLMs are just the beginning; we'll see more specialized math AI resembling Stockfish soon.
- raincoleI like how everyone laughed when OpenAI said their models will have "PhD-Level Intelligence" and now the goalpost has been moved to if AI can create new math (i.e., not PhD-Level, but Leibniz/Euler/Galois level.)
- lesostepI am cautious about AI "discoveries" after the Mythos paper. What was the process of writing the paper? Was the question asked by a mathematician? Was the paper right from the get-go or was there someone who pointed out mistakes? How many attempts were made before a solution was found? I will eat my words if an AI one-shotted that one without any external help, but for now I am left wondering whether it's a new way to attribute discoveries to companies instead of the people who put the work in.
- QuentakI'd like to know how many tokens in total went into solving this problem. Have they talked about this? It matters whether they got this result in 10 million tokens or 10 billion. Whether it's closer to 1 human working on this for 1 year or 1000 humans for 1 year. The news feels different when the probability of one AI run solving this is 1 in a thousand vs 1 in a million. Approximately, I'm asking about the amount of money it cost to solve it, which has to include the failed parallel runs.
- zozbot234The summarized chain of thought for this task (linked in the blogpost) is 125 pages. That's an insane scale of reasoning, quite akin to what Anthropic has been teasing with Mythos.
- recitedropperThis is impressive, no question. Without knowing everything this model has been trained on, though, it is pretty hard to ascertain the extent to which it arrived at this "on its own". The entire AI industry has been (not so secretly) paying a lot of experts in many fields to generate large amounts of novel training data. Novel training data that isn't found anywhere else (they hoard it) and which could actually contain original ideas. It isn't likely that someone solved this and then just put it in the training data, although I honestly wouldn't put that past OpenAI. More interesting though is the extent to which they've generated training data that may have touched on most or all of the "original" tenets found in this proof. We can't know, of course. But until these things are built in a non-clandestine manner, this question will always remain.
- ccvannormanI looked at all linked articles and could not find an example of the points (they show a square grid of points with n~=100 but no other ordering of points to show the more optimal layout(s)).Is there anywhere an image example of a superior layout for example with n>={100,1000,10000}..? I would love to see it. I am imagining it would look somewhat like a sloppy pizza.
- kevinwangNitpicky/not important, but they say: "Since loglog(n) tends to infinity with n, the additional term in the exponent tends to 0, meaning these constructions achieve growth only slightly faster than linear." Would anyone else describe the previous asymptotic behavior like that? I mean obviously loglogn to O(1) is a quantum leap, but wouldn't you describe loglogn as "grows so slowly it's almost constant", so the constructions achieve growth "almost n^{1+c}"? But I guess that might be overcorrecting too hard.
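To make the asymptotics in the comment above concrete, here is a short worked comparison. It is only a sketch: it assumes the older grid-type lower bound has the form n^{1 + c/log log n} for some constant c > 0, which is how the quoted sentence reads, and the value of c is not given anywhere in the thread.

```latex
% Assumed form of the older lower bound, with an unspecified constant c > 0:
%   u(n) >= n^{1 + c / \log\log n}
\[
  n^{c/\log\log n} \;=\; \exp\!\left(\frac{c\,\log n}{\log\log n}\right).
\]
% The extra factor beats every fixed power of log n: for every fixed k,
\[
  \frac{c\,\log n}{\log\log n} \;-\; k\,\log\log n \;\to\; \infty
  \quad\Longrightarrow\quad
  \frac{n^{c/\log\log n}}{(\log n)^{k}} \;\to\; \infty .
\]
% But it loses to every fixed power of n: for every fixed \delta > 0,
\[
  \frac{c}{\log\log n} \;\to\; 0
  \quad\Longrightarrow\quad
  n^{1 + c/\log\log n} \;=\; o\!\left(n^{1+\delta}\right).
\]
```

Read this way, both descriptions are defensible: the extra factor grows faster than any power of log n, yet the exponent never reaches 1 + δ for a fixed δ > 0, which is the gap the new result is said to close.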
- 0x5FC3Is there a reason why we only hear of Erdos problems being solved? I would imagine there are a myriad of other unsolved problems in math, but every single ChatGPT "breakthrough in math" I come across on r/singularity and r/accelerate is an Erdos problem.
- m-hodgesTo the “LLMs just interpolate their training data” crowd: Ayer, and in a different way early Wittgenstein, held that mathematical truths don’t report new facts about the world. Proofs unfold what is already implicit in axioms, definitions, symbols, and rules. I think that idea is deeply fascinating, AND have no problem that we still credit mathematicians with discoveries. So either “recombining existing material” isn’t disqualifying, or a lot of Fields Medals need to be returned.
- lubujacksonFor anyone using LLMs heavily for coding, this shouldn't be too surprising. It was just a matter of time. Mathematicians make new discoveries by building and applying mathematical tools in new ways. It is tons of iterative work, following hunches and exploring connections. While it is true that LLMs can't truly "make discoveries" since they have no sense of what that would mean, they can Monte Carlo every mathematical tool at a narrow objective and see what sticks, then build on that or combine improvements. Reading the article, that seems to be exactly how the discovery was made: an LLM used a "surprising connection" to go beyond the expected result. But the result has no meaning without the human intent behind the objective, the human understanding to value the new pathway the AI used (more valuable than the result itself, by far), and the mathematical language (built by humans) to explore the concept.
- dwrobertsWould be interesting to know what kind of preparatory work actually went into this - how long did it take to construct an input that produced a real result, and how much input did they get from actual mathematicians to guide refining it
- aurareturnOne thing that seems certain is that OpenAI models hold a distinct lead in academics over Anthropic and Google models. For those in academics, is OpenAI the vendor of choice?
- throwaway2027Not to dismiss the AI but the important part is that you still need someone able to recognize these solutions in the first place. A lot of things were just hidden in plain sight before AI but no one noticed or didn't have the framework either in maths or any other field they're specialized in to recognize those feats.
- isolliQuestion: The conjecture was about an upper bound for the maximum number of pairs. It has been disproven. Was the Erdos problem the conjecture itself, or was it about the actual maximum number of pairs? (In which case it will probably never be solved.) The problem is defined in the narrow version here: https://www.erdosproblems.com/90
- throw-the-towelSee the longstanding debate on whether new math is "invented" or "discovered". Most mathematicians I knew thought it's discovered.
- zone411I actually tried using GPT-5.5 Pro on this problem recently. It thought it was making progress on one path, but it made so many mistakes that it didn't feel worth it pushing further. It'll be interesting to check whether it's the same route. I got partial results (proved in Lean) that improve on the best-known results for four Erdős problems with GPT-5.5 Pro
- Jeff_BrownCan anyone find (or draw) a picture of the construction?
- endymi0nTo paraphrase Gwynne Shotwell: “Not too bad for just a large Markov chain, eh?”
- __0x01From the companion paper: "The argument relies crucially on ideas that may, at least in retrospect, be attributed to Ellenberg-Venkatesh, Golod-Shafarevich, and Hajir-Maire-Ramakrishna." Can someone please elaborate on this?
- FraterkesI guess if this stuff is going to make my employment more precarious, it’d be nice if it also makes some scientific breakthroughs. We’ll see
- libraryofbabelThis HN thread depressed me. I’m still thinking about why. Look past the press-releasey gushing from OpenAI and there are all sorts of interesting and subtle questions here about the role for LLMs in mathematical research. I urge folks to click through to the accompanying comments from mathematicians published alongside the result. There is a really interesting discussion going on. I particularly recommend Tim Gowers’ remarks. This is really interesting stuff! Yet the comments are just a battleground of people rehearsing the same tired arguments about LLMs from 2023, refutations of those arguments, angry counters, etc. Does it make anyone else sad that the battle lines seem to have been drawn 3 years ago and we just seem to have the same fights over and over? I wonder if we’ll still be doing this two years hence.
- zmmmmmAs a side observation, it is striking but also not surprising in retrospect that the big successes in AI are coming from domains where things are fundamentally verifiable. Both software and math are either fully verifiable or low-cost verifiable (breaking a test is not the same cost as building a bridge and watching it fall down to see if it worked). Other domains are extracting value but I feel like there's an order of magnitude difference. It raises the question, what other domains fit into these categories where the AI itself has pretty much free rein to verify its own results?
- purpleideaYou'd think a billion dollar company would be able to normalize the sound level on their video :/
- ferris-boolerWhat strikes me in this case (and I haven't seen in other comments) is that it's a _disproof_ of a conjecture put forth by Erdős and supported (at least according to OpenAI) by other professional mathematicians. Erdős, one of the greats, thought that the limit was O(n^{1 + o(1)}), which GPT disproved.We can argue about recombination/interpolation of training data in LLMs, but even if this was an interpolation, the result was contrarian rather than a confirmation. Any system that can identify an error in Erdős's thinking seems very useful to me (though perhaps he did not spend much time thinking about or checking this particular conjecture).
- ks2048Timothy Gowers' tweet about this: "If you are a mathematician, then you may want to make sure you are sitting down before reading further." Woah.
- num42I am not surprised! The birth of computer science was rooted in the desire to automate mathematical discovery and proof writing.
- CGMthrowawayHow do you even get an LLM to try to solve one of these problems? When I ask it just comes back with the name of the problem and saying "it can't be done"
- dwa3592A few questions that the blog did not answer; if anyone knows, that'll be great:
  - Does anyone know if this was 1 minute of inference or 1 month?
  - How many times did the model say it was done disproving before it was found out that the model was wrong/hallucinating?
  - One of the graphs says the model produced the right answer almost half the time at peak compute??? Did I understand that right? What does peak compute mean here?
- dadrianWhile the result is impressive, this blog post is extremely disappointing.
  - It does not show an example of the new best solution, nor explain why they couldn't show an example (e.g. if the proof was not constructive).
  - It does not even explain the previous best solution. The diagram of the rescaled unit grid doesn't indicate what the "points" are beyond the normal non-scaled unit grid. I have no idea what to take away from it.
  - Its description of the new proof just cites some terms of art with no effort made to actually explain the result.
  If this post were not on the OpenAI blog, I would assume it was slop. I understand advanced pure mathematics is complicated, but it is entirely possible to explain complicated topics to non-experts.
- precision1kI see mixed emotions here. I understand both. On one hand it's exciting and fascinating. On the other it's concerning. One concern I haven't seen mentioned is the possibility that, as these models become larger and more powerful, their capability to solve frontier math problems will also grow. Does there become a point where humans are no longer the driver of innovation and research in this world, and instead are relegated to become stewards of the AI models whose purpose is to push the boundaries of mathematics, theoretical physics and other academic disciplines?
- Topology1As someone starting grad school for pure mathematics, this has me both excited and nervous, but mainly the latter...
- callamdelaneyThe only relevant question is, how much did it cost?
- agentultraI’m curious about the “autonomous” claim. Usually these systems require a human to guide and verify steps, clarify problems, etc. Are they claiming that the reinforcement model wasn’t given any inputs, tools, guidance, or training data from humans?
- footaThey should feed it the classification of finite simple groups and get it to simplify it/turn it more constructive.
- alansaberAI isn't going to supercharge science but I wouldn't be as dismissive as other posters here.
- armanjuseless fact: there is no mention of "gpt" in this article. the ai is referred to as "An internal OpenAI model".
- globulus2023In the article there is a diagram of the “square grid” arrangement that achieves approximately 2n points separated by unit distance.Can anyone point me to a diagram of what the newly found solution looks like?
- taimurshasanI wonder how much this cost vs a Math Professor or a team of Math Professors.
- globulus2023In the article there is a nice clear diagram of the “square grid” arrangement that was previously thought to be optimal.Can anyone point me to a diagram of the newly found optimal arrangement?
- famouswafflesAnother entry in a growing list from the last couple of months (interestingly, mostly OpenAI):
  1. Erdos 1196, GPT-5.4 Pro - https://www.scientificamerican.com/article/amateur-armed-wit... There are a couple of other Erdos wins, but this was the most impressive prior to the thread in question. And it's completely unsupervised. Solution - https://chatgpt.com/share/69dd1c83-b164-8385-bf2e-8533e9baba...
  2. Single-minus gluon tree amplitudes are nonzero, GPT-5.2 - https://openai.com/index/new-result-theoretical-physics/
  3. Frontier Math Open Problem, GPT-5.4 Pro and others - https://epoch.ai/frontiermath/open-problems/ramsey-hypergrap...
  4. GPT-5.5 Pro - https://gowers.wordpress.com/2026/05/08/a-recent-experience-...
  5. Claude's Cycles, Claude Opus 4.6 - https://www-cs-faculty.stanford.edu/~knuth/papers/claude-cyc...
- phkahlerI would have thought a triangular grid works better than a grid of squares. You get ~3n links vs ~2n for the square grid. Curious what the AI came up with.
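The square-versus-triangular comparison in the comment above is easy to check by brute force. The following is a minimal sketch, not the construction from the article: it assumes a k-by-k patch of each lattice (a rhombus-shaped patch for the triangular one), and the helper names are made up for illustration. Lattice points are stored as integer coordinates so that the unit-distance test is exact.

```python
from itertools import combinations


def count_unit_pairs(points, norm_sq):
    """Count unordered pairs of lattice points at squared distance exactly 1."""
    return sum(
        1
        for (a1, b1), (a2, b2) in combinations(points, 2)
        if norm_sq(a1 - a2, b1 - b2) == 1
    )


def square_norm_sq(da, db):
    # Square lattice with basis vectors (1, 0) and (0, 1).
    return da * da + db * db


def triangular_norm_sq(da, db):
    # Triangular lattice with basis vectors (1, 0) and (1/2, sqrt(3)/2);
    # the cross term da * db comes from the 60 degree angle between them.
    return da * da + da * db + db * db


def patch(k):
    """Integer coordinates (a, b) of a k-by-k patch, 0 <= a, b < k."""
    return [(a, b) for a in range(k) for b in range(k)]


k = 10
points = patch(k)
n = len(points)
square_pairs = count_unit_pairs(points, square_norm_sq)
triangular_pairs = count_unit_pairs(points, triangular_norm_sq)

print(f"n={n} square={square_pairs} triangular={triangular_pairs}")
# prints: n=100 square=180 triangular=261
```

For this patch the counts are 2k(k-1) and (k-1)(3k-1), so they approach 2n and 3n as the comment says. Both are still linear in n; the faster-than-linear growth quoted elsewhere in the thread comes from a rescaled grid, which this sketch does not attempt.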
- momo26I'm curious: disproving something by giving a counterexample is kind of easy. But can the model really prove something correctly and rigorously? Because right now it seems like all of its knowledge is based on existing things, and none of them can prove a myth.
- oscordCan it model a sustainable economy model, with human happiness and fulfilment indexes and planet preservation focus? Current capitalism and the red thing are so tired!
- zuzululuThis topic and discussion is out of my league. What is the implication here? LLMs aren't a dead end?
- SubiculumCodeI wonder whether there will be progress in string theory from these kinds of applications of AI.
- yusufozkan"The proof came from a general-purpose reasoning model, not a system built specifically to solve math problems or this problem in particular, and represents an important milestone for the math and AI communities."
- solomatovHow central is this in discrete geometry? Could anyone with knowledge of the field reply?
- sinuhe69How did they jump from finding counter-examples (disproof) to a proof?
- auggieroseWhich model did this? Is it available to the public?
- _heimdallAs this becomes more common it makes me wonder where the LLM ends and the harness begins. The underlying model may still effectively be a stochastic parrot, but used properly that can do impressive things and the various harnesses have been getting better and better at automating the use of said parrot.
- alsetmusic"AI is about to start taking a very serious role in the creative parts of research, and most importantly AI research itself. While this progress is not unexpected, it reinforces the urgency we feel about understanding this next phase of AI development, the challenges of aligning very intelligent systems, and the future of human-AI collaboration." I find this hyperbolic, but ya gotta juice up the upcoming IPO. I hate that they took an interesting announcement and reminded me why I hate tech and our society at the end.
- pizzaoCan someone explain to me what is their "prompting-scaffolding" to make it work ?
- seydorcan the AI please tell us what to do now that all knowledge work will become unemployment?
- ai_fry_ur_brainI'm convinced they target these pure math problems because math is very opaque to the masses, and therefore they can use math "discoveries" as a way to make an LLM seem more impressive than it is. Everything is a grift. What are the odds that if they ran the same prompt from scratch, with the same context and instructions, it would arrive at the same answer? Unlikely. I think it's more likely that this is a 1:500000 chance and OpenAI can afford to brute force this result and justify the expense for marketing.
- AlexToaniAISo nowadays AI may draw on different fields and get lots of breakthroughs that humans might not be able to achieve! That's nuts!
- aussieguy1234So we've got the proof, what are the practical applications of this?
- dev1ycanWouldn't surprise me if they're just paying math geniuses to do math research and attribute it to AI models.
- 3422817Nice. By the year 2100 200 Erdos problems will have been solved by AI. Let's build more data centers.
- DiogenesKynikosCalling all LLM skeptics. How did a "stochastic parrot" just disprove an Erdős conjecture that mathematicians couldn't figure out for decades?
- KyeIs this something that can be made explainable to someone without any of the relevant background, or is this one of those things where all that background is needed to understand it? Because I have no idea what's going on here, but would like to.
- catigulaEvery time I interact even with OpenAI's pro model, I am forced to come to the conclusion that anything outside the domain of specific technical problems is almost completely hopeless outside of a simple enhanced search and summary engine. For example, these machines, if scaling intellect so fiercely that they are solving bespoke mathematics problems, should be able to generate mundane insights or unique conjectures far below the level of intellect required for highly advanced mathematics - and they simply do not. Ask a model to give you the rundown and theory on a specific pharmacological substance, for example. It will cite the textbook and meta-analyses it pulls, but be completely incapable of any bespoke thinking on the topic. A random person pursuing a bachelor's in chemistry can do this. Anything at all outside of the absolute facts, even the faintest conjecture, feels completely outside of their reach.
- empath75Important note: this was not done with a special mathematics harness or specialized workflow.
- overgardI think it's worth being skeptical of this.. there's a way too common pattern of "AI Lab Shows AI Doing Something Only Humans Can Do" only for a bunch of important caveats and limitations to be discovered after the initial hype. And of course, the correction never seems to be as viral as the hype. I'll believe it when a mathematician actually reads the 100+ pages of reasoning.
- arsan87neato. can we do anything with this newfound knowledge or is this mathematical sports? can we please put these groundbreaking AIs to work on actual problems humans have?
- somewhereoutthThe real test would be if an LLM makes an important conjecture.
- analognoiseBack when “term rewriting” was “AI”, multiple math tools were released that took known math facts and did tricks like uncovering new integrals - apply the pattern in some depth in a tree, see what pops out. What was discovered were numerous mistakes in the published literature on the subject. “New math! AI!” No, just mechanical application of rules, human mistakes. There were things that were theorized, but couldn’t be exhaustively checked until computers were bigger. Once again, a tool is applied, it has the AI label - it’s progress! But it isn’t something new. It’s just an LLM. There’s a consistent underappreciation of AI (and math, honestly), but watching soulless AI mongers declare that their toy has created the new is something of a new low; uninspired, failed creatives, without rhyme or context; this is a bigger version of declaring that your spell checker has created new words. The result is more impressive than what was done with tables of integrals and SAINT in 1961, sure. Apparently if you add a “temperature” knob to a text predictor, otherwise sane individuals piss themselves and call it new. Then again I thought NFTs, crypto, and the Metaverse were stupid, so what do I know.
- neuroelectronI wonder if it has anything to do with the fact that AI is a grid of grid-calculating grids. It seems like it would be especially well suited to finding solutions about grids. That is until you consider the fact that even 1 trillion billion grids is still not anywhere close to an infinite grid. So, probably slop.
- iLoveOncallAbsolutely no proof that any LLM actually found the result, and just a mention of an "internal model". Served to you by one of the biggest liars in the world. Why would anyone believe this to be true even for a split second?
- bradleykingzok. so what are the implications for math
- mrcwinnThe back and forth in this discussion reveals to me we are sorting through a kind of philosophical debate about intelligence. That alone tells me LLMs are doing something novel.
- brcmthrowawayEnd times are approaching
- ninjagooMany folks are upset about the supplanting of human effort by AI. Umanwizard voiced this valid concern below [1], but his comment got downvoted, unfairly, IMHO, instead of just being addressed. So putting out at least my response as its own top-level comment for visibility. "the closer the expertise you spent your whole life building is to being worthless." Perhaps it is time for life to be considered intrinsically valuable, instead of being "worthy" only based on output or capability. Disability, animal and environmental advocates have been fighting for this for a long time. Not too long ago women and minorities were in the same boat. Even now, there are many advocating and fighting for a return to the dark old days. "Along with all the rest of what humans find meaningful and fulfilling." Some humans. Many are content to enjoy simply existing, and the beauty of life and the universe around us. Just like many non-scientists today enjoy and benefit from the work of scientists, tomorrow too many will enjoy learning from, and applying, the coming advancements and leaps in many fields. And those of a scientist or other research-type mindset? No doubt they will contribute meaningfully by studying the frontier, noting what remains unanswered, and then advancing the frontier, just like researchers do today; just because scientists in the past solved many questions doesn't mean that there aren't any questions to answer today. IMHO, AI means that the frontier expands faster, not that it is obliterated. Even AI cannot overcome the laws and limitations of physics/universe: even Dyson spheres only capture the energy of one star, thus setting a limit on the amount of compute, and thereby a limit on intelligence. And we are a loooong way from a Dyson sphere. [1] https://news.ycombinator.com/item?id=48215122
- fromMarsSeems rather depressing to me but maybe I am a Luddite.
- voooduuuuuAsk an LLM to invent a new word and post it here. You will see that it simply combines words already in the training data.
- atleastoptimalTo all AI skeptics: What is preventing AI from continuing to improve until it is absolutely better than humans at any mental task? If we compare AI now vs 2022 the difference is outstandingly stark. Do you believe this improvement will just stop before it eclipses all humans in everything we care about?
- cwmooreFrom the meandering and self-loving article: “For decades, it was widely believed that this rate was essentially the best possible, and no construction could improve significantly over the square grid. In technical terms, Erdős conjectured an upper bound of n^{1+o(1)}, in which the additional o(1) indicates a term tending to 0 with n. Our new result disproves this conjecture. More precisely, for infinitely many values of n, the proof constructs configurations of n points with at least n^{1+δ} unit-distance pairs, for some fixed exponent δ > 0. (The original AI proof does not give an explicit δ, but a forthcoming refinement due to Princeton mathematics professor Will Sawin has shown one can take δ = 0.014.)”
- reactordevI dunno, I'm skeptical without proof. I've had the MAX+ plan for a while and I'm sorry, the quality between GPT vs Claude is night and day difference. Claude understands. GPT stumbles over every request I give it.