Comments (140)
- xeeeeeeeeeeenu:
  > no prior solutions found.
  This is no longer true; a prior solution has just been found [1], so the LLM proof has been moved to Section 2 of Terence Tao's wiki [2].
  [1] https://www.erdosproblems.com/forum/thread/281#post-3325
  [2] https://github.com/teorth/erdosproblems/wiki/AI-contribution...
- doctoboggan:
  Can anyone give a little more color on the nature of Erdős problems? Are these problems that many mathematicians have spent years tackling with no result? Or do some of the problems evade scrutiny and go unattempted for most of the time?
  EDIT: After reading a link someone else posted to Terence Tao's wiki page, he has a paragraph that somewhat answers this question:
  > Erdős problems vary widely in difficulty (by several orders of magnitude), with a core of very interesting, but extremely difficult problems at one end of the spectrum, and a "long tail" of under-explored problems at the other, many of which are "low hanging fruit" that are very suitable for being attacked by current AI tools. Unfortunately, it is hard to tell in advance which category a given problem falls into, short of an expert literature review. (However, if an Erdős problem is only stated once in the literature, and there is scant record of any followup work on the problem, this suggests that the problem may be of the second category.)
  From here: https://github.com/teorth/erdosproblems/wiki/AI-contribution...
- pessimist:
  From Terry Tao's comments in the thread:
  > Very nice! ... actually the thing that impresses me more than the proof method is the avoidance of errors, such as making mistakes with interchanges of limits or quantifiers (which is the main pitfall to avoid here). Previous generations of LLMs would almost certainly have fumbled these delicate issues. ... I am going ahead and placing this result on the wiki as a Section 1 result (perhaps the most unambiguous instance of such, to date)
  The pace of change in math is going to be something to watch closely. Many minor theorems will fall. Next major milestone: can LLMs generate useful abstractions?
- sequin:
  FWIW, I just gave DeepSeek the same prompt and it solved it too (much faster than the 41m of ChatGPT). I then gave both proofs to Opus and it confirmed their equivalence.
  The answer is yes. Assume, for the sake of contradiction, that there exists an \(\epsilon > 0\) such that for every \(k\), there exists a choice of congruence classes \(a_1^{(k)}, \dots, a_k^{(k)}\) for which the set of integers not covered by the first \(k\) congruences has density at least \(\epsilon\).
  For each \(k\), let \(F_k\) be the set of all infinite sequences of residues \((a_i)_{i=1}^\infty\) such that the uncovered set from the first \(k\) congruences has density at least \(\epsilon\). Each \(F_k\) is nonempty (by assumption) and closed in the product topology (since it depends only on the first \(k\) coordinates). Moreover, \(F_{k+1} \subseteq F_k\) because adding a congruence can only reduce the uncovered set. By the compactness of the product of finite sets, \(\bigcap_{k \ge 1} F_k\) is nonempty.
  Choose an infinite sequence \((a_i) \in \bigcap_{k \ge 1} F_k\). For this sequence, let \(U_k\) be the set of integers not covered by the first \(k\) congruences, and let \(d_k\) be the density of \(U_k\). Then \(d_k \ge \epsilon\) for all \(k\). Since \(U_{k+1} \subseteq U_k\), the sets \(U_k\) are decreasing and periodic, and their intersection \(U = \bigcap_{k \ge 1} U_k\) has density \(d = \lim_{k \to \infty} d_k \ge \epsilon\). However, by hypothesis, for any choice of residues, the uncovered set has density \(0\), a contradiction.
  Therefore, for every \(\epsilon > 0\), there exists a \(k\) such that for every choice of congruence classes \(a_i\), the density of integers not covered by the first \(k\) congruences is less than \(\epsilon\).
  \(\boxed{\text{Yes}}\)
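As an aside, the finite stages \(U_k\) in this argument are easy to play with numerically: for finitely many congruences the uncovered set is periodic with period \(\mathrm{lcm}(m_1, \dots, m_k)\), so its density \(d_k\) can be computed exactly. Below is a minimal Python sketch of that computation; the moduli and residues used are the textbook covering system on 2, 3, 4, 6, 12, chosen purely for illustration, not the moduli of the Erdős problem itself.

```python
from math import lcm

def uncovered_density(congruences):
    """Density of integers not covered by any congruence n ≡ a (mod m).

    The uncovered set of finitely many congruences is periodic with period
    lcm(m_1, ..., m_k), so the density is exact: count uncovered residues
    in one period and divide by the period length.
    """
    period = 1
    for _, m in congruences:
        period = lcm(period, m)
    uncovered = sum(
        1 for n in range(period)
        if all(n % m != a % m for a, m in congruences)
    )
    return uncovered / period

# Classic covering system on moduli 2, 3, 4, 6, 12: every integer is covered.
print(uncovered_density([(0, 2), (0, 3), (1, 4), (5, 6), (7, 12)]))  # 0.0
# A different residue choice on the same moduli leaves positive density.
print(uncovered_density([(0, 2), (0, 3), (0, 4), (0, 6), (0, 12)]))  # 0.333...
```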
- dust42:
  Personally, I'd prefer if the AI models would start with a proof of their own statements. Time and again, SOTA frontier models told me: "Now you have 100% correct code ready for production in enterprise quality." Then I run it and it crashes. Or maybe the AI is just being tongue-in-cheek?
  Case in point: I just wanted to give z.ai a try and buy some credits. I used Firefox with uBlock and the payment didn't go through. I tried again with Chrome and no adblock, but now there is an error: "Payment Failed: p.confirmCardPayment is not a function." The irony is that this is certainly vibe-coded with z.ai, which tries to sell me on how good it is but then can't conclude the sale.
  And we will get lots more of this in the future. LLMs are a fantastic new technology, but even more fantastically over-hyped.
- carbocation:
  The erdosproblems thread itself contains comments from Terence Tao: https://www.erdosproblems.com/forum/thread/281
- energy123:
  A surprising % of these LLM proofs are coming from amateurs.
  One wonders if some professional mathematicians are instead choosing to publish LLM proofs without attribution for career purposes.
- redbluered:
  Has anyone verified this? I've "solved" many math problems with LLMs, with the LLMs giving full confidence in subtly or significantly incorrect solutions.
  I'm very curious here. The OpenAI memory orders and the claims about capacity limits restricting access to better models are interesting too.
- ashleyn:
  I guess the first question I have is whether these problems solved by LLMs are just low-hanging fruit that human researchers either didn't get around to or didn't show much interest in, or if there's some actual beef here to the idea that LLMs can independently conduct original research and solve hard problems.
- niemandhier:
  Is there explainability research for this type of model application? E.g. a sparse autoencoder, or something similar but more modern.
  I would love to know which concepts are active in the deeper layers of the model while generating the solution. Is there a concept of "epsilon" or "delta"? What are their projections on each other?
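For concreteness, here is a minimal sketch of the kind of sparse autoencoder probe this comment is asking about, assuming access to hidden activations from the layer of interest. The dimensions, the L1 coefficient, and the training setup are illustrative assumptions, not taken from any particular interpretability paper.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Toy sparse autoencoder over a model's hidden activations."""

    def __init__(self, d_model: int, d_dict: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_dict)   # activations -> feature activations
        self.decoder = nn.Linear(d_dict, d_model)   # features -> reconstruction

    def forward(self, x: torch.Tensor):
        features = torch.relu(self.encoder(x))      # non-negative, ideally sparse
        reconstruction = self.decoder(features)
        return reconstruction, features

def sae_loss(x, reconstruction, features, l1_coeff: float = 1e-3):
    # Reconstruction error plus an L1 penalty that pushes features toward sparsity.
    return ((reconstruction - x) ** 2).mean() + l1_coeff * features.abs().mean()

# Usage sketch: collect hidden states from a deep layer, train the SAE on them,
# then check which dictionary features fire on tokens like "epsilon" or "delta"
# and compare the corresponding decoder directions (e.g. by cosine similarity).
sae = SparseAutoencoder(d_model=4096, d_dict=32768)  # illustrative sizes
activations = torch.randn(8, 4096)                   # stand-in for real activations
recon, feats = sae(activations)
loss = sae_loss(activations, recon, feats)
```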
- wewxjfq:
  The LLMs that take 10 attempts to un-zero-width a <div>, telling me that every single change totally fixed the problem, are cracking the hardest math problems again.
- beders:
  Has anyone confirmed the solution is not in the training data? Otherwise it's just a bit of information retrieval, LLM-style. No intelligence necessary.
- a_tartaruga:
  Out of curiosity, why has the LLM math-solving community been focused on the Erdős problems over other open problems? Are they of a certain nature where we would expect LLMs to be especially good at solving them?
- dernett:
  This is crazy. It's clear that these models don't have human intelligence, but it's undeniable at this point that they have _some_ form of intelligence.
- renewiltord:
  It's funny. In some kind of twisted variant of Cunningham's Law we have:
  > the best way to find a previous proof of a seemingly open problem on the internet is not to ask for it; it's to post a new proof
- mikert89:
  I have 15 years of software engineering experience across some top companies. I truly believe that AI will far surpass human beings at coding, and more broadly at logic work. We are very close.
- logicallee:
  How did they do it? Was a human using the chat interface? Did they just type out the problem and immediately, on the first reply, receive a complete solution (one-shot), or what was the human's role? What was ChatGPT's thinking time?
- jrflowers:
  Narrator: The solution had already appeared several times in the training data.
- IAmGraydon:
  This is showing as unresolved here, so I'm assuming something was retracted: https://mehmetmars7.github.io/Erdosproblems-llm-hunter/probl...
- magicalist:
  Funny seeing Silicon Valley bros commenting "you're on fire!" to Neel when it appears he copied and pasted the problem verbatim into ChatGPT and it did literally all the other work here: https://chatgpt.com/share/696ac45b-70d8-8003-9ca4-320151e081...
- ares623:
  This must be what it feels like to be a CEO and have someone tell you they solved coding.