Epoch confirms GPT-5.4 Pro solved a FrontierMath open problem
TL;DR Highlight
GPT-5.4 Pro is the first model to solve a FrontierMath open problem (a Ramsey-style hypergraph problem); Opus 4.6 and Gemini 3.1 Pro also solved it afterward
Who Should Read
Researchers tracking AI capability progress; ML engineers interested in math AI benchmarks
Core Mechanics
- GPT-5.4 Pro solved an Epoch AI FrontierMath open problem; the problem contributor confirmed the solution, which is to be published
- After a general scaffold was deployed, Opus 4.6 (max), Gemini 3.1 Pro, and GPT-5.4 (xhigh) also solved the same problem
- HN debate: arguments over whether this result counters the claim that "LLMs cannot produce novel ideas" remain ongoing
Evidence
- Epoch AI official confirmation: problem contributor Prof. Will Brian (UNC Charlotte) validated the solution; publication is planned
- Opus 4.6 consumed ~250K tokens; token consumption may serve as a rough proxy for problem difficulty
How to Apply
- Reference FrontierMath Open Problems (epoch.ai/frontiermath/open-problems) when benchmarking AI model math capabilities
- Use unsolved problem challenges rather than standard benchmarks to assess real capability limits of frontier models
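One way to act on the points above is to keep a small log of which models solved a given open problem and how many tokens each run used. The sketch below is illustrative only: `SolveRecord` and `rank_by_tokens` are hypothetical names, and the only token figure included is the approximate ~250K reported for Opus 4.6; the other counts are left as unknown rather than guessed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SolveRecord:
    """One model's attempt at an open problem (hypothetical structure)."""
    model: str
    solved: bool
    tokens_used: Optional[int]  # None when no token count was reported


def rank_by_tokens(records: List[SolveRecord]) -> List[SolveRecord]:
    """Return solved runs with a known token count, most tokens first."""
    known = [r for r in records if r.solved and r.tokens_used is not None]
    return sorted(known, key=lambda r: r.tokens_used, reverse=True)


# Entries taken from this summary; unreported token counts stay None.
records = [
    SolveRecord("GPT-5.4 Pro", True, None),
    SolveRecord("Opus 4.6 (max)", True, 250_000),  # approximate figure
    SolveRecord("Gemini 3.1 Pro", True, None),
    SolveRecord("GPT-5.4 (xhigh)", True, None),
]

for record in rank_by_tokens(records):
    print(f"{record.model}: ~{record.tokens_used} tokens")
```

With more problems and more reported token counts, the same log could be grouped per problem to compare relative difficulty, treating token use as a rough signal only.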
Terminology
FrontierMath: Benchmark of extremely challenging math problems that stump even professional mathematicians (created by Epoch AI)
Ramsey Problem: Combinatorics problem of constructing the largest graph that does not satisfy a certain condition
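To make the Ramsey Problem entry concrete, here is a minimal sketch of the classic textbook case, not the hypergraph problem from FrontierMath. It brute-forces every red/blue coloring of the edges of a complete graph on n vertices and checks whether any coloring avoids a single-colored triangle. The function name `has_triangle_free_coloring` is an illustrative choice.

```python
from itertools import combinations, product


def has_triangle_free_coloring(n: int) -> bool:
    """True if the edges of the complete graph K_n can be 2-colored
    so that no triangle has all three edges the same color."""
    edges = list(combinations(range(n), 2))
    triangles = list(combinations(range(n), 3))
    # Try every assignment of color 0 or 1 to each edge.
    for colors in product((0, 1), repeat=len(edges)):
        color = dict(zip(edges, colors))
        if all(
            not (color[(a, b)] == color[(a, c)] == color[(b, c)])
            for a, b, c in triangles
        ):
            return True  # found a coloring that avoids the condition
    return False


print(has_triangle_free_coloring(5))  # True: 5 vertices can avoid it
print(has_triangle_free_coloring(6))  # False: 6 vertices cannot
```

The largest complete graph that avoids the condition here has 5 vertices, which is the well-known fact R(3,3) = 6. Brute force stops being feasible almost immediately as n grows, which is part of why open Ramsey-style problems are hard.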