AI's latest challenge: the Math Olympics
For four years, the computer scientist Trieu Trinh has been consumed by something of a meta-mathematical problem: how to build an AI model that solves geometry problems from the International Mathematical Olympiad, the annual competition for the world's most mathematically gifted high-school students.
Last week Dr. Trinh successfully defended his doctoral dissertation on this topic at New York University; this week, he described the results of his labor in the journal Nature. Named AlphaGeometry, the system solves Olympiad geometry problems at approximately the level of a human gold medalist.
While developing the project, Dr. Trinh pitched it to two research scientists at Google, and they brought him on as a resident from 2021 to 2023. AlphaGeometry joins Google DeepMind's fleet of AI systems, which are known for tackling big challenges. Perhaps the most famous is AlphaZero, a deep-learning algorithm that conquered chess in 2017. Mathematics is a harder problem, because the number of possible paths toward a solution is sometimes infinite; in chess it is always finite.
“I kept running into dead ends, going down the wrong path,” said Dr. Trinh, the project's lead author and driving force.
The paper's co-authors are Dr. Trinh's doctoral adviser, He He, at New York University; Yuhuai Wu, known as Tony, a co-founder of xAI (formerly at Google), who began exploring a similar idea independently in 2019; and Thang Luong, the principal investigator, and Quoc Le, both from Google DeepMind.
Dr. Trinh's persistence paid off. “We are not making incremental improvements,” he said. “We are making a giant leap, a giant breakthrough in terms of results.”
“Just don’t blow it out of proportion,” he said.
Dr. Trinh presented the AlphaGeometry system with a test set of 30 Olympiad geometry problems drawn from 2000 to 2022. The system solved 25; historically, over the same period, the average human gold medalist solved 25.9. Dr. Trinh also gave the problems to a system developed in the 1970s that was considered the strongest geometry theorem prover; it solved 10.
Over the past few years, Google DeepMind has pursued a number of projects investigating the application of AI to mathematics. More broadly in this research area, Olympiad mathematics problems have been adopted as a benchmark; OpenAI and Meta AI have achieved some results. For additional motivation, there is the IMO Grand Challenge, and in November a new challenge was announced: the Artificial Intelligence Mathematical Olympiad Prize, with a $5 million pot going to the first AI that wins Olympiad gold.
The AlphaGeometry paper begins with the argument that proving Olympiad theorems “represents a notable milestone in human-level automated reasoning.” Michael Barany, a historian of mathematics and science at the University of Edinburgh, said he wondered whether it was a meaningful mathematical milestone. “What the IMO is testing is very different from what creative mathematics looks like for most mathematicians,” he said.
Terence Tao, a mathematician at the University of California, Los Angeles – and the youngest Olympiad gold medalist, when he was 12 – said he thought AlphaGeometry was “good work” and that it had achieved “surprisingly strong results.” Fine-tuning an AI system to solve Olympiad problems may not improve its deep research skills, he said, but in this case the journey may prove more valuable than the destination.
As Dr. Trinh sees it, mathematical reasoning is just one type of reasoning, but it has the advantage of being easily verified. “Mathematics is the language of truth,” he said. “If you want to build AI, it's important to build a truth-seeking, reliable AI that you can trust,” especially for “safety critical applications.”
Proof of concept
AlphaGeometry is a “neuro-symbolic” system. It combines a neural net language model (good at artificial intuition, like ChatGPT but smaller) with a symbolic engine (good at artificial reasoning, like a logical calculator of sorts).
And it is custom-built for geometry. “Euclidean geometry is a good test bed for automated reasoning, because it constitutes a self-contained domain with fixed rules,” said Heather Macbeth, a geometer at Fordham University and an expert in computer-verified reasoning. (As a teenager, Dr. Macbeth won two IMO medals.) AlphaGeometry “seems to be good progress,” she said.
The system has two particularly innovative features. First, the neural net is trained only on algorithmically generated data – 100 million geometric proofs – without using any human examples. The use of synthetic data created from scratch overcame a barrier to automated theorem-proving: the lack of human-written proofs translated into a machine-readable language to serve as training data. “To be honest, initially I had some doubts about how successful this would be,” Dr. He said.
Second, once AlphaGeometry was turned loose on a problem, the symbolic engine began solving; if it got stuck, the neural net suggested ways to extend the proof argument. The loop continued until a solution was found, or until time ran out (four and a half hours). In the language of mathematics, this augmentation process is called “auxiliary construction.” Add a line, bisect an angle, draw a circle – that is how mathematicians, student or elite, tinker and try to gain purchase on a problem. In this system, the neural net learned to make auxiliary constructions, and in a humanlike way. Dr. Trinh compared it to wrapping a rubber band around a stubborn jar lid to help the hand get a better grip.
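The alternation described above can be illustrated with a toy sketch in Python. It is not AlphaGeometry's code: the facts are plain strings, the "symbolic engine" is a few lines of forward chaining over hand-written rules, and the "neural net" is replaced by a hypothetical stub, propose_midpoint, that always suggests the same construction. The names RULES, GOAL, deduce and solve, and the one-second budget, are assumptions made up for the illustration.

```python
import time

# Each rule is (set of premises, conclusion). Hand-written for this toy example only.
RULES = [
    (frozenset({"AB = AC", "M is midpoint of BC"}),
     "triangle ABM congruent to triangle ACM"),
    (frozenset({"triangle ABM congruent to triangle ACM"}),
     "angle ABC = angle ACB"),
]
GOAL = "angle ABC = angle ACB"


def deduce(facts, rules):
    """Toy symbolic engine: apply rules until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


def propose_midpoint(known):
    """Hypothetical stand-in for the language model: one fixed suggestion."""
    aux = "M is midpoint of BC"
    return None if aux in known else aux


def solve(premises, goal, rules, propose, budget_seconds=1.0):
    """Alternate deduction and proposed constructions until the goal is
    derived, the proposer has nothing left, or the time budget runs out."""
    deadline = time.monotonic() + budget_seconds
    known = set(premises)
    constructions = []
    while True:
        known = deduce(known, rules)
        if goal in known:
            return True, constructions
        if time.monotonic() >= deadline:
            return False, constructions
        suggestion = propose(known)
        if suggestion is None:
            return False, constructions
        constructions.append(suggestion)  # the "auxiliary construction"
        known.add(suggestion)


solved, aux = solve({"AB = AC"}, GOAL, RULES, propose_midpoint)
print(solved, aux)  # True ['M is midpoint of BC']
```

In the sketch, the engine alone cannot get from "AB = AC" to the equal base angles; only after the stub adds the midpoint does deduction reach the goal. The real system differs in every detail, but the division of labor is the one the researchers describe: deduction does the proving, and suggestions unstick it.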
“This is a very interesting proof of concept,” said Christian Szegedy, a co-founder of xAI who was previously at Google. But it “leaves a lot of questions open,” he said, and it is “not easily generalizable to other domains and other areas of mathematics.”
Dr. Trinh said he would try to generalize the system across mathematical fields and beyond. He said he wanted to step back and consider the “common underlying principle” of all types of reasoning.
Stanislas Dehaene, a cognitive neuroscientist at the Collège de France who has a research interest in foundational geometric knowledge, said he was impressed by AlphaGeometry's performance. But he observed that “it does not 'see' anything about the problems it solves” – rather, it takes in only logical and numerical encodings of the pictures. (The illustrations in the paper are for the benefit of the human reader.) “There is no spatial notion of circles, lines and triangles that the system learns to manipulate,” Dr. Dehaene said. The researchers agreed that a visual component may be valuable; Dr. Luong said it could be added, possibly within the year, using Google's Gemini, a “multimodal” system that accommodates both text and images.
In early December, Dr. Luong visited his old high school in Ho Chi Minh City, Vietnam, and showed AlphaGeometry to his former teacher and IMO coach, Le Ba Khanh Trinh. Dr. Le was the top gold medalist in the 1979 Olympiad and won a special prize for his elegant geometry solution. Dr. Le analyzed one of AlphaGeometry's proofs and found it remarkable but unsatisfying, Dr. Luong recalled: “He found it mechanical, and said it lacked soul, lacked the beauty of the solution that he wants.”
Dr. Trinh had earlier asked Evan Chen, a mathematics doctoral student at MIT – and an IMO coach and Olympiad gold medalist – to check some of AlphaGeometry's work. It was correct, Mr. Chen said, and he added that he was amazed at how well the system found solutions.
“I would like to know how the machine is producing it,” he said. “But, I mean, for that matter, I'd also like to know how humans come up with solutions.”