Will the top-scoring ARC-AGI entry use brute-force program search by November 10, 2024?
Yes • 50%
No • 50%
Resolution source: Official ARC-AGI competition results
Independent NYU Study Finds 98.7% of ARC-AGI Tasks Solvable by Humans Ahead of November 10 Competition Deadline
Sep 4, 2024, 05:53 PM
Researchers at New York University (NYU) conducted an independent study of the public ARC-AGI tasks and found that 98.7% of them are solvable by humans: 790 of the 800 tasks were completed by at least one Mechanical Turk worker. This finding underscores the persistent gap between human and AI performance on these tasks. The ARC-AGI competition, which challenges participants to develop AI capable of solving them, ends on November 10, 2024. The researchers aim for future task sets to reach 100% human solvability and to establish human baselines on the private test set. Many of the competition's current high-scoring entries rely on basic brute-force program search.