Will the top-scoring ARC-AGI entry use brute-force program search by November 10, 2024?
Yes • 50%
No • 50%
Resolution source: Official ARC-AGI competition results
Independent NYU Study Finds 98.7% of ARC-AGI Tasks Solvable by Humans Ahead of November 10 Competition Deadline
Sep 4, 2024, 05:53 PM
Researchers at New York University (NYU) conducted an independent study of the public ARC-AGI tasks and found that 98.7% of them are solvable by humans: 790 of the 800 tasks were completed by at least one Mechanical Turk worker. This finding underscores the persistent gap between human and AI performance on these tasks. The ARC-AGI competition, which challenges participants to develop AI capable of solving them, ends on November 10, 2024. The researchers aim for future task sets to reach 100% human solvability and to establish human baselines on the private test set. Many of the competition's current high-scoring entries rely on basic brute-force program search.