How many ARC-AGI tasks will the top 3 AI entries solve by November 10, 2024?
790-793 • 25%
794-796 • 25%
797-799 • 25%
800 • 25%
Official ARC-AGI competition results
Independent NYU Study Finds 98.7% of ARC-AGI Tasks Solvable by Humans Ahead of November 10 Competition Deadline
Sep 4, 2024, 05:53 PM
Researchers at New York University (NYU) conducted an independent study of the ARC-AGI tasks and found that 98.7% of the public tasks are solvable by humans: 790 of the 800 tasks were completed by at least one Mechanical Turk worker. The finding underscores the gap between human and AI performance on these tasks. The ARC-AGI competition, which challenges participants to develop AI capable of solving the tasks, ends on November 10, 2024. The researchers aim for future iterations to reach 100% solvability and to establish human baselines on the private test set. Many of the competition's current high-scoring entries rely on basic brute-force program search, sketched below.
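For readers unfamiliar with brute-force program search, here is a minimal, hypothetical sketch in Python of the general idea: enumerate compositions of a small set of grid transforms up to a depth limit and return the first program consistent with every training pair. The five primitives, the grid encoding, and the depth limit are all illustrative assumptions, not the actual competition code, whose DSLs are far richer.

```python
from itertools import product

# Grids are tuples of tuples of ints (ARC colors 0-9).
# These five primitives are a toy DSL chosen for illustration.
def identity(g):  return g
def rotate90(g):  return tuple(zip(*g[::-1]))           # rotate clockwise
def flip_h(g):    return tuple(row[::-1] for row in g)  # mirror left-right
def flip_v(g):    return g[::-1]                        # mirror top-bottom
def transpose(g): return tuple(zip(*g))

PRIMITIVES = [identity, rotate90, flip_h, flip_v, transpose]

def search(train_pairs, max_depth=3):
    """Enumerate compositions of primitives up to max_depth and
    return the first program consistent with every training pair."""
    for depth in range(1, max_depth + 1):
        for program in product(PRIMITIVES, repeat=depth):
            def apply(g, prog=program):
                for step in prog:
                    g = step(g)
                return g
            if all(apply(x) == y for x, y in train_pairs):
                return apply
    return None

# Toy task: the output is the input rotated 180 degrees.
train = [
    (((1, 2), (3, 4)), ((4, 3), (2, 1))),
    (((5, 0), (0, 5)), ((5, 0), (0, 5))),
]
solver = search(train)
if solver is not None:
    print(solver(((7, 8), (9, 0))))  # -> ((0, 9), (8, 7))
```

The search here finds rotate90 composed with itself at depth 2. Real entries scale the same enumerate-and-check loop to thousands of primitives, which is why it is described as brute force: nothing guides the search toward a solution beyond consistency with the training pairs.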