DeepNewz Markets

Market

Which research paper will cite the PlanSearch algorithm the most by the end of 2024?

AI•California Institute of Technology•Northeastern University•Cursor AI•PlanSearch•LiveCodeBench

Resolution / Starting Odds

A paper from Caltech • 25%

A paper from Northeastern University • 25%

A paper from Cursor AI • 25%

Other • 25%

Resolution source: research paper citation counts from sources such as Google Scholar or Semantic Scholar

Story

Scale AI's PlanSearch Boosts Claude 3.5 Code Generation to 77.0%

Sep 8, 2024, 03:06 PM

Scale AI, in collaboration with the California Institute of Technology, Northeastern University, and Cursor AI, has introduced PlanSearch, a test-time compute method presented as state of the art (SOTA). The algorithm generates high-level plans in natural language and uses them to guide code generation by a large language model (LLM), which increases the diversity of candidate solutions and makes the search more efficient. The reported gains are in creativity, solution diversity, and the quality of the generated code. Notably, Claude 3.5 with PlanSearch achieved a pass@200 of 77.0% on LiveCodeBench, compared with the best score achieved without the search algorithm (pass@1 = 41.4%); the two figures use different sample budgets, 200 attempts versus 1. The optillm library, an optimizing inference proxy, implements the core idea of PlanSearch. The work, involving researchers E Wang, F Cassano, C Wu, and Y Bai, targets the inference-time capabilities and performance of LLMs.
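For readers unfamiliar with the pass@k figures quoted in the story, the sketch below shows the unbiased estimator commonly used in code-generation evaluation: given n sampled solutions of which c pass all tests, pass@k = 1 - C(n-c, k) / C(n, k). The story does not say how the reported scores were computed, so treating them as produced by this estimator is an assumption, and the function names `pass_at_k` and `mean_pass_at_k` are hypothetical.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the chance that at least one of k samples is correct.

    n: total samples drawn for one problem
    c: how many of those samples passed all tests
    k: sample budget being scored (k <= n)
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) counts the k-subsets made only of failing samples;
    # it is 0 when fewer than k samples failed, giving pass@k = 1.0.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems; results holds (n, c) per problem."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


if __name__ == "__main__":
    # Illustrative numbers only, not data from the story.
    toy_results = [(4, 1), (4, 0), (4, 4)]
    print(mean_pass_at_k(toy_results, k=1))
    print(mean_pass_at_k(toy_results, k=2))
```

Because pass@k can only grow as k increases, a pass@200 score and a pass@1 score are not directly comparable: part of the gap between 77.0% and 41.4% reflects the larger sample budget.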

Similar markets

In which field will PaperQA2 be most cited by the end of 2024?

Biology • 25%

Chemistry • 25%

Physics • 25%

Other • 25%

How many academic papers will cite Gemma 2 Series by Dec 31, 2024?

Less than 50 • 33%

50-100 • 33%

More than 100 • 34%

Will the study be cited in at least 50 papers by end of 2025?

Yes • 50%

No • 50%

Which research institution will publish the most cited paper on Long COVID in 2024?

UCSF • 25%

Harvard • 25%

Johns Hopkins • 25%

Other • 25%

Which research paper will be the most cited in relation to Google DeepMind's MoNE framework by end of 2024?

G. Jain's paper • 25%

N. Hegde's paper • 25%

A. Kusupati's paper • 25%

A. Nagrani's paper • 25%

How many research papers will cite Promptriever by December 31, 2024?

Less than 50 • 25%

50 to 100 • 25%

100 to 150 • 25%

More than 150 • 25%

What will be the next significant feature or improvement added to PlanSearch by June 30, 2025?

Speed Optimization • 25%

Accuracy Improvement • 25%

Resource Efficiency • 25%

Other • 25%

Which pathology foundation model will be most cited in academic papers by end of 2024?

UNI • 25%

Virchow • 25%

Prov-GigaPath • 25%

Other • 25%

Will MindSearch be used in at least one published academic paper by the end of 2024?

Yes • 50%

No • 50%

How many research papers will cite the sim-to-real method for in-hand manipulation by end of 2024?

Less than 10 • 25%

10 to 50 • 25%

51 to 100 • 25%

More than 100 • 25%

How many research papers using the Virchow2G model will be published by August 15, 2025?

0-10 papers • 25%

11-20 papers • 25%

21-30 papers • 25%

31+ papers • 25%

How many citations will the AI Scientist system's paper (2408.06292) receive by end of 2024?

Less than 50 • 25%

50-100 • 25%

101-200 • 25%

More than 200 • 25%

Will another major AI company adopt PlanSearch by the end of 2024?

Will Claude 3.5 achieve a pass@200 score of 80% or higher on LiveCodeBench by the end of 2024?

Will PlanSearch be integrated into a publicly available open-source LLM library by mid-2024?

Which large language model will achieve the highest pass@200 score on LiveCodeBench using PlanSearch by the end of 2024?
