DeepNewz Markets

Market

Will AgentHarm dataset receive a major update by June 30, 2025?

AI Safety Institute • GraySwanAI • AgentHarm

Resolution / Starting Odds

Yes • 50%

No • 50%

Resolution source: official announcements from the AI Safety Institute or GraySwanAI

Story

AI Safety Institute Releases AgentHarm to Measure LLM Agent Harmfulness on October 14, 2024

Oct 15, 2024, 02:22 PM

On October 14, 2024, the AI Safety Institute, in collaboration with GraySwanAI, announced the release of AgentHarm, a dataset designed to measure the harmfulness of large language model (LLM) agents. The benchmark focuses on harms specific to AI agents that have access to external tools, a gap in current safety evaluations. AgentHarm is described as comprehensive, reliable, and easy to run, which is intended to allow widespread use. The initiative highlights the need for robust safety mechanisms as LLM agents become more integrated with external systems. According to the announcement, jailbreaks transfer to LLM agents without degrading the agents' capabilities. The dataset is partly public.

Will Hyperbolic Labs release a new AI model by June 30, 2025?

Yes • 50%

No • 50%

Which major AI conference will first feature AgentHarm by end of 2025?

NeurIPS 2024 • 25%

ICML 2025 • 25%

AAAI 2025 • 25%

Other • 25%

Will Anthropic release a major update to Claude Artifacts by March 31, 2025?

Yes • 50%

No • 50%

What will be the next significant update from LAION regarding the dataset by the end of 2024?

Further data cleaning measures • 25%

Introduction of new ethical guidelines • 25%

Partnership with another organization • 25%

Other • 25%

Similar markets

Will AgentHarm be updated with new safety metrics by June 30, 2025?

Will AgentHarm be adopted by a major AI company for internal evaluations by March 31, 2025?

Will AgentHarm be a standard benchmark in AI safety research by end of 2025?

What will be the primary use case for AgentHarm by October 14, 2025?

Will Hyperbolic Labs release a new AI model by June 30, 2025?

Which major AI conference will first feature AgentHarm by end of 2025?

Will Anthropic release a major update to Claude Artifacts by March 31, 2025?

What will be the next significant update from LAION regarding the dataset by the end of 2024?

Will Scale AI release another significant update to PlanSearch by December 31, 2024?

Will Anthropic announce a new AI model as safe for public release after U.S. government evaluation by June 30, 2025?

Will Amazon's AI video generator receive a major update by March 31, 2025?

Will MIT's 'Future You' AI receive a major update by March 31, 2025?

Will AgentHarm be adopted as a standard benchmark by three major AI companies by March 31, 2025?

Will a significant vulnerability be discovered in AgentHarm's methodology by December 31, 2024?

First sector to report significant impact from AgentHarm by May 31, 2025?

Primary focus of next AI Safety Institute project by April 30, 2025?
