Loading...
Loading...
Browse all stories on DeepNewz
VisitWhich AI company will first integrate AgentHarm into safety evaluations by July 31, 2025?
OpenAI • 25%
Google DeepMind • 25%
Anthropic • 25%
Other • 25%
Official announcements or press releases from AI companies
AI Safety Institute Releases AgentHarm to Measure LLM Agent Harmfulness on October 14, 2024
Oct 15, 2024, 02:22 PM
The AI Safety Institute, in collaboration with GraySwanAI, has announced the release of AgentHarm, a novel dataset designed to measure the harmfulness of large language model (LLM) agents. This benchmark focuses on unique harms from AI agents with access to external tools, addressing a critical gap in current safety evaluations. Announced on October 14, 2024, AgentHarm is comprehensive, reliable, and easy to run, allowing for widespread use. The initiative highlights the need for robust safety mechanisms as LLM agents become more integrated with external systems. Jailbreaking transfers to LLM agents without degrading capabilities, and the dataset is partly public.
View original story
Yes • 50%
No • 50%
United States • 25%
European Union • 25%
China • 25%
Other • 25%
Google • 25%
Microsoft • 25%
OpenAI • 25%
Other • 25%
NeurIPS 2024 • 25%
ICML 2025 • 25%
AAAI 2025 • 25%
Other • 25%
OpenAI • 50%
Anthropic • 50%
Google • 25%
OpenAI • 25%
Meta • 25%
Other • 25%
Apple • 25%
OpenAI • 25%
Amazon • 25%
Alphabet • 25%
Yes • 50%
No • 50%
Google • 25%
Microsoft • 25%
Meta • 25%
None of the above • 25%
Salesforce • 25%
Microsoft • 25%
Oracle • 25%
IBM • 25%
Yes • 50%
No • 50%
No • 50%
Yes • 50%
Technology • 25%
Healthcare • 25%
Other • 25%
Finance • 25%