DeepNewz Markets

Market

Will XBOW AI pentester achieve a perfect score on a web pentesting benchmark by June 2025?

GitHub Copilot•XBOW•Oege de Moor•PentesterLab•PortSwigger

Resolution / Starting Odds

Yes • 50%

No • 50%

Official benchmark results published by PentesterLab or PortSwigger

Story

XBOW AI Pentester Matches Human Experts, Achieves 85% Success in 28 Minutes

Aug 5, 2024, 02:25 PM

The team behind GitHub Copilot has launched XBOW, an AI-powered penetration tester built to rival human experts. Led by Oege de Moor, XBOW scored an unprecedented 75% on established web pentesting benchmarks from PentesterLab and PortSwigger. In a head-to-head competition, it solved 88 of 104 challenges (about 85%) in 28 minutes, matching the score of a pentester with 20 years of experience; the human experts in that competition were given 40 hours. The results suggest that AI can significantly accelerate cybersecurity tasks.

Similar markets

Sonnet 3.5 • 25%

Other • 25%

Top 10 in a benchmark • 25%

No significant milestone • 25%

Will Big Sleep AI discover another zero-day vulnerability by June 2025?

Yes • 50%

No • 50%

Will xAI surpass Google, OpenAI, and Anthropic in AI benchmarks by end of 2025?

xAI surpasses OpenAI in AI benchmarks by end of 2025?

Will xAI's Grok achieve a breakthrough in AI benchmarks by June 30, 2025?

Which AI model will achieve the highest score in the next Arena Hard benchmark test by Q1 2025?

Will xAI surpass OpenAI in AI capabilities by the end of 2025 according to independent benchmarks?

Will Google's AI Cyber Challenge team win the DARPA AI Cyber Challenge by the end of 2024?

Will Gemini 2.0 Flash Thinking remain top AI benchmark by June 30, 2025?

Will the Army's Cyber AI tool pilot be successfully completed by Nov 18, 2025?

Will Aim Security win a cybersecurity award by the end of 2024?

Will the top-scoring ARC-AGI entry use brute-force program search by November 10, 2024?

Will Pixtral 12B achieve a significant milestone in AI benchmarks by December 31, 2024?

Will a Fortune 500 company adopt XBOW AI pentester by March 2025?

Will XBOW AI pentester surpass 90% success rate by end of 2024?

How many challenges will XBOW AI pentester solve in the next head-to-head competition by March 2025?

What will be the adoption rate of XBOW AI pentester among top cybersecurity firms by June 2025?
