How many challenges will the XBOW AI pentester solve in its next head-to-head competition by March 2025?
Less than 80 challenges • 25%
80-90 challenges • 25%
90-100 challenges • 25%
More than 100 challenges • 25%
Resolution source: official competition results published by GitHub or relevant cybersecurity organizations
XBOW AI Pentester Matches Human Experts, Achieves 85% Success in 28 Minutes
Aug 5, 2024, 02:25 PM
The team behind GitHub Copilot has launched XBOW, an AI-powered penetration tester that rivals human experts. Led by Oege de Moor, XBOW matched the performance of a pentester with 20 years of experience in just 28 minutes, achieving an 85% success rate in identifying vulnerabilities. The tool scored an unprecedented 75% on well-known web pentesting benchmarks from PentesterLab and PortSwigger. In a head-to-head competition, XBOW solved 88 of 104 challenges, equaling human experts who were given 40 hours. These results suggest AI can significantly accelerate cybersecurity tasks.
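For context, the story's figures are internally consistent: the sketch below (plain Python) checks the 88-of-104 arithmetic against the reported ~85% success rate and maps that result onto this market's answer buckets. The `bucket` helper and its boundary handling are illustrative assumptions, not part of the market's resolution rules.

```python
# A minimal sketch checking the story's arithmetic: 88 solved out of 104
# challenges is ~84.6%, consistent with the reported ~85% success rate.
solved, total = 88, 104
rate = solved / total
print(f"Success rate: {rate:.1%}")  # -> Success rate: 84.6%

# Hypothetical helper mapping a challenge count onto this market's buckets.
# Where exactly 80, 90, and 100 fall is an assumption here; the option
# labels above overlap at those boundary values.
def bucket(n: int) -> str:
    if n < 80:
        return "Less than 80 challenges"
    if n <= 90:
        return "80-90 challenges"
    if n <= 100:
        return "90-100 challenges"
    return "More than 100 challenges"

print(bucket(solved))  # -> 80-90 challenges (XBOW's previous result)
```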