Scale AI Leaderboards Face Bias Criticism by October 2024?
Yes • 50%
No • 50%
Media reports, studies, or public statements from AI ethics experts
Scale AI Launches First LLM Leaderboards with Private Datasets and Paid Annotators
May 29, 2024, 05:41 PM
Scale AI has launched its first LLM Leaderboards, which rank AI model performance across specific domains. The initiative addresses significant problems in current evaluation methods, such as contaminated evaluation sets and inconsistent rater quality. The leaderboards rely on private datasets that models cannot be trained on and on paid annotators, to ensure fair, high-quality evaluations. Human expert evaluations are also part of the process. To protect leaderboard integrity, a model can be featured only the first time its organization encounters the prompts. The effort is seen as a crucial step toward improving the evaluation field by providing a trusted resource for assessing AI models. Experts have widely praised the move, highlighting its potential to enhance the integrity and utility of AI benchmarks.