What will be Nemotron 70B's most significant achievement by end of 2024?
Adoption in top AI research labs • 25%
Integration in top tech companies' products • 25%
Winning a major AI benchmark contest • 25%
Other • 25%
Industry reports, academic papers, and AI conferences
Nvidia Quietly Releases Open-Source Nemotron 70B, Surpassing GPT-4o and Claude 3.5 Sonnet
Oct 16, 2024, 02:54 PM
Nvidia has quietly released an open-source AI language model, Nemotron 70B, a fine-tuned version of Llama 3.1 70B. On reported benchmark results, Nemotron 70B outperforms OpenAI's GPT-4o and Anthropic's Claude 3.5 Sonnet on three tests: Arena Hard, AlpacaEval 2 LC, and MT-Bench. On Arena Hard, Nemotron 70B scored 85.0, ahead of GPT-4o's 79.3 and Claude 3.5 Sonnet's 79.2. On AlpacaEval 2 LC, it scored 57.6, compared with GPT-4o's 57.5 and Claude 3.5 Sonnet's 52.4. On MT-Bench, it scored 8.98, ahead of GPT-4o's 8.74 and Claude 3.5 Sonnet's 8.81. The model was trained using Reinforcement Learning from Human Feedback (RLHF) with the REINFORCE algorithm and HelpSteer2-Preference prompts. Nvidia has made the instruct model, reward model, and training data available for free on Hugging Face.
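The size of each lead is easier to judge side by side. The short Python sketch below uses only the scores quoted in the story and computes Nemotron 70B's margin over the stronger of the two rivals on each benchmark. The names `SCORES`, `margin_over_best_rival`, and `MARGINS` are illustrative choices, not part of any Nvidia or benchmark tooling, and the sketch assumes that higher is better on all three tests.

```python
# Scores as quoted in the story; assumed here that higher is better on all three.
SCORES = {
    "Arena Hard": {
        "Nemotron 70B": 85.0, "GPT-4o": 79.3, "Claude 3.5 Sonnet": 79.2,
    },
    "AlpacaEval 2 LC": {
        "Nemotron 70B": 57.6, "GPT-4o": 57.5, "Claude 3.5 Sonnet": 52.4,
    },
    "MT-Bench": {
        "Nemotron 70B": 8.98, "GPT-4o": 8.74, "Claude 3.5 Sonnet": 8.81,
    },
}


def margin_over_best_rival(scores, model="Nemotron 70B"):
    """Return (best rival name, model score minus that rival's score).

    Hypothetical helper for illustration; the margin is rounded to two
    decimals to hide floating-point noise.
    """
    rivals = {name: score for name, score in scores.items() if name != model}
    best = max(rivals, key=rivals.get)
    return best, round(scores[model] - rivals[best], 2)


# Margin per benchmark, keyed by benchmark name.
MARGINS = {bench: margin_over_best_rival(s) for bench, s in SCORES.items()}

if __name__ == "__main__":
    for bench, (rival, margin) in MARGINS.items():
        print(f"{bench}: +{margin} over {rival}")
```

On these figures the Arena Hard lead is several points, while the AlpacaEval 2 LC lead over GPT-4o is a tenth of a point, so the three results do not carry equal weight.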