What will be the primary application of Grass Network's UpvoteWeb-24-600M dataset by end of 2024?
Natural Language Processing (NLP) • 25%
Recommendation Systems • 25%
Content Moderation • 25%
Other • 25%
Reports from AI research publications, tech news outlets, and company press releases
Grass Network on Solana Open-Sources 600 Million Reddit Posts for AI Training
Jul 4, 2024, 03:51 AM
Grass Network, the data layer of AI on Solana, has open-sourced a dataset containing 600 million top Reddit posts and comments from 2024. The dataset, named UpvoteWeb-24-600M, includes media links and reply lineage, and has been anonymized to preserve user privacy. The data was gathered by 2 million nodes globally in just one week. The release aims to make AI training more accessible to developers, leveling the playing field against centralized model training sets. It marks a significant milestone for the Grass ecosystem and the broader AI community.