MIT Researchers Introduce ContextCite, LongCite-8B and LongCite-9B for Enhanced Language Model Attribution
Sep 4, 2024, 09:53 PM
Researchers from MIT, including Benjamin Cohen-Wang, Harshay Shah, Kristian Georgiev, and Aleksander Madry, have introduced a new method called ContextCite, aimed at attributing a language model's generations to specific parts of the provided context. The method learns a surrogate model that approximates how the language model's response changes when each part of the context is included or excluded. This addresses the challenge of producing fine-grained in-line citations in long-context settings, where current long-context language models struggle. The work also synthesizes a large-scale supervised fine-tuning (SFT) dataset with off-the-shelf language models to improve citation generation in long-context question answering (QA). The resulting models, LongCite-8B and LongCite-9B, are reported to outperform GPT-4o in citation F1 score by 6.4% and 3.6%, respectively, to offer roughly 2x finer citation granularity than proprietary models, and to improve response correctness by 7-9%.
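The surrogate idea can be illustrated with a short sketch. Assuming the method fits a sparse linear model over random include/exclude ablations of the context sources, scoring how well the model's original response is supported by each ablated context, a minimal version might look like the hypothetical Python below. The function names and the toy `score_response` placeholder are illustrative assumptions, not the authors' released API.

```python
# Minimal sketch of surrogate-based context attribution.
# Assumption: a sparse linear surrogate is fit on random ablation masks of the
# context sources, predicting a score for the model's original response.
import numpy as np
from sklearn.linear_model import Lasso


def score_response(context_sources, included_mask):
    """Placeholder for a language-model call: in a real setting this would
    return, e.g., the log-probability the model assigns to the original
    response given only the included context sources. A toy deterministic
    scorer is used here so the sketch runs on its own."""
    rng = np.random.default_rng(0)
    weights = rng.normal(size=len(context_sources))
    return float(included_mask @ weights)


def surrogate_attribution(context_sources, n_ablations=64, alpha=0.01, seed=0):
    """Estimate per-source attribution scores via a sparse linear surrogate."""
    rng = np.random.default_rng(seed)
    d = len(context_sources)
    # Random ablation masks: 1 = source included, 0 = source excluded.
    masks = rng.integers(0, 2, size=(n_ablations, d)).astype(float)
    scores = np.array([score_response(context_sources, m) for m in masks])
    # Fit score ≈ masks @ w + b; larger w[i] means source i matters more.
    surrogate = Lasso(alpha=alpha).fit(masks, scores)
    return surrogate.coef_


if __name__ == "__main__":
    sources = ["sentence 1 ...", "sentence 2 ...", "sentence 3 ..."]
    print(surrogate_attribution(sources))
```

The appeal of this design is that attribution requires only black-box queries to the language model (one per ablation mask), rather than access to gradients or internal activations.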
Markets
- Yes • 50% / No • 50% (resolution source: peer-reviewed journal publications and citation databases such as Google Scholar or PubMed)
- No • 50% / Yes • 50% (resolution source: official announcements from major commercial language model platforms such as OpenAI, Google, or Microsoft)
- Yes • 50% / No • 50% (resolution source: independent benchmarks and research papers comparing language models)
- MIT • 25% / Other • 25% / Microsoft • 25% / Google • 25% (resolution source: peer-reviewed journal publications and citation databases such as Google Scholar or PubMed)
- Other • 25% / OpenAI • 25% / Google • 25% / Microsoft • 25% (resolution source: official announcements from major tech companies)
- LongCite-9B • 25% / GPT-4o • 25% / ContextCite • 25% / LongCite-8B • 25% (resolution source: results from major independent benchmarks and research papers)