Anthropic Launches Prompt Caching with 90% Cost and 80% Latency Reductions
Aug 14, 2024, 04:49 PM
Anthropic has introduced prompt caching, a new API feature currently available in beta, that significantly reduces the cost and latency of AI model responses. By storing and reusing context across requests, prompt caching can cut API input costs by up to 90% and reduce latency by up to 80%. The feature is particularly beneficial for applications built around long, static instructions, since that shared context no longer has to be reprocessed on every call. It is designed to improve the performance of large language model (LLM) applications and is expected to have a substantial impact on workloads such as Retrieval-Augmented Generation (RAG). Anthropic's pricing model charges for cache writes, and cached content has a five-minute lifetime that refreshes each time it is used. The feature supports Claude 3 Haiku, Claude 3 Opus, and Claude 3.5 Sonnet.
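For context, a prompt-caching request might look something like the sketch below, which calls the Anthropic Messages API directly with Python's requests library. The beta header value, model ID, and the placeholder manual text are assumptions drawn from Anthropic's launch documentation rather than details given in this story.

```python
# Minimal sketch of a prompt-caching request against the Anthropic Messages API.
# Assumptions: the beta header name, cache_control block format, and model ID
# follow Anthropic's launch docs; the "manual" text is an illustrative placeholder.
import os
import requests

API_URL = "https://api.anthropic.com/v1/messages"

headers = {
    "x-api-key": os.environ["ANTHROPIC_API_KEY"],
    "anthropic-version": "2023-06-01",
    # Beta flag enabling prompt caching at launch (assumed from Anthropic's docs).
    "anthropic-beta": "prompt-caching-2024-07-31",
    "content-type": "application/json",
}

payload = {
    "model": "claude-3-5-sonnet-20240620",
    "max_tokens": 1024,
    "system": [
        {
            "type": "text",
            "text": "You answer questions about the attached product manual.",
        },
        {
            "type": "text",
            # A long, static instruction or document goes here. Marking it
            # "ephemeral" asks the API to cache it for about five minutes,
            # so follow-up requests reuse it at the discounted cache-read rate.
            "text": "<very long product manual, pasted verbatim>",
            "cache_control": {"type": "ephemeral"},
        },
    ],
    "messages": [
        {"role": "user", "content": "How do I reset the device to factory settings?"}
    ],
}

response = requests.post(API_URL, headers=headers, json=payload, timeout=60)
response.raise_for_status()
body = response.json()

# The usage block reports cache activity: the first call pays for the cache
# write, while calls inside the five-minute window are billed as cache reads,
# which is where the up-to-90% input-cost reduction comes from.
print(body["usage"])
print(body["content"][0]["text"])
```

Because the cache lifetime refreshes on every hit, an application that keeps querying the same cached document within five-minute intervals effectively keeps paying the cheaper cache-read price for the bulk of its input tokens.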