🧪 Open-Source Team that maintains LMCache and Production Stack
🤖 Democratizing AI by providing efficient LLM serving for all · lmcache.ai · Joined September 2024
Join us at SIGCOMM 2025 (conferences.sigcomm.org/sigcomm/2025/t…) for our full-day LMCache Tutorial — an intelligent caching middleware that makes LLM inference faster & cheaper!
📅 Sept 8, 2025
8:45 AM – 6:00 PM (Portugal Time / WEST)
= 12:45 AM – 10:00 AM (PDT)
What you’ll learn:
🔹 KV-cache…
🚀 Exciting to see LMCache x Mooncake being discussed at the vLLM Shanghai Meetup!
The ecosystem around vLLM is evolving fast — from distributed inference to hardware optimizations — and cache innovations like this will be key to unlocking the next level of efficiency &…
Mark your calendars! Excited for the first FastAGI meetup featuring incredible speakers on AI infra & agents 🚀 Looking forward to the discussions and energy at LMCache Lab!
Fastest inference engine for LLMs!
LMCache is an LLM serving engine that reduces Time to First Token (TTFT) and increases throughput, especially in long-context scenarios.
100% Open Source
8 KV-Cache Systems You Can’t Afford to Miss in 2025
By 2025, KV-cache has evolved from a “nice-to-have” optimization into a critical layer for high-performance large language model (LLM) serving.
From GPU-resident paging tricks to persistent, cross-node cache sharing, the…
We're thrilled to share an integration between KServe and @_llm_d_, bringing powerful, scalable LLM serving to @kubernetesio.
Our @RedHatAI team is integrating llm-d, a Kubernetes-native distributed inference framework, into KServe. This is all about combining the best of both…
CacheGen (arxiv.org/abs/2310.07240) lets you store KV caches on disk or in AWS S3 and load them way faster than recomputing!
Modern LLMs use long contexts, but reprocessing these every time is slow and resource-intensive.
While engines like vLLM (and LMCache) can cache contexts in…
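The idea behind the CacheGen thread above — persist a long context's KV cache once, then reload it instead of re-running prefill — can be sketched with a toy on-disk store. This is a simplified illustration, not CacheGen's actual format (CacheGen additionally compresses the KV tensors into a compact bitstream); the `store_kv`/`load_kv` helpers and the hash-keyed layout are hypothetical:

```python
# Toy KV-cache persistence: key stored tensors by a hash of the prompt prefix,
# save once, reload on reuse. A cache miss means the engine must recompute.
import hashlib
import os
import tempfile
import numpy as np

def cache_path(cache_dir: str, prompt: str) -> str:
    # Hypothetical keying scheme: hash the full prompt prefix.
    digest = hashlib.sha256(prompt.encode()).hexdigest()
    return os.path.join(cache_dir, f"{digest}.npz")

def store_kv(cache_dir: str, prompt: str, keys: np.ndarray, values: np.ndarray) -> None:
    # Persist the precomputed KV tensors for this prompt prefix.
    np.savez(cache_path(cache_dir, prompt), keys=keys, values=values)

def load_kv(cache_dir: str, prompt: str):
    # Return (keys, values) on a hit, or None on a miss (caller falls back to prefill).
    path = cache_path(cache_dir, prompt)
    if not os.path.exists(path):
        return None
    data = np.load(path)
    return data["keys"], data["values"]

# Usage: store once, then later requests with the same shared context skip prefill.
tmp = tempfile.mkdtemp()
k = np.random.rand(2, 4, 8).astype(np.float32)  # toy shape: [layers, tokens, head_dim]
v = np.random.rand(2, 4, 8).astype(np.float32)
store_kv(tmp, "long shared context...", k, v)
loaded = load_kv(tmp, "long shared context...")
```

The same interface could sit in front of S3 instead of the local filesystem; the engine-facing contract (hit returns tensors, miss triggers recomputation) stays identical.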