How to cache an image on the LLM side

Hey, I’m trying to “multithread” images to a VLM. What I’d like to do: grab a VideoFrame triggered by the user starting to speak, and immediately send it to the VLM, with a dummy prompt if necessary. Then in on_user_turn_completed, I send the exact same image “again”, this time with the actual user prompt appended — but now the VLM has the image cached. Is this achievable?

I think this should “just work”: if you send the same image again (byte-identical, in the same position in the prompt), the provider should recognize it as cached and the LLM should use its prompt cache. You can verify this via the prompt_cached_tokens field in the LLM metrics: https://docs.livekit.io/deploy/observability/data/#metrics
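To make the idea concrete, here’s a toy simulation of provider-side prompt caching — this is *not* the LiveKit or VLM provider API, just a sketch of the mechanism: the first send with the dummy prompt pays full price for the image tokens, and the second send with the real prompt hits the cache because the image bytes are identical. The class name, token count, and dict shape are all invented for illustration; the real signal is the `prompt_cached_tokens` metric mentioned above.

```python
import hashlib

class ToyPrefixCache:
    """Toy model of provider-side prompt caching: identical leading
    content (e.g. the same image bytes) counts as cached on reuse."""

    def __init__(self):
        self._seen: set[str] = set()

    def send(self, image_bytes: bytes, prompt: str) -> dict:
        # Providers key the cache on the exact content; a single changed
        # byte (re-encoding, resizing) would produce a cache miss.
        key = hashlib.sha256(image_bytes).hexdigest()
        cached = key in self._seen
        self._seen.add(key)
        # Pretend the image costs 1000 prompt tokens; the cached figure
        # is the analogue of prompt_cached_tokens in the LLM metrics.
        return {"prompt_cached_tokens": 1000 if cached else 0}

cache = ToyPrefixCache()
frame = b"\x89PNG...same bytes both times"  # captured once, reused as-is

warmup = cache.send(frame, "Describe this image.")        # dummy prompt
real = cache.send(frame, "What is the user pointing at?")  # actual turn

print(warmup["prompt_cached_tokens"])  # 0    — first send, nothing cached
print(real["prompt_cached_tokens"])    # 1000 — second send hits the cache
```

The key takeaway is that the frame must be reused exactly as captured: re-encode or crop it between the two sends and the cache key changes, so you’d pay for the image twice.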