How to cache an image on the LLM side

Hey, I’m trying to “multithread” images to a VLM. What I’d like to do: grab a VideoFrame triggered by the user starting to speak, and immediately send it to the VLM, with a dummy prompt if necessary. Then in on_user_turn_completed, I send the exact same image “again”, this time with the actual user prompt appended — but now the VLM has the image cached. Is this achievable?

I think this should “just work”: if you send the same image again (byte-identical, in the same position in the prompt), the provider should recognize it as cached and the LLM should use its prompt cache. You can verify this via the prompt_cached_tokens field in the LLM metrics: https://docs.livekit.io/deploy/observability/data/#metrics
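To make the idea concrete, here’s a toy simulation of provider-side prompt caching — this is *not* the LiveKit or VLM provider API, just a sketch of the mechanism: the first send with the dummy prompt pays full price for the image tokens, and the second send with the real prompt hits the cache because the image bytes are identical. The class name, token count, and dict shape are all invented for illustration; the real signal is the `prompt_cached_tokens` metric mentioned above.

```python
import hashlib

class ToyPrefixCache:
    """Toy model of provider-side prompt caching: identical leading
    content (e.g. the same image bytes) counts as cached on reuse."""

    def __init__(self):
        self._seen: set[str] = set()

    def send(self, image_bytes: bytes, prompt: str) -> dict:
        # Providers key the cache on the exact content; a single changed
        # byte (re-encoding, resizing) would produce a cache miss.
        key = hashlib.sha256(image_bytes).hexdigest()
        cached = key in self._seen
        self._seen.add(key)
        # Pretend the image costs 1000 prompt tokens; the cached figure
        # is the analogue of prompt_cached_tokens in the LLM metrics.
        return {"prompt_cached_tokens": 1000 if cached else 0}

cache = ToyPrefixCache()
frame = b"\x89PNG...same bytes both times"  # captured once, reused as-is

warmup = cache.send(frame, "Describe this image.")        # dummy prompt
real = cache.send(frame, "What is the user pointing at?")  # actual turn

print(warmup["prompt_cached_tokens"])  # 0    — first send, nothing cached
print(real["prompt_cached_tokens"])    # 1000 — second send hits the cache
```

The key takeaway is that the frame must be reused exactly as captured: re-encode or crop it between the two sends and the cache key changes, so you’d pay for the image twice.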