Prewarm connections (LLM, TTS) when not using welcome message

Hi,

We have use cases where we initiate outbound calls from agents to users. This happens without a welcome message, because in a real-world scenario the person who picks up the phone speaks first and the caller speaks second.

So what happens in our case:

  1. Agent calls
  2. User picks up
  3. User says: “hello this is John”
  4. Agent speaks: “hi this is assistant blablabla”

The issue we faced is that the LLM and TTS connections were not warmed up, so the gap between steps 3 and 4 takes up to 3.5 seconds in the worst cases.

What usually happens with the on_enter approach and a welcome message, when the user calls the agent: the welcome message is spoken, so the LLM and TTS connections are already set up and ready.

So for this use case we have built a function for our outbound calls that prewarms the TTS and LLM when the session starts, without a welcome message. I have not seen anything standard out of the box in LiveKit Agents for this use case, but it might be beneficial to include something like this in the framework in the near future.

Through this approach we have reduced the time from step 3 to step 4 from 3.5s to 1.2s-1.4s.
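
To illustrate the idea only, here is a minimal, self-contained sketch of prewarming. It is not our production function and it does not use the LiveKit API: `FakeStreamingClient`, `prewarm`, and `first_reply_latency` are hypothetical names, and the 0.2 s connect delay is a made-up stand-in for the real handshake cost. The point it shows is that opening both connections concurrently at session start moves the connection cost out of the gap between step 3 and step 4.

```python
import asyncio
import time


class FakeStreamingClient:
    """Hypothetical stand-in for an LLM or TTS client with a costly first connect."""

    def __init__(self, name: str, connect_delay: float) -> None:
        self.name = name
        self.connect_delay = connect_delay  # assumed handshake cost, in seconds
        self.connected = False

    async def connect(self) -> None:
        # Simulates the DNS/TLS/WebSocket handshake; a no-op once connected.
        if not self.connected:
            await asyncio.sleep(self.connect_delay)
            self.connected = True

    async def request(self, payload: str) -> str:
        # A cold client pays the handshake cost here, on the first real request.
        await self.connect()
        return f"{self.name}:{payload}"


async def prewarm(*clients: FakeStreamingClient) -> None:
    # Open all connections concurrently, e.g. while the phone is still ringing.
    await asyncio.gather(*(client.connect() for client in clients))


async def first_reply_latency(warm: bool) -> float:
    """Return the simulated time between the user's first words and the reply."""
    llm = FakeStreamingClient("llm", connect_delay=0.2)
    tts = FakeStreamingClient("tts", connect_delay=0.2)
    if warm:
        await prewarm(llm, tts)  # happens before the user speaks (step 3)
    start = time.perf_counter()
    text = await llm.request("hello this is John")  # step 3 -> LLM
    await tts.request(text)                         # LLM -> TTS (step 4)
    return time.perf_counter() - start


if __name__ == "__main__":
    cold = asyncio.run(first_reply_latency(warm=False))
    warm = asyncio.run(first_reply_latency(warm=True))
    print(f"cold: {cold:.2f}s, prewarmed: {warm:.2f}s")
```

In a real agent the same call would sit wherever the session is started for the outbound call. Whether the LLM and TTS plugins you use expose an equivalent warm-up hook is something to check in their documentation.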

Have you tried setting preemptive_generation=True?

@CWilson I have, but it isn’t sufficient. With preemptive generation you gain a little advantage, but not enough.

Another option is to use session.say() rather than rely on the LLM to generate the initial greeting: Agent speech and audio | LiveKit Documentation

Hi @darryncampbell, we prefer to keep the dynamic part: we generate the first response based on the agent instructions and want to keep that in this approach as well.

We also found a way to do this that goes beyond just prewarming the TTS and LLM connections. That helps as well.
