I’ve got a Twilio phone number connected to an IVR phone tree, and a LiveKit Cloud account with one inbound trunk and one dispatch rule. The telephony section of my code returns a TwiML response that redirects the call to LiveKit SIP, based on this snippet (except using tls instead of tcp for transport).
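For readers following along, here is a rough sketch of the kind of TwiML redirect described above. It is not the original snippet (which is not reproduced in this thread): the helper name `build_sip_redirect_twiml`, the host `example.sip.livekit.cloud`, and the number are placeholders, and the only thing taken from the post is the use of `transport=tls` on the SIP URI.

```python
import xml.etree.ElementTree as ET


def build_sip_redirect_twiml(sip_host: str, called_number: str, transport: str = "tls") -> str:
    """Build a TwiML document that dials a SIP URI (hypothetical helper)."""
    response = ET.Element("Response")
    dial = ET.SubElement(response, "Dial")
    sip = ET.SubElement(dial, "Sip")
    # The transport parameter is appended to the SIP URI itself.
    sip.text = f"sip:{called_number}@{sip_host};transport={transport}"
    return ET.tostring(response, encoding="unicode")


# Placeholder host and number; substitute the SIP endpoint of your own project.
twiml = build_sip_redirect_twiml("example.sip.livekit.cloud", "+15550100")
```

The webhook handler would return this string as the body of its HTTP response to Twilio.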
Sometimes I am able to navigate through the phone tree and reach the agent on the other side, and other times I am not. The Cloud dashboard shows that it recognized all calls made, but only the successful ones are shown to be connected to rooms — despite the dispatch rule being set to Individual, and the expectation being that it’ll connect the caller to a room session.
I’ve got PCAP traces of both successful and unsuccessful tries, and nothing seems to be different apart from the fact that the failed calls stay at 180 Ringing indefinitely and are never answered. I’ve got a wildcard matcher with exactly one dispatch rule and exactly one inbound trunk, so the routing should all be going the same way.
The agent itself is idle and healthy, and sometimes ending an unsuccessful call and calling again makes the agent work. There are no “job received” messages from the agent’s side on failed calls. This is with Cloud, so I can’t check the server logs to see whether an agent dispatch signal was actually sent, but my hunch and the docs LLM agree that this might be an internal issue that should be escalated to support. Available to give call SIDs, timestamps, PCAPs privately! The LLM told me to mention I am not region pinning.
You should get a 200 response after the agent answers the call. Most commonly you will get an endless 180 ringing because the agent was not dispatched correctly, but if you are getting occasional 200 responses with an identical setup, I suspect something is wrong with your agent setup or initialisation.
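To compare traces along these lines, a small sketch like the one below can label each call by how its INVITE ended. It assumes the `(status_code, cseq_method)` pairs have already been pulled out of the PCAP by some other tool; that input format and the function name `classify_call` are illustrative assumptions, not part of any LiveKit or Twilio tooling.

```python
from typing import Iterable, Tuple


def classify_call(responses: Iterable[Tuple[int, str]]) -> str:
    """Classify one call from its SIP responses, given as (status, CSeq method)."""
    # Only responses to the INVITE say whether the call was answered;
    # a 200 OK for a CANCEL or BYE must not count as an answer.
    invite_codes = [code for code, method in responses if method == "INVITE"]
    if any(200 <= code < 300 for code in invite_codes):
        return "answered"
    # 487 Request Terminated just means the caller gave up and sent CANCEL.
    if any(code >= 300 and code != 487 for code in invite_codes):
        return "rejected"
    if any(code in (180, 183) for code in invite_codes):
        return "ringing_unanswered"
    return "no_progress"


ok_call = [(100, "INVITE"), (180, "INVITE"), (200, "INVITE")]
stuck_call = [(100, "INVITE"), (180, "INVITE"), (180, "INVITE"), (200, "CANCEL"), (487, "INVITE")]
labels = [classify_call(ok_call), classify_call(stuck_call)]
```

Calls labelled `ringing_unanswered` are the ones worth lining up against the agent logs by timestamp.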
Your dispatch rule currently routes to a self-hosted agent, not an agent hosted in LiveKit Cloud.
I might be looking at the wrong project, however, since all recent sessions in your account have an agent joining.
> Your dispatch rule currently routes to a self-hosted agent, not an agent hosted in LiveKit Cloud.
Yes — what I meant was if I was running livekit-server myself, I’d have access to those logs, to see if the central orchestrator is sending agent dispatch signals. (This is mostly irrelevant, just a curiosity.)
> I might be looking at the wrong project, however, since all recent sessions in your account have an agent joining.
Hmm, this definitely does not seem to be the case when I view the dashboard. Take SCL_wwN5dVrPd2WM, for example: the PCAP trace shows an INVITE and then endless ringing, and the agent-side logs did not show any “failed to join” message or anything similar; simply nothing happened.
There’s a good chance it might be an agent-side issue (maybe something incorrectly cleaned up or not settled in time?) but I thought to check in with you folks to see if maybe you could help from your end.
The “agent joined, then left 1s later” signature plus inconsistent success usually points at one of two things, both checkable without Cloud-side access:
A second worker is registered. In the session events @darryncampbell linked, check which worker dispatched each job. If failed and successful calls go to different workers, a stale process from a previous deploy is round-robining with the healthy one. Kill the orphan.
Entrypoint raises before your logging starts. If AgentSession.start() or plugin init throws, the framework cleans up and the participant leaves within ~1s. A try/except at the top of entrypoint() that logs the exception will surface it.
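Both checks can be covered with one small wrapper, sketched below under some assumptions: the agent's entrypoint is an async function taking a job context as its first argument, and the decorator name `log_entrypoint` and the logger name are made up for illustration. It logs the process ID and hostname when a job starts, which helps spot a second worker, and logs any exception before re-raising it.

```python
import asyncio
import functools
import logging
import os
import socket

logger = logging.getLogger("agent.entrypoint")  # hypothetical logger name


def log_entrypoint(fn):
    """Wrap an async entrypoint so job starts and early failures are logged."""
    @functools.wraps(fn)
    async def wrapper(ctx, *args, **kwargs):
        # PID and hostname identify which worker process picked up the job.
        logger.info("job started pid=%s host=%s", os.getpid(), socket.gethostname())
        try:
            return await fn(ctx, *args, **kwargs)
        except Exception:
            # Log the full traceback, then re-raise so the framework still cleans up.
            logger.exception("entrypoint failed pid=%s", os.getpid())
            raise
    return wrapper


@log_entrypoint
async def entrypoint(ctx):
    # Stand-in body; a real entrypoint would start the agent session here.
    raise RuntimeError("simulated plugin init failure")


def demo():
    """Run the stand-in entrypoint once; the failure is logged before it propagates."""
    try:
        asyncio.run(entrypoint(None))
    except RuntimeError:
        pass
```

If the "job started" lines for failed and successful calls show different PIDs or hosts, that points at the stale-worker case; an "entrypoint failed" traceback points at the second.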
A less nuclear option (for future reference) is to delete the LiveKit API keys, then create new ones, and add them to your agent.
Also, there is no substitute for testing through the SIP provider from an actual phone. But one way I like to test my trunk/agent, besides in a browser or on a phone, is to point this soft phone at my trunk directly to isolate possible provider issues from the rest of the system. Not sure it would have helped much in this case, but for future reference.