Hi all — I’m building a multi-tenant PSTN voice agent and want to validate whether my architecture is the right approach or if I’m missing something.
Scenario
I have one deployed agent (pstn-voice-agent) handling inbound SIP/PSTN calls. Each call can belong to a different account/business, and the agent should respond differently based on who’s being called.
How it works
 Inbound PSTN Call
         │
         ▼
┌─────────────────────┐
│ LiveKit SIP Trunk   │
│ Routes to Room      │
└────────┬────────────┘
         │
         ▼
┌─────────────────────┐
│ Single Agent        │
│ (pstn-voice-agent)  │
│                     │
│ 1. Extract SIP      │
│    attributes:      │
│    - callerPhone    │
│    - calledPhone    │
│    - trunkId        │
│    - sipCallId      │
│                     │
└────────┬────────────┘
         │
         ▼
┌─────────────────────┐
│ External API Call   │
│ (our backend)       │
│                     │
│ POST /call-config   │
│ → Returns:          │
│   - system prompt   │
│   - model name      │
│   - voice           │
│   - temperature     │
│   - start message   │
└────────┬────────────┘
         │
         ▼
┌─────────────────────┐
│ Create Session      │
│ with per-call       │
│ config              │
│                     │
│ - RealtimeModel     │
│   (dynamic prompt,  │
│    voice, model)    │
│ - Greet caller with │
│   account-specific  │
│   start message     │
└─────────────────────┘
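To make the flow above concrete, here is a rough sketch of the data that moves between the steps. It only covers extracting the SIP attributes, building the /call-config request body, and parsing the response; it does not touch the LiveKit SDK or the network. The attribute keys (sip.phoneNumber, sip.trunkPhoneNumber, sip.trunkID, sip.callID) are my understanding of what LiveKit sets on SIP participants and should be checked against the docs, and the JSON field names are placeholders for whatever our backend ends up using.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class CallAttributes:
    caller_phone: str
    called_phone: str
    trunk_id: str
    sip_call_id: str


@dataclass(frozen=True)
class CallConfig:
    system_prompt: str
    model: str
    voice: str
    temperature: float
    start_message: str


# Assumed SIP participant attribute keys; verify these against your deployment.
SIP_KEYS = {
    "caller_phone": "sip.phoneNumber",
    "called_phone": "sip.trunkPhoneNumber",
    "trunk_id": "sip.trunkID",
    "sip_call_id": "sip.callID",
}


def extract_call_attributes(attrs: Mapping[str, str]) -> CallAttributes:
    """Pull the four lookup fields out of the participant attributes."""
    missing = [key for key in SIP_KEYS.values() if not attrs.get(key)]
    if missing:
        raise ValueError(f"missing SIP attributes: {missing}")
    return CallAttributes(**{field: attrs[key] for field, key in SIP_KEYS.items()})


def build_config_request(call: CallAttributes) -> dict:
    """Body for POST /call-config (field names are hypothetical)."""
    return {
        "callerPhone": call.caller_phone,
        "calledPhone": call.called_phone,
        "trunkId": call.trunk_id,
        "sipCallId": call.sip_call_id,
    }


def parse_call_config(body: Mapping[str, Any]) -> CallConfig:
    """Validate the backend response before any session is created."""
    return CallConfig(
        system_prompt=str(body["systemPrompt"]),
        model=str(body["model"]),
        voice=str(body["voice"]),
        temperature=float(body["temperature"]),
        start_message=str(body["startMessage"]),
    )
```

Parsing into a typed CallConfig up front means a malformed backend response fails before the session is built rather than halfway through the greeting.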
Key design decisions
- Single agent, not one per account — the agent is stateless; all personalization comes from the external API response at call start
- Config fetched at runtime — prompt, voice, model, and greeting are all dynamic per call
- SIP attributes as the lookup key — calledPhone + callerPhone tell our backend which account and config to load
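For the lookup-key decision, this is roughly what I picture on the backend side: normalize the called number to one canonical form, then map it to an account. The table contents, account names, and function names here are made up for illustration, and the normalization assumes numbers already carry a country code and no extension parameters.

```python
import re
from typing import Optional

# Hypothetical in-memory stand-in for the backend's number-to-account table.
ACCOUNTS_BY_NUMBER = {
    "+15550100": "acme-dental",
    "+15550101": "bobs-plumbing",
}


def normalize_number(raw: str) -> str:
    """Strip URI scheme, host part, and formatting so lookups use one form."""
    number = raw.strip()
    if number.lower().startswith(("sip:", "tel:")):
        number = number[4:]
    number = number.split("@", 1)[0]  # drop "@host" from a SIP URI
    digits = re.sub(r"\D", "", number)  # keep digits only
    return f"+{digits}"


def resolve_account(called_phone: str) -> Optional[str]:
    """Return the account for a called number, or None if it is unknown."""
    return ACCOUNTS_BY_NUMBER.get(normalize_number(called_phone))
```

An unknown number returning None gives the agent a clear signal to use a generic config or reject the call, rather than guessing an account.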
What I want to validate
- Is this the correct pattern for multi-tenant voice agents on LiveKit, or should I be spinning up separate agents per account?
- Are there scalability concerns with a single agent handling all calls but with different configs?
- If the external API call fails or is slow, is disconnecting the room the right fallback, or is there a better pattern (e.g., a default fallback prompt)?
- Anything else I’m missing for production readiness?
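On the fallback question, the alternative to disconnecting that I'm considering looks roughly like this: bound the config fetch with a timeout, retry once, and fall back to a generic default config if it still fails. This is only a sketch using asyncio from the standard library; the fetch callable, the timeout and retry values, and the default prompt text are all placeholders.

```python
import asyncio
from typing import Awaitable, Callable

# Placeholder generic config used when the backend cannot be reached in time.
DEFAULT_CONFIG = {
    "systemPrompt": "You are a polite receptionist. Take a message and offer a callback.",
    "startMessage": "Thanks for calling. How can I help?",
}


async def load_config_with_fallback(
    fetch: Callable[[], Awaitable[dict]],
    timeout_s: float = 1.5,
    retries: int = 1,
) -> tuple[dict, bool]:
    """Return (config, used_fallback); never raises on timeout or connection errors."""
    for _ in range(retries + 1):
        try:
            # Bound each attempt so the caller is not left waiting in silence.
            config = await asyncio.wait_for(fetch(), timeout=timeout_s)
            return config, False
        except (asyncio.TimeoutError, ConnectionError):
            continue
    return dict(DEFAULT_CONFIG), True
```

The used_fallback flag is there so the call can be logged or flagged for follow-up, since the caller got a generic experience instead of the account's own.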
Any feedback appreciated!