Agent Pricing End to End

Hi Team,

I want to confirm the actual cost of my agent sessions on the Scale plan ($500/mo). Could you please help me clarify the following:

1. Agent Session Minutes: 50,000 minutes included, then $0.01 per minute. ✅ I understand this.

2. Concurrent Agent Sessions: 600 concurrent sessions included. ✅ I understand this.

3. Speaker Isolation: 10,000 minutes included. I am not sure whether this applies to my setup. Could you confirm where and when speaker isolation is applied?

4. Agent Session Recordings: 50,000 minutes included. Could you confirm that the Scale plan includes 50,000 minutes of recording for free and then charges $0.005 per minute after that?

5. WebRTC Minutes: 1.5M minutes included. I have 3 participants in a room: Agent, Candidate, and Egress. For a 10-minute session, does LiveKit calculate WebRTC usage as 3 participants × 10 minutes = 30 minutes?

6. Transcode Minutes: 8,000 minutes included. After 8,000 minutes, does LiveKit charge $0.004 per minute for audio-only transcoding?

7. Track Egress: 600 minutes included. I am not sure whether I am using this. Could you confirm what track egress is and whether it applies to my setup?


My Cost Estimate for 50,000 Agent Minutes:

Based on my understanding, here is how I calculate the cost if I run my agent for 50,000 minutes:

  1. Agent session minutes — 50,000 minutes included in the plan. No extra charge.

  2. Agent session recordings — 50,000 minutes included in the plan. No extra charge.

  3. WebRTC minutes — 3 participants × 50,000 minutes = 150,000 minutes. This is within the 1.5M minutes included. No extra charge.

  4. Transcode minutes — Each room has one egress recording in DUAL_CHANNEL_AGENT format, so total transcode usage = 50,000 minutes. The plan only includes 8,000 minutes, so I would need to pay for the remaining 42,000 minutes at $0.004 per minute = $168 extra.
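The estimate above can be sketched as a small Python calculation. This is only an illustration: the included amounts and overage rates are the figures quoted in this post, the names `INCLUDED`, `RATES`, and `estimate` are hypothetical, and it assumes one recording minute and one transcode minute per agent minute. No WebRTC overage rate is quoted here, so WebRTC is only checked against its included amount.

```python
from decimal import Decimal

# Included minutes on the Scale plan, as quoted in this post.
INCLUDED = {
    "agent_session": 50_000,
    "recording": 50_000,
    "webrtc": 1_500_000,
    "transcode": 8_000,
}

# Overage rates in USD per minute, as quoted in this post.
# WebRTC is omitted because no overage rate is given here.
RATES = {
    "agent_session": Decimal("0.01"),
    "recording": Decimal("0.005"),
    "transcode": Decimal("0.004"),
}


def estimate(agent_minutes: int, participants: int = 3) -> dict:
    """Estimate usage, overage minutes, and extra cost for a month."""
    usage = {
        "agent_session": agent_minutes,
        "recording": agent_minutes,                # assumption: every minute is recorded
        "webrtc": participants * agent_minutes,    # agent + candidate + egress
        "transcode": agent_minutes,                # one DUAL_CHANNEL_AGENT egress per room
    }
    # Minutes beyond what the plan includes, never negative.
    overage = {name: max(0, usage[name] - INCLUDED[name]) for name in usage}
    # Extra cost only for meters with a quoted rate.
    cost = {name: overage[name] * RATES[name] for name in RATES}
    total = sum(cost.values(), Decimal(0))
    return {"usage": usage, "overage": overage, "cost": cost, "total": total}


result = estimate(50_000)
print("WebRTC minutes used:", result["usage"]["webrtc"])
print("Transcode overage minutes:", result["overage"]["transcode"])
print("Extra cost in USD:", result["total"])
```

`Decimal` is used instead of floats so that per-minute rates such as 0.004 do not pick up rounding noise when multiplied by large minute counts.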

Could you confirm if my understanding is correct and let me know if I am missing anything?

Setup:

  1. I create a room.
  2. I attach an egress with DUAL_CHANNEL_AGENT.
  3. One participant (the candidate) joins the room, so together with the agent and the egress, at most 3 participants exist in a room.

That’s all correct. Some other points:

  • Speaker isolation relates to the ai-coustics Voice Focus model. For more information, search the “Noise & echo cancellation” page of the LiveKit documentation for the term ‘QUAIL_VF_L’.
  • You haven’t included any inference costs in your calculation, so presumably you are integrating with model providers using your own keys
  • You haven’t included ‘downstream data transfer’ in your calculations, but it’s likely you are within the 3TB included in the Scale plan
  • You haven’t included agent observability in your calculations, but I suspect you would be fine with the included 5M entries.

Also, just to be clear, the 50,000 minutes include every call that is initiated. For example, if the agent starts a call but the user doesn’t pick up, it still counts as 1 minute. Likewise, if a call ends after 10 seconds, it counts as 1 minute.
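A minimal sketch of that rounding, for anyone estimating usage from call logs. The one-minute minimum comes from the examples above; rounding longer calls up to the next whole minute is an assumption, and `billed_minutes` is a hypothetical helper, not part of any LiveKit SDK.

```python
import math


def billed_minutes(duration_seconds: float) -> int:
    """Return the agent session minutes counted for one initiated call."""
    # An initiated call counts as at least 1 minute, even if unanswered.
    # Assumption: longer calls are rounded up to the next whole minute.
    return max(1, math.ceil(duration_seconds / 60))


# Example durations in seconds: unanswered, 10 seconds, 10 minutes.
calls = [0, 10, 600]
print(sum(billed_minutes(seconds) for seconds in calls))
```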


Thanks for the clarification

Did you already get to try ai-coustics Voice Focus? Would love to hear your opinion

No, I haven’t tried it yet
