We’re building a voice agent using LiveKit Agents (Python) and ran into a token bloat problem when integrating MCP servers like Google Calendar.
The issue:
Every MCP tool's full schema/definition gets injected into the prompt on every LLM call. Latency is critical for a voice agent, so ideally we want to stay under ~2k tokens per request. But a single tool like "Create Event" from the Google Calendar MCP server adds ~3k tokens on its own, which immediately blows past that budget.
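For context, here's roughly how we're measuring the footprint. This is a minimal sketch: the schema below is a made-up stand-in for the real Calendar tool definition (the actual one is far larger), and ~4 characters/token is just the common rough heuristic, not an exact tokenizer count:

```python
import json

# Made-up stand-in for an MCP tool definition; the real Google Calendar
# "Create Event" schema is much bigger (nested objects, enums, long descriptions).
create_event_tool = {
    "name": "create_event",
    "description": "Create a calendar event with title, time, and attendees.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "summary": {"type": "string", "description": "Event title"},
            "start": {"type": "string", "description": "RFC3339 start time"},
            "end": {"type": "string", "description": "RFC3339 end time"},
            "attendees": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Attendee email addresses",
            },
        },
        "required": ["summary", "start", "end"],
    },
}

def estimate_tokens(obj) -> int:
    """Rough token estimate: ~4 characters per token for English/JSON."""
    return len(json.dumps(obj)) // 4

print(estimate_tokens(create_event_tool))
```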
What we tried:
We know about the allowed_tools filter for exposing only specific tools instead of the full MCP server. We tried it, but as noted above, even a single Google Calendar tool adds ~3k tokens, so filtering which tools are exposed doesn't solve the problem by itself.
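For concreteness, what we tried amounts to something like this client-side sketch (the tool dicts and the filter_tools helper are illustrative, not any SDK's actual API). The point is that each surviving tool still carries its complete schema:

```python
def filter_tools(tools: list[dict], allowed: set[str]) -> list[dict]:
    """Keep only tools whose names are in `allowed`.
    Each surviving tool still carries its full schema, which is why
    filtering alone doesn't shrink the per-request token footprint."""
    return [t for t in tools if t["name"] in allowed]

tools = [
    {"name": "create_event", "inputSchema": {"type": "object"}},
    {"name": "list_events", "inputSchema": {"type": "object"}},
    {"name": "delete_event", "inputSchema": {"type": "object"}},
]
print([t["name"] for t in filter_tools(tools, {"create_event"})])  # ['create_event']
```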
What we’re looking for:
- Is there a way to reduce or compress the tool schema that gets injected into the prompt?
- Has anyone written leaner custom wrappers around MCP tools to reduce token footprint?
- Is there a pattern others use to balance MCP tool availability vs. latency in voice agents?
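On questions 1 and 2, the direction we've been sketching is a hand-written slim schema that keeps only the fields our agent actually needs, plus a thin local wrapper that forwards calls to the real MCP tool. Everything here is illustrative: the schemas are made up, and `call_mcp_tool` is a hypothetical stand-in for however your framework invokes the MCP server:

```python
import json

def slim_schema(full: dict, keep: set[str]) -> dict:
    """Drop descriptions and unused parameters from a JSON Schema,
    keeping only the types of the fields in `keep`."""
    props = full.get("properties", {})
    return {
        "type": "object",
        "properties": {
            name: {"type": spec.get("type", "string")}
            for name, spec in props.items()
            if name in keep
        },
        "required": [r for r in full.get("required", []) if r in keep],
    }

# Made-up stand-in for the verbose upstream schema.
full = {
    "type": "object",
    "properties": {
        "summary": {"type": "string", "description": "Event title shown in the calendar UI"},
        "start": {"type": "string", "description": "RFC3339 start time, e.g. 2024-01-01T10:00:00Z"},
        "end": {"type": "string", "description": "RFC3339 end time"},
        "location": {"type": "string", "description": "Free-form location"},
        "attendees": {"type": "array", "items": {"type": "string"}, "description": "Emails"},
    },
    "required": ["summary", "start", "end"],
}

slim = slim_schema(full, keep={"summary", "start", "end"})
print(len(json.dumps(slim)) < len(json.dumps(full)))  # True

async def create_event(summary: str, start: str, end: str) -> dict:
    """Slim local tool the LLM sees; forwards to the real MCP tool.
    `call_mcp_tool` is hypothetical -- substitute your MCP client's call."""
    return await call_mcp_tool("create_event", {"summary": summary, "start": start, "end": end})
```

The trade-off is that the wrapper has to be kept in sync with the upstream tool by hand, but for a voice agent that only ever fills a handful of fields, dropping descriptions and unused parameters should recover most of the budget.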
Any advice or examples would be really helpful!