I want to create a simple Speech-to-Text agent that I can interact with on agents-playground, but I cannot get it to display the transcribed text in the “Chat” window, despite many hours of debugging with Claude. The funny thing is that I did get the following code to work once, so it seems the code does not work consistently. My code is below. Is there any way I can tweak it to make it work reliably?
Version info:
livekit 1.1.2
livekit-agents 1.4.3
livekit-api 1.1.0
livekit-protocol 1.1.2
agents-playground: 59c5300
#!/usr/bin/env python3
import logging
import os

from livekit import agents
from livekit.agents import AgentServer, AgentSession, Agent, room_io
from livekit.plugins import azure  # Azure Speech plugin, provides azure.STT

STT_GREETING = os.getenv("STT_GREETING", "STT agent is ready. Speak to transcribe.")


class Assistant(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions="You transcribe incoming audio using Azure Speech STT."
        )


server = AgentServer()


@server.rtc_session()
async def my_agent(ctx: agents.JobContext):
    # Configure STT with either endpoint or region
    session = AgentSession(
        stt=azure.STT(...),
    )
    await session.start(
        room=ctx.room,
        agent=Assistant(),
        room_options=room_io.RoomOptions(
            audio_input=True,
            audio_output=False,  # No TTS/audio output
            text_input=False,
            # Use TextOutputOptions instead of AudioOutputOptions
            text_output=room_io.TextOutputOptions(),
        ),
    )
    # Printed to the console only, not sent to the room
    if STT_GREETING:
        print(STT_GREETING)


if __name__ == "__main__":
    agents.cli.run_app(server)
I ran python script.py dev, and the transcription did appear in the console, just not in the Playground. Any advice?
Many thanks for your help!