Setting elevenlabs TTS voice speed with LiveKit Inference

I am trying to configure the speed of an ElevenLabs voice using inference.TTS in Node. The TypeScript types document only the properties inactivity_timeout and apply_text_normalization, but the documentation (ElevenLabs TTS | LiveKit Documentation) suggests the provider-specific model-options list is incomplete: “Additional parameters to pass to the ElevenLabs TTS API, including [the two values actually in the TypeScript type]. See the provider’s documentation for more information.” Unfortunately, that provider-documentation link points vaguely at the entire ElevenLabs docs site rather than any specific API reference.

I searched the web and Slack for an answer. Slack turned up a thread in which the bot indicated that speed could be supplied in modelOptions. I have now tried this, and I am getting a surprisingly unhelpful error back from LiveKit Cloud: the entire serverEvent is { “type”: “error” }, with no further information to assist in debugging. Guess-and-check is quite slow, since I need to relaunch the worker after each edit and wait for the Cloud to stop dispatching jobs to the old worker.
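For concreteness, the configuration that triggers the error looks roughly like this (the shape of the options object is my assumption based on the Slack thread, not official docs, and the values are illustrative):

```typescript
// Sketch of the attempted configuration (shape assumed from the Slack
// thread). Passing a modelOptions object like this to inference.TTS is
// what produces the opaque { "type": "error" } server event.
const ttsOptions = {
  modelOptions: {
    // The two options actually present in the TypeScript types:
    inactivity_timeout: 60,
    apply_text_normalization: "auto",
    // The undocumented option the Slack bot suggested -- this is the
    // one that appears to trip server-side validation:
    speed: 1.1,
  },
};
```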

  • Are additional options actually supported, as the documentation suggests, or are only the ones in the TypeScript types supported, with the rest failing some sort of validation?
  • Is there some way to specify the voice speed with inference.TTS and ElevenLabs as the provider?

(On a side note, the Node API seems to have renamed the Pythonic extra_kwargs to modelOptions for the top-level config, but not within the fallback model config, where it’s extraKwargs.)

Edited to add: It looks like the documented modelOptions are drawn from the query parameters for the ElevenLabs WebSocket endpoint. The speed (among other options) is instead supposed to be provided in the voice_settings property of the initial message sent upon establishing the WebSocket connection. Supplying modelOptions.voice_settings.speed seems to have no impact on the generated speech, but it also doesn’t trigger an error, unlike supplying modelOptions.speed.
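To illustrate the split described above, here is a sketch of the two places options end up in the ElevenLabs stream-input WebSocket protocol (the URL, parameter names, and values are illustrative; verify against the ElevenLabs API reference):

```typescript
// 1. Query parameters on the WebSocket URL -- this is what LiveKit's
//    documented modelOptions appear to map to:
const wsUrl =
  "wss://api.elevenlabs.io/v1/text-to-speech/VOICE_ID/stream-input" +
  "?inactivity_timeout=60&apply_text_normalization=auto";

// 2. The initial message sent after the connection opens -- speed,
//    stability, and similarity_boost live in voice_settings here,
//    not in the query string:
const initialMessage = {
  text: " ",
  voice_settings: {
    stability: 0.5,
    similarity_boost: 0.75,
    speed: 1.1,
  },
};
```

This would explain the asymmetry: an unknown query parameter (modelOptions.speed) is rejected, while an unknown key like modelOptions.voice_settings is silently dropped rather than forwarded into the initial message.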

I don’t believe that field is currently exposed on the inference interface. I will raise it with the team. If it’s urgent, you could use the plugins and go directly to the provider instead.

Thanks for confirming. The docs make it sound like other flags might just be passed through (which would certainly be convenient).

A general voice_settings model option would be great, as I am also looking to customize similarity_boost and stability.

Sounds good. We tested this in Python and it worked fine, so we’re checking with the Agent JS team on it.

Once merged, this PR should fix it.
