The AWS credentials we pass in the egress request are STS-derived. Per our IAM config the session should be valid for 24 hours, but we don't have visibility into the actual effective TTL of the credentials by the time Egress uses them.
What happened
A call lasted ~80 minutes. Egress recorded the audio. When Egress tried to upload to S3 at the end (CreateMultipartUpload), AWS returned ExpiredToken and the recording was lost.
Egress does not modify the TTL of the token, so I assume the effective TTL of the credentials was < 80m.
It’s not my area of expertise, but others have used assume_role_arn to avoid this kind of issue with long-running sessions. The setting has high-level documentation on the Egress API page of the LiveKit documentation, but you would need to contact support to enable it. I think this setting is only available on Scale and Enterprise.
This is a sample of how we trigger an egress request: we pass a session token generated at that time, based on the IAM policy attached to the pod. The IAM policy has a 24-hour TTL for tokens in our normal deployment.
For the assume_role_arn solution, can you please share how we can enable it and how to approach this? We are on the Scale plan; our project ID is p_3tqm7ro6kbs.
@Zaheer_Abbas, I think the root cause is that the session_token you pass is a static snapshot of your pod’s IRSA credentials taken at egress-request time. AWS web-identity (IRSA) sessions default to ~1 hour regardless of the role’s 24h MaxSessionDuration, unless the SDK explicitly requests longer. Egress holds that snapshot and can’t refresh it, so an 80-min call blows past the 1h expiry.
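To make the timing argument concrete, here is a minimal sketch of the check implied above: given when the credential snapshot was taken and how long the call runs, does the snapshot still exist when the upload starts? The function name, the 5-minute upload margin, and the timestamps are hypothetical; the 1-hour figure is the web-identity default mentioned above, and your effective TTL may differ.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the session was issued with the default 1-hour lifetime
# because no longer duration was requested explicitly.
ASSUMED_WEB_IDENTITY_TTL = timedelta(hours=1)


def snapshot_outlives_call(issued_at, call_duration,
                           ttl=ASSUMED_WEB_IDENTITY_TTL,
                           upload_margin=timedelta(minutes=5)):
    """Return True if a static credential snapshot is still valid when
    the upload would start (end of call) plus a safety margin."""
    expires_at = issued_at + ttl
    upload_starts_at = issued_at + call_duration
    return upload_starts_at + upload_margin <= expires_at


# Hypothetical timestamp for when the egress request was made.
issued = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)

print(snapshot_outlives_call(issued, timedelta(minutes=80)))  # False
print(snapshot_outlives_call(issued, timedelta(minutes=30)))  # True
```

With a 1-hour TTL, an 80-minute call fails the check while a 30-minute call passes, which matches the behaviour described in this thread.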
assume_role_arn fixes it structurally: instead of a static token, Egress makes an AssumeRole call at upload time to mint fresh credentials [ livekit/protocol/protobufs/livekit_egress.proto, S3Upload.assume_role_arn ]. The proto notes it’s “only available on accounts that have the feature enabled,” which is why Darryn routed you to support. You’d pass assume_role_arn + assume_role_external_id plus a base credential authorized to assume that role; support will confirm the exact trust setup on Scale (reference p_3tqm7ro6kbs).
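As a rough illustration of the difference between the two approaches, the sketch below builds the S3 section of an egress request both ways as plain dictionaries. The field names are taken from the S3Upload message referenced above and should be checked against the current proto; the helper names and all values are hypothetical placeholders, and the exact base credential and trust setup are something support would need to confirm.

```python
import json


def s3_with_static_token(access_key, secret, session_token, region, bucket):
    """Static snapshot: these values cannot be refreshed once handed over."""
    return {
        "access_key": access_key,
        "secret": secret,
        "session_token": session_token,
        "region": region,
        "bucket": bucket,
    }


def s3_with_assume_role(access_key, secret, role_arn, external_id,
                        region, bucket):
    """Role-based: base credentials plus the role to assume at upload time."""
    return {
        "access_key": access_key,  # base credential allowed to assume the role
        "secret": secret,
        "assume_role_arn": role_arn,
        "assume_role_external_id": external_id,
        "region": region,
        "bucket": bucket,
    }


# Placeholder values only; none of these identify a real account or role.
s3_section = s3_with_assume_role(
    access_key="BASE_ACCESS_KEY",
    secret="BASE_SECRET",
    role_arn="arn:aws:iam::123456789012:role/example-egress-upload",
    external_id="example-external-id",
    region="us-east-1",
    bucket="example-recordings",
)
print(json.dumps(s3_section, indent=2))
```

The role-based variant carries no session_token at all, so there is no snapshot that can expire mid-call.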
For the lost recording (EG_LqDrKVyspwm9): a failed upload isn’t recoverable through the Egress API, so ask support in the same ticket whether anything was retained server-side, though realistically that file is gone.
@darryncampbell - can you please confirm that we can use both the access token method and the assume-role ARN method within the same account? I don't mean for the same egress request; I'm asking whether we can use the access token method for one call and the ARN method for a different call. Does LiveKit Cloud allow that for a given project?
Yes, you should be able to use both methods for separate calls. However, I want to correct something I said previously: this feature is only available to Enterprise customers, not Scale. Apologies for that error.