Introduction
What is Protoface?
Protoface is a developer-facing realtime avatar API. You call
POST /v1/sessions, stream audio in, and a talking-head avatar speaks that
audio back as video over LiveKit (WebRTC).
Think “Tavus / Simli / HeyGen-style avatars, as an API.” You stay in control of your media transport — Protoface schedules each session onto a GPU worker that runs the avatar model and publishes video into a LiveKit room you own.
How it works
- Create a session
  Send POST /v1/sessions with an avatar_id and a transport config. The API responds in under 300 ms; at that point the session is queued, not yet running.
- A worker picks it up
  The control plane schedules your session onto an available GPU worker, which joins your LiveKit room and starts publishing a video track.
- Stream audio in, get video out
  You push audio into the room; the avatar speaks it back as a video track in realtime. The first frame typically lands within a few seconds.
- Observe and end
  Poll GET /v1/sessions/{id} for status and timing, read aggregate consumption from GET /v1/usage, and call POST /v1/sessions/{id}/end when you’re done.
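The steps above can be sketched as a small client loop. This is a minimal illustration, not the official SDK: the three endpoint paths and the avatar_id and transport request fields come from this page, while the base URL, the bearer-token header, the id and status response fields, and the "queued" status value are assumptions made for the example. The HTTP call is injected as a function so the flow can be exercised without a network.

```python
import json
import time
import urllib.request
from typing import Callable, Optional

BASE_URL = "https://api.protoface.example"  # hypothetical base URL

# A Send function takes (method, path, body) and returns parsed JSON.
Send = Callable[[str, str, Optional[dict]], dict]


def http_send(api_key: str) -> Send:
    """Build a Send function backed by urllib (bearer auth is an assumption)."""
    def send(method: str, path: str, body: Optional[dict] = None) -> dict:
        data = json.dumps(body).encode() if body is not None else None
        req = urllib.request.Request(
            BASE_URL + path,
            data=data,
            method=method,
            headers={
                "Authorization": f"Bearer {api_key}",
                "Content-Type": "application/json",
            },
        )
        with urllib.request.urlopen(req, timeout=10) as resp:
            return json.loads(resp.read() or b"{}")
    return send


def run_session(send: Send, avatar_id: str, transport: dict,
                poll_interval: float = 1.0, max_polls: int = 30) -> list:
    """Create a session, poll until it leaves 'queued', then end it.

    Returns the list of status values observed while polling.
    """
    session = send("POST", "/v1/sessions",
                   {"avatar_id": avatar_id, "transport": transport})
    session_id = session["id"]  # assumed response field
    seen = []
    try:
        for _ in range(max_polls):
            status = send("GET", f"/v1/sessions/{session_id}", None)
            seen.append(status.get("status"))  # assumed response field
            if status.get("status") != "queued":  # assumed status value
                break
            time.sleep(poll_interval)
        # A real application would push audio into the LiveKit room here.
    finally:
        # Always end the session, even if polling raised.
        send("POST", f"/v1/sessions/{session_id}/end", None)
    return seen
```

With a real key, `run_session(http_send(api_key), "av_demo", transport)` would drive the same flow over HTTP; the shape of the transport dict is defined by the Transports contract and is not shown here.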
Core concepts
- Session: a single realtime avatar run. The headline resource. See Sessions.
- Avatar: the face the session renders. v0 ships platform demo avatars (e.g. av_demo). See Avatars.
- Transport: how media flows. Today that’s LiveKit BYO (you own the room). See Transports.
- Usage: aggregated, read-only consumption per org. See Usage.
Bring your own LiveKit
Protoface is BYO transport: you own the LiveKit room and mint the worker token with your own LiveKit API key and secret. We never see or store your LiveKit credentials; the worker joins with a token you minted and scoped. This matches the universal convention across LiveKit Agents avatar plugins. See Transports for the full contract.
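To make "a token you minted and scoped" concrete, here is a rough sketch of minting a room-scoped LiveKit access token using only the Python standard library. LiveKit access tokens are HS256-signed JWTs whose video claim carries the room grants; in practice you would normally use the LiveKit server SDK rather than sign by hand. The function name, the worker identity, the ten-minute lifetime, and the exact set of grants are assumptions for illustration; the grants Protoface actually requires are specified in Transports.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def mint_worker_token(api_key: str, api_secret: str, room: str,
                      identity: str = "protoface-worker",  # hypothetical identity
                      ttl_seconds: int = 600, now: int = None) -> str:
    """Mint a short-lived token that can only join the given room."""
    issued = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {
        "iss": api_key,            # your LiveKit API key
        "sub": identity,           # participant identity for the worker
        "nbf": issued,
        "exp": issued + ttl_seconds,
        "video": {                 # grants scoped to a single room (assumed set)
            "room": room,
            "roomJoin": True,
            "canPublish": True,    # worker publishes the avatar video track
            "canSubscribe": True,  # worker receives the audio you push in
        },
    }
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(api_secret.encode("utf-8"),
                         signing_input.encode("ascii"),
                         hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)
```

Because the secret is used only on your side to sign the token, the worker receives a credential limited to one room and a short lifetime, which is the property this section describes.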
v0 status
Protoface is at v0: the full platform runs against a deterministic mock avatar runtime while the production face model is finalized. The API, session lifecycle, transports, limits, and SDKs are all stable and final; only the pixels are placeholder. See Known limitations for the exact caveats.