60-second multimodal triage
Talk to Pulse
A browser session opens a direct WebSocket from the user's device to the Gemini 3.1 Flash Live preview endpoint. The Pulse backend mints a single-use ephemeral token via @google/genai authTokens.create, with liveConnectConstraints locked to the AUDIO response modality and the system instruction; GOOGLE_API_KEY never leaves Cloud Run.
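A minimal sketch of the request the backend might hand to authTokens.create. The helper name buildTokenRequest, the TokenRequest interface, and the 1-minute and 5-minute expiry windows are assumptions made for illustration, not Pulse's actual values; the v1alpha API version reflects the SDK's ephemeral-token documentation as of this writing and may change.

```typescript
// Hypothetical local type describing the fields this sketch sets.
// It mirrors the ephemeral-token config of @google/genai but is not imported from it.
interface TokenRequest {
  config: {
    uses: number;
    expireTime: string;
    newSessionExpireTime: string;
    liveConnectConstraints: {
      model: string;
      config: { responseModalities: string[]; systemInstruction: string };
    };
    httpOptions: { apiVersion: string };
  };
}

// Assumed windows: the browser must open its session within 1 minute,
// and the token itself expires after 5 minutes.
const NEW_SESSION_WINDOW_MS = 60 * 1000;
const TOKEN_LIFETIME_MS = 5 * 60 * 1000;

function buildTokenRequest(systemInstruction: string, now: Date = new Date()): TokenRequest {
  const at = (offsetMs: number) => new Date(now.getTime() + offsetMs).toISOString();
  return {
    config: {
      uses: 1, // one-use token: a second connection attempt is rejected
      expireTime: at(TOKEN_LIFETIME_MS),
      newSessionExpireTime: at(NEW_SESSION_WINDOW_MS),
      liveConnectConstraints: {
        model: "gemini-3.1-flash-live-preview",
        // Locked server-side, so the browser cannot change modality or prompt.
        config: { responseModalities: ["AUDIO"], systemInstruction },
      },
      httpOptions: { apiVersion: "v1alpha" },
    },
  };
}
```

On Cloud Run this would be used roughly as `const token = await ai.authTokens.create(buildTokenRequest(prompt))`, where `ai` is a GoogleGenAI client built from GOOGLE_API_KEY; only the returned token name is sent to the browser.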
The mic is captured through an AudioWorklet and resampled from the device-native 48 kHz to 16 kHz mono PCM; JPEG frames are sent at 1 Hz from the same MediaStream. Gemini returns 24 kHz PCM, which we play through a queued AudioContext for gapless playback. Remote photoplethysmography (rPPG) runs in parallel on the same video stream, so the heart-rate estimate is ready by the time the model finishes speaking.
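The audio path above can be sketched as two small pure helpers, shown here under stated assumptions rather than as Pulse's actual code. downsampleToPcm16 uses simple box averaging as a crude low-pass (a production worklet might use a proper FIR filter), and PlaybackClock is a hypothetical name for the bookkeeping that keeps consecutive 24 kHz chunks back to back.

```typescript
// Downsample Float32 samples in [-1, 1] to 16-bit PCM at a lower rate
// by averaging each window of input samples into one output sample.
function downsampleToPcm16(
  input: Float32Array,
  inputRate: number,
  outputRate: number = 16000,
): Int16Array {
  if (inputRate < outputRate) throw new Error("upsampling is not supported");
  const ratio = inputRate / outputRate;
  const outLength = Math.floor(input.length / ratio);
  const out = new Int16Array(outLength);
  for (let i = 0; i < outLength; i++) {
    const start = Math.floor(i * ratio);
    const end = Math.min(input.length, Math.floor((i + 1) * ratio));
    let sum = 0;
    for (let j = start; j < end; j++) sum += input[j];
    // Clamp, then scale to the asymmetric int16 range.
    const sample = Math.max(-1, Math.min(1, sum / (end - start)));
    out[i] = sample < 0 ? sample * 0x8000 : sample * 0x7fff;
  }
  return out;
}

// Tracks when the next chunk should start so chunks play without gaps.
class PlaybackClock {
  private nextStart = 0;

  // currentTime: the AudioContext clock in seconds.
  // Returns the time at which this chunk should be started.
  schedule(currentTime: number, frames: number, sampleRate: number = 24000): number {
    // After an underrun, restart from "now" instead of scheduling in the past.
    const startAt = Math.max(this.nextStart, currentTime);
    this.nextStart = startAt + frames / sampleRate;
    return startAt;
  }
}
```

In the browser, each decoded chunk would be wrapped in an AudioBufferSourceNode and started with `source.start(clock.schedule(ctx.currentTime, buffer.length))`, so every chunk begins exactly where the previous one ends.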
gemini-3.1-flash-live-preview · audio-only response · 60s cap