Gesture Mappings
Thumbs Up
Thumbs Down
Open Palm
Auto-Describe
Interval
Current Scene
Click "Auto-Describe" to start AI scene analysis...
Activity Feed
No activity yet
Auto-Switch Enabled
Switching Rules
No rules configured
Current Scene
LIVE:
Not Connected
Last Trigger
No triggers yet
Whisper Model
Not loaded
First load downloads a ~40 MB model (cached for future use). Uses WebGPU acceleration when available.
Live Transcript
Start voice detection to see transcriptions...
Audio is processed in 5-second chunks. Speak clearly for best results.
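The 5-second chunking above can be sketched as a pure helper. This is a hypothetical illustration, not the app's actual code: `chunkAudio` is an assumed name, and mono 16 kHz PCM is assumed because that is Whisper's expected input rate.

```typescript
// Split a mono PCM buffer into fixed-length chunks for transcription.
// Assumptions: 16 kHz sample rate (Whisper's input rate), 5-second chunks
// as described in the UI note. The final chunk may be shorter.
function chunkAudio(
  samples: Float32Array,
  sampleRate = 16000,
  chunkSeconds = 5
): Float32Array[] {
  const chunkSize = sampleRate * chunkSeconds;
  const chunks: Float32Array[] = [];
  for (let i = 0; i < samples.length; i += chunkSize) {
    // subarray returns a view, so no audio data is copied
    chunks.push(samples.subarray(i, Math.min(i + chunkSize, samples.length)));
  }
  return chunks;
}
```

Each chunk would then be handed to the transcriber independently, which is why a phrase split across a chunk boundary may be missed.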
Voice Trigger Rules
How it works: Define phrases that trigger OBS actions. When Whisper transcribes matching text, the action fires automatically.
"switch to camera two"
"start recording"
"go to wide shot"
No voice triggers configured
Transcript History
No transcripts yet
Last Voice Trigger
No triggers yet
📚 Learning Notes
Whisper vs Web Speech API: Unlike browser speech recognition, Whisper runs the full neural network locally. It is more accurate and works offline once the model loads, but requires more processing power.
WebGPU Acceleration: Modern browsers with WebGPU support run Whisper ~10x faster using your GPU. Falls back to WebAssembly (CPU) automatically.
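The WebGPU-or-WebAssembly fallback described above boils down to feature detection. A minimal sketch, assuming a hypothetical `pickBackend` helper (runtimes such as transformers.js perform an equivalent check internally):

```typescript
// Prefer the GPU backend when the browser exposes the WebGPU entry point
// (navigator.gpu), otherwise fall back to the WebAssembly (CPU) backend.
type Backend = "webgpu" | "wasm";

function pickBackend(nav: { gpu?: unknown }): Backend {
  return nav.gpu !== undefined ? "webgpu" : "wasm";
}

// In the browser you would call: pickBackend(navigator)
```

The parameter is typed structurally rather than as `Navigator` so the check also runs outside a browser.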
Trigger Matching: Phrases are matched as substrings. "camera" will match both "switch to camera two" and "camera one, please". Be specific to avoid false triggers.
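The substring matching described above can be sketched as follows. `TriggerRule` and `matchTriggers` are hypothetical names for illustration; case-insensitive comparison is an assumption consistent with matching spoken transcripts.

```typescript
// A voice trigger rule: a phrase to listen for and the OBS action to fire.
interface TriggerRule {
  phrase: string;
  action: string;
}

// Return the actions of every rule whose phrase appears anywhere in the
// transcript (case-insensitive substring match, as the note describes).
function matchTriggers(transcript: string, rules: TriggerRule[]): string[] {
  const text = transcript.toLowerCase();
  return rules
    .filter((rule) => text.includes(rule.phrase.toLowerCase()))
    .map((rule) => rule.action);
}
```

Note that a broad phrase like "camera" fires on any transcript containing it, which is exactly why the note recommends specific phrases.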
Privacy: Audio is processed entirely in-browser. Nothing is sent to any server. The Whisper model runs on your machine.