Voice Triggers

Module 15 - Speech-to-text automation with Whisper AI

No API Key Required for Speech Recognition

Whisper AI runs entirely in your browser using WebGPU acceleration. The ~40MB model downloads once and is cached locally. Your voice never leaves your device.
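To make this concrete, here is a minimal sketch of how an in-browser Whisper model can be loaded with Transformers.js. The @huggingface/transformers package name, the Xenova/whisper-tiny.en checkpoint id, and the option names are assumptions for illustration, not taken from this module.

```ts
import { pipeline } from "@huggingface/transformers";

// Sketch only: load a small English Whisper checkpoint in the browser.
// Transformers.js downloads the weights once and caches them locally,
// so later visits skip the download.
const transcriber = await pipeline(
  "automatic-speech-recognition",
  "Xenova/whisper-tiny.en", // assumed checkpoint id for the ~40MB tiny.en model
  { device: "webgpu" },     // request WebGPU acceleration where available
);
```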

🎤 Microphone Input

Click "Start Listening" to begin

📝 Live Transcripts

Transcriptions will appear here when you speak...

🧠 Whisper Model Status

Model not loaded. Click Start to load the Whisper model and begin voice recognition.

⚡ Voice Trigger Rules

How Triggers Work

When a transcript contains one of your trigger phrases, the associated action is logged to the reasoning console. Connect to OBS to switch scenes automatically (see the sketch below).

No trigger rules defined yet
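Here is a hedged sketch of how trigger rules could be checked against each transcript and forwarded to OBS. The TriggerRule shape, the connection details, and the use of the obs-websocket-js client are assumptions for illustration; only the "transcript contains trigger phrase, then log and act" behavior comes from the description above.

```ts
import OBSWebSocket from "obs-websocket-js";

// Hypothetical rule shape: a spoken phrase mapped to an OBS scene.
interface TriggerRule {
  phrase: string;
  sceneName: string;
}

const obs = new OBSWebSocket();
// Assumed connection details; obs-websocket v5 listens on port 4455 by default.
await obs.connect("ws://localhost:4455", "your-password");

async function applyTriggers(transcript: string, rules: TriggerRule[]): Promise<void> {
  const spoken = transcript.toLowerCase();
  for (const rule of rules) {
    if (spoken.includes(rule.phrase.toLowerCase())) {
      // Log the matched action, as the reasoning console does.
      console.log(`Trigger "${rule.phrase}" matched -> switching to scene "${rule.sceneName}"`);
      await obs.call("SetCurrentProgramScene", { sceneName: rule.sceneName });
    }
  }
}
```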

⚙️ Settings

About Whisper AI

OpenAI's Whisper is a state-of-the-art speech recognition model. This tool uses the "tiny.en" variant (~40MB) optimized for English, running via Transformers.js with WebGPU acceleration (falls back to WASM if needed).
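Tying the pieces together, the overall listen loop might look like the sketch below. The WebGPU check via navigator.gpu and the helper names (recordClip, applyTriggers) carry over from the earlier sketches and are assumptions, not this module's actual implementation.

```ts
import { pipeline } from "@huggingface/transformers";

// Prefer WebGPU when the browser exposes it; otherwise fall back to the WASM backend.
const device = "gpu" in navigator ? "webgpu" : "wasm";

const transcriber = await pipeline(
  "automatic-speech-recognition",
  "Xenova/whisper-tiny.en", // assumed checkpoint id for the tiny.en variant
  { device },
);

// Simple loop: record a clip, transcribe it, check trigger rules, repeat.
const rules = [{ phrase: "switch to gameplay", sceneName: "Gameplay" }]; // example rule
while (true) {
  const audio = await recordClip();
  const { text } = (await transcriber(audio)) as { text: string };
  console.log("Transcript:", text);
  await applyTriggers(text, rules);
}
```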