Start the demo and say "Hey Assistant"
Your audio never leaves this page. The model highlights "Hey Assistant" and ignores all other speech, including similar-sounding words and phrases such as "Hey sister" or "Assist me".
"Hey Assistant"
Measure the real-time factor on this device
RTF (real-time factor) is processing time divided by audio duration. An RTF of 0.01 means one second of audio takes 10 ms to process, so keeping up with live speech uses about 1% of one CPU core. This benchmark generates 60 seconds of audio in your browser, pushes it through the wake-word engine as fast as your CPU allows, and reports the measured RTF.
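As a rough illustration of the measurement described above, here is a minimal sketch. It is not the demo's actual code: `processFrame` is a hypothetical stand-in for the wake-word engine, and the 16 kHz sample rate, 512-sample frame size, and low-level noise signal are assumptions chosen only for the example.

```typescript
// Hypothetical engine callback: consumes one frame of PCM samples.
type ProcessFrame = (frame: Float32Array) => void;

/** RTF = processing time / audio duration, with both expressed in seconds. */
function realTimeFactor(processingMs: number, audioSeconds: number): number {
  if (audioSeconds <= 0) throw new RangeError("audioSeconds must be positive");
  return processingMs / 1000 / audioSeconds;
}

/**
 * Generates `seconds` of synthetic audio, feeds it to `processFrame`
 * frame by frame as fast as possible, and returns the measured RTF.
 */
function measureRtf(
  processFrame: ProcessFrame,
  seconds = 60,
  sampleRate = 16000, // assumed sample rate
  frameSize = 512, // assumed frame size
  now: () => number = Date.now, // injectable millisecond clock
): number {
  const total = Math.floor(seconds * sampleRate);
  const audio = new Float32Array(total);
  // Assumption: quiet random noise stands in for the generated test audio.
  for (let i = 0; i < total; i++) audio[i] = (Math.random() - 0.5) * 0.1;

  let processedSamples = 0;
  const start = now();
  for (let off = 0; off + frameSize <= total; off += frameSize) {
    processFrame(audio.subarray(off, off + frameSize));
    processedSamples += frameSize;
  }
  const elapsedMs = now() - start;

  // Divide by the audio actually processed (a trailing partial frame is skipped).
  return realTimeFactor(elapsedMs, processedSamples / sampleRate);
}
```

In a browser, passing `() => performance.now()` as `now` gives finer timer resolution than the default `Date.now`. Multiplying the returned RTF by 100 gives the approximate percentage of one CPU core needed to keep up with live audio.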
For reference, VoxRT's published browser numbers for this same model, expressed as a percentage of one CPU core (RTF × 100): 0.16% on a MacBook Pro M4 (Chrome), 0.23% on an iPhone 13 Pro Max (Safari), and 1.17% on a Snapdragon 662 (Chrome on Android). Native NEON builds run faster still.
How it works: the ~170 KB WebAssembly runtime and ~100 KB model load from a CDN, then all inference runs on your CPU in this tab. Detection accuracy matches the native SDKs (ROC AUC 0.9966; precision 0.993 / recall 0.982 at the 0.90 threshold).
Requirements: a browser with WebAssembly SIMD128 support (Chrome/Edge 91+, Firefox 89+, or Safari 16.4+) and microphone permission.
Threshold: lower it for easier triggering (more false positives); raise it for stricter matching (more missed detections).
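The threshold trade-off can be illustrated with a small sketch. It assumes the engine emits a confidence score between 0 and 1 for each candidate and fires when the score is at or above the threshold; that comparison rule, the `Labeled` type, and the toy scores below are invented for the example and are not taken from the VoxRT SDK.

```typescript
// Hypothetical labeled sample: a confidence score plus the ground truth.
interface Labeled {
  score: number;
  isWakeWord: boolean;
}

/** Assumed decision rule: fire when the score reaches the threshold. */
function fires(score: number, threshold = 0.9): boolean {
  return score >= threshold;
}

/** Precision and recall of the decision rule over a labeled set. */
function precisionRecall(
  samples: Labeled[],
  threshold: number,
): { precision: number; recall: number } {
  let tp = 0; // wake word present, detector fired
  let fp = 0; // wake word absent, detector fired
  let fn = 0; // wake word present, detector stayed silent
  for (const s of samples) {
    const fired = fires(s.score, threshold);
    if (fired && s.isWakeWord) tp++;
    else if (fired) fp++;
    else if (s.isWakeWord) fn++;
  }
  return {
    precision: tp + fp === 0 ? 1 : tp / (tp + fp),
    recall: tp + fn === 0 ? 1 : tp / (tp + fn),
  };
}

// Made-up scores, for illustration only.
const toy: Labeled[] = [
  { score: 0.97, isWakeWord: true },
  { score: 0.93, isWakeWord: true },
  { score: 0.85, isWakeWord: true },
  { score: 0.91, isWakeWord: false }, // e.g. a near miss such as "Hey sister"
  { score: 0.6, isWakeWord: false },
  { score: 0.1, isWakeWord: false },
];

const loose = precisionRecall(toy, 0.5); // fires more often
const strict = precisionRecall(toy, 0.95); // fires less often
```

On this toy data, the 0.5 threshold catches every wake word but also fires on two non-wake-word samples, while the 0.95 threshold produces no false positives but misses two of the three wake words. Real score distributions will differ, so a threshold is best tuned against recordings from the target environment.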