Which on-device wake-word engine should you ship?
A wake word is the always-on phrase that brings your app to life — "Hey YourBrand" — detected right on the device so nothing streams to the cloud. This page compares the on-device options — Picovoice Porcupine, openWakeWord and the older baselines — in plain terms: how accurate they are, how they're licensed, and whether you can actually ship them on mobile. VoxRT's edge is simple: an MIT runtime and model, a native Android and iOS SDK, and a tiny always-on footprint. Built-in assistant wake words such as "Hey Siri" and "Hey Google" aren't included — third-party apps can't redistribute them as their own wake-word SDK.
What you get with VoxRT wake-word
Custom phrase tuning
Single words, brand names, or multi-word activations — tuned for your phrase, language, and target devices.
Designed for real-world audio
Tuned against hard negatives and noisy mobile conditions, with thresholds adjustable for your product.
Tunable threshold
Trade precision for recall with one confidence setting tuned to your product.
Battery-aware
Gated by on-device voice activity detection, so it sips power while idle.
Fully on-device
No microphone audio ever leaves the device. No network required — and no per-use fees.
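The gating and threshold ideas above can be sketched in a few lines. This is an illustration only, not VoxRT's API: `frame_rms`, `detect`, `vad_rms_floor` and `toy_scorer` are hypothetical names, and a plain energy floor stands in for a real voice activity detector. The only value taken from this page is the 0.9 default threshold.

```python
import math
from typing import Callable, List, Sequence

Frame = Sequence[float]


def frame_rms(frame: Frame) -> float:
    """Root-mean-square level of one frame of samples in [-1.0, 1.0]."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def detect(
    frames: Sequence[Frame],
    score_frame: Callable[[Frame], float],
    vad_rms_floor: float = 0.01,   # hypothetical gate level, not a VoxRT setting
    threshold: float = 0.9,        # the default threshold quoted on this page
) -> List[int]:
    """Return the indices of frames where the detector fires."""
    hits: List[int] = []
    for index, frame in enumerate(frames):
        # Gate: quiet frames never reach the (more expensive) scorer.
        if frame_rms(frame) < vad_rms_floor:
            continue
        # Single confidence setting: raise it for precision, lower it for recall.
        if score_frame(frame) >= threshold:
            hits.append(index)
    return hits


# Toy demo with a stand-in scorer that counts how often it is called.
scorer_calls = 0


def toy_scorer(frame: Frame) -> float:
    global scorer_calls
    scorer_calls += 1
    return 0.95 if max(abs(s) for s in frame) > 0.4 else 0.3


frames = [[0.0] * 160, [0.5] * 160, [0.1] * 160, [0.0] * 160]
hits = detect(frames, toy_scorer)
print(hits, scorer_calls)
```

In the toy run the two silent frames never reach the scorer, which is the point of gating: the always-on cost is dominated by the cheap check, and the threshold only matters for frames that pass it.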
The comparison table
Most wake-word vendors publish desktop or Raspberry Pi numbers. VoxRT publishes a measured mobile real-time factor on a Snapdragon 662 and an iPhone 13 Pro Max, the kind of hardware your users actually carry.
| Engine | Accuracy (vendor-reported) | Speed | Native mobile SDK | Custom keyword | License |
|---|---|---|---|---|---|
| VoxRT wake-word v0.1 | ROC AUC 0.9966 (a); P 0.993 / R 0.982 @ 0.9 | RTF 0.021 on Snapdragon 662 (A73) | ✓ Android + iOS | Any phrase / language, tuned on request | MIT runtime + weights |
| Picovoice Porcupine (closed commercial SDK) | 2.7% miss rate (b) at 1 FA / 10 hr, 10 dB SNR, 6-keyword avg | 0.6% CPU on RPi 5 (c); ~1.8% est. on SD662 | ✓ Android + iOS | Console (d); paid tier for commercial | Commercial; free plan = eval only |
| openWakeWord (modern OSS) | FA <0.5/hr, FR <5% target (e); "beats Porcupine" on a small set | Not published; 15–20 models in real time on one RPi 3 core | ✗ Python only (e) | Colab + TTS, ~1 hr | Weights CC-BY-NC-SA (f); code Apache-2.0 |
| Mycroft Precise (legacy baseline, 2019) | Not published | Not published | ✗ Linux / RPi only | Training docs | Apache-2.0 |
| Snowboy (legacy baseline, archived 2020) | 31.9% miss rate (b) | 3.8% CPU on RPi 5 (b) | ✗ discontinued | n/a | Archived (g) |
| PocketSphinx (legacy baseline, CMU-Sphinx) | 52.0% miss rate (b) | 12.1% CPU on RPi 5 (b) | ✗ unofficial ports | Phonetic, no training | BSD-style (g) |
RTF = real-time factor: the fraction of one CPU core needed to keep up with audio in real time; lower is better. Accuracy and speed figures are each vendor's own published numbers, measured on different phrases, datasets and hardware, so they are directionally indicative, not directly comparable (see the methodology note below).
Sources:
- (a) Internal VoxRT measurement on a held-out test set: 5,240 positive utterances + 6,416 hard negatives, speakers disjoint from training; default phrase "Hey Assistant", precision/recall quoted at the 0.9 default threshold.
- (b) Picovoice wake-word-benchmark: miss rate at 1 false alarm per 10 hours, 10 dB SNR, six keywords averaged (alexa, computer, jarvis, smart mirror, snowboy, view glass); runtime CPU on Raspberry Pi 5. Figures read from the benchmark's chart images.
- (c) Picovoice publishes 0.6% CPU on a Raspberry Pi 5; ~1.8% is a Geekbench-single-core-scaled estimate for a Snapdragon 662 (see the methodology note below). No first-party Android/iOS number is published.
- (d) Picovoice Console trains custom keywords; the Free Plan is evaluation-only and commercial deployment requires a paid tier with pricing gated behind sales.
- (e) openWakeWord README: a target false-accept rate <0.5/hr and false-reject rate <5%, and an "Alexa" model that beats Porcupine on a small test set the authors say to interpret cautiously; ships as a Python package (a community C++ port exists, not first-party).
- (f) openWakeWord code is Apache-2.0, but its pretrained models are CC-BY-NC-SA 4.0 (non-commercial), so shipping them in a paid app requires retraining your own model.
- (g) Snowboy (KITT.AI) was archived in 2020 and PocketSphinx is CMU-Sphinx-era; both appear as the OSS baselines in Picovoice's benchmark, not as current shipping options.
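As a rough illustration of the RTF definition above, the sketch below divides processing time by the duration of the audio processed. It assumes single-threaded processing, so wall-clock time approximates the share of one core; `real_time_factor` and `measure_rtf` are hypothetical helper names, not part of any vendor SDK, and this is not the procedure behind the published figures.

```python
import time
from typing import Callable


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time divided by the duration of the audio it covered."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def measure_rtf(
    process_frame: Callable[[int], None],
    num_frames: int,
    frame_seconds: float,
) -> float:
    """Time a per-frame callback and return its RTF.

    Assumes the callback runs on one thread, so elapsed wall-clock
    time is a fair proxy for time spent on a single core.
    """
    start = time.perf_counter()
    for index in range(num_frames):
        process_frame(index)
    elapsed = time.perf_counter() - start
    return real_time_factor(elapsed, num_frames * frame_seconds)


# 0.21 s of compute for 10 s of audio is an RTF of 0.021, about 2.1% of one core.
example = real_time_factor(0.21, 10.0)
print(f"RTF {example:.3f} = {example * 100:.1f}% of one core")
```

A measurement like this moves with scheduling and thermal state, so a single figure is only meaningful alongside the conditions it was taken under.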
The technical details
VoxRT's wake-word is a compact neural detector on a mobile-first Rust runtime with a stateless C ABI. It ships today as native Android (JitPack) and iOS (Swift Package) modules, with a published mobile real-time factor and a model small enough to target microcontrollers next. Below are the measurement conditions and the footprint that lands in your app.
The 0.021 real-time factor is the HIGH_PERF, affinity-pinned measurement. Default AUTO mode is built for normal app integration and varies with device scheduling and thermal state.
What it costs in your app
- Swift wrapper source (one file): ~7 KB
- Wake-phrase model (fp16): ~103 KB
- App-binary delta after extraction + dead-code elimination: 2–3 MB
The wake-phrase model itself is ~103 KB; the 2–3 MB app delta includes the extracted native runtime and packaging overhead.
How each one really compares
VoxRT
MIT · runtime + weights
VoxRT is the wake-word engine for teams that need a custom, on-device trigger inside a commercial product — without opaque licensing, a cloud dependency, or porting work. MIT runtime and MIT weights, a native Android and iOS SDK today, a ~103 KB model, and a measured mobile real-time factor of 0.021 on a Snapdragon 662. Custom phrases in any language are tuned per customer on request — a paid engagement rather than a self-serve console at v0.1. It wins where a buyer feels pain: license clarity, native mobile shipping, tiny footprint, and published mobile performance.
Picovoice Porcupine
Closed commercial SDK
Picovoice Porcupine is a closed commercial wake-word SDK with broad platform support, a self-serve keyword console, and published benchmark results on six built-in phrases. It doesn't publish cheap-tier Android or iPhone RTF, its model size isn't surfaced in the docs, and commercial custom-keyword deployment requires a paid plan. The useful comparison is practical: Porcupine publishes Raspberry Pi numbers; VoxRT publishes measured Android and iOS mobile RTF, with MIT weights and no commercial runtime lock-in.
openWakeWord
Code Apache-2.0 · weights CC-BY-NC-SA
The modern open-source option, with a clever pretrained-embedding backbone and easy ~1-hour custom training on Colab. The trap is the weights: the pretrained models are CC-BY-NC-SA 4.0 — non-commercial — so you can't ship them in a paid app without retraining your own from scratch. It's also Python-only, with no first-party native mobile SDK.
Legacy baselines
Mycroft Precise · Snowboy · PocketSphinx
Older OSS baselines kept in the table for context only — Mycroft Precise (dormant since 2019), Snowboy (archived 2020) and PocketSphinx (CMU-Sphinx-era) — none something you'd ship in a new product today.
Why the figures aren't apples-to-apples
We'd rather hand you the caveats than a tidy leaderboard. Every accuracy and speed figure above is vendor-self-reported, and no independent academic benchmark covers all of these engines on a common test set.
Accuracy is measured on different phrases. Our ROC AUC 0.9966 is on our custom "Hey Assistant" phrase against a set of hard negatives; Porcupine's 2.7% miss rate is averaged over six built-in keywords; openWakeWord's "beats Porcupine" claim is on its Alexa model on a small dataset its own authors say to read cautiously. Custom-keyword accuracy is generally lower than built-in accuracy, so don't read a custom number against a built-in one.
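To make the metrics in this note concrete, the sketch below computes precision and recall at a chosen threshold from two lists of detector scores. The scores are made-up toy values, not VoxRT or Porcupine data, and `precision_recall` is a hypothetical helper. Miss rate is simply 1 minus recall, so a recall of 0.982 corresponds to a 1.8% miss rate at that threshold, but a miss rate quoted at a fixed false-alarm rate is a different operating point and the two should not be compared directly.

```python
from typing import Sequence, Tuple


def precision_recall(
    positive_scores: Sequence[float],
    negative_scores: Sequence[float],
    threshold: float,
) -> Tuple[float, float]:
    """Precision and recall when every score >= threshold counts as a detection."""
    true_pos = sum(1 for s in positive_scores if s >= threshold)
    false_pos = sum(1 for s in negative_scores if s >= threshold)
    false_neg = len(positive_scores) - true_pos
    detections = true_pos + false_pos
    precision = true_pos / detections if detections else 1.0
    recall = true_pos / (true_pos + false_neg) if positive_scores else 0.0
    return precision, recall


# Toy scores for illustration only.
positives = [0.99, 0.95, 0.92, 0.85, 0.60]   # utterances of the wake phrase
negatives = [0.05, 0.20, 0.40, 0.91, 0.10]   # hard negatives

for threshold in (0.5, 0.9, 0.93):
    precision, recall = precision_recall(positives, negatives, threshold)
    miss_rate = 1.0 - recall
    print(f"threshold={threshold:.2f} precision={precision:.2f} "
          f"recall={recall:.2f} miss_rate={miss_rate:.2f}")
```

Raising the threshold in the toy data removes the one false accept but drops more true detections, which is the precision-for-recall trade a single confidence setting controls.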
Speed is the axis nobody but us publishes on cheap-tier Android. Picovoice reports 0.6% CPU on a Raspberry Pi 5; scaling by Geekbench single-core ratios (~3×) puts that near 1.8% on a Snapdragon 662, against our measured 2.1% — a gap well inside the device's thermal noise, so the honest read is a near-tie, not an advantage either way. Our 0.021 is a warm, affinity-pinned high-performance figure; a phone's default mode runs a little higher.
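The scaling arithmetic in the paragraph above is small enough to write out. The inputs are the figures quoted on this page (0.6% CPU, a ratio of roughly 3×, RTF 0.021); the linear scaling by a single-core benchmark ratio is a crude assumption, and `scale_cpu_percent` is a hypothetical helper name.

```python
def scale_cpu_percent(cpu_percent: float, single_core_ratio: float) -> float:
    """Estimate CPU share on a slower core by linear scaling (a rough assumption)."""
    return cpu_percent * single_core_ratio


porcupine_rpi5_percent = 0.6      # vendor-published, Raspberry Pi 5
rpi5_to_sd662_ratio = 3.0         # approximate Geekbench single-core ratio from the text
voxrt_rtf_sd662 = 0.021           # measured, HIGH_PERF and affinity-pinned

porcupine_sd662_estimate = scale_cpu_percent(porcupine_rpi5_percent, rpi5_to_sd662_ratio)
voxrt_sd662_percent = voxrt_rtf_sd662 * 100.0
gap_points = voxrt_sd662_percent - porcupine_sd662_estimate

print(f"Porcupine estimate on SD662: {porcupine_sd662_estimate:.1f}% of one core")
print(f"VoxRT measured on SD662:     {voxrt_sd662_percent:.1f}% of one core")
print(f"Gap: {gap_points:.1f} percentage points")
```

One side of this comparison is an estimate and the other a measurement under pinned conditions, which is why the result reads as a near-tie rather than a ranking.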
What's solid and comparable is the rest of the table: license terms, whether there's a first-party native mobile SDK, and whether a vendor publishes any mobile number at all.
On-device wake-word, answered
What is the best on-device wake-word engine?
There is no independent third-party benchmark that covers every on-device wake-word engine on a common test set, so any single "best" claim is vendor-self-reported. The practical choice comes down to license, whether there is a native mobile SDK, and published mobile performance. VoxRT ships an MIT runtime and MIT weights with native Android and iOS modules, a ROC AUC of 0.9966 on a hard held-out test set, and a measured real-time factor of 0.021 on a Snapdragon 662.
Can VoxRT detect a custom wake phrase?
Yes. The default reference phrase is "Hey Assistant", and custom phrases in any language are tuned per customer as a paid engagement. VoxRT does not offer a self-serve keyword-training console at v0.1 — custom phrases are delivered as a tuned model rather than generated in a UI.
What is a good real-time factor for a wake-word engine?
Real-time factor is the fraction of one CPU core needed to keep up with audio in real time, so lower is better. VoxRT measures a real-time factor of 0.021 — about 2.1% of one core — on a Snapdragon 662, and 0.015 on an iPhone 13 Pro Max, light enough to leave always-on. Picovoice publishes 0.6% CPU on a Raspberry Pi 5; scaled to a Snapdragon 662 that is roughly 1.8%, so the two are effectively a tie on raw speed.