Evaluate and Select Wake-Word Engine (#3)
# Ticket: Evaluate and Select Wake-Word Engine

## Ticket Information

- **ID**: TICKET-005
- **Title**: Evaluate and Select Wake-Word Engine
- **Type**: Research
- **Priority**: High
- **Status**: Backlog
- **Track**: Voice I/O
- **Milestone**: Milestone 1 - Survey & Architecture
- **Created**: 2024-01-XX

## Description

Evaluate wake-word detection options and select one:

- Compare openWakeWord and Porcupine for:
  - Hardware compatibility (Linux box/Pi/NUC)
  - Licensing requirements
  - Ability to train custom "Hey Atlas" wake-word
  - Performance and resource usage
  - False positive/negative characteristics

## Acceptance Criteria

- [ ] Comparison matrix of wake-word options
- [ ] Selected engine documented with rationale
- [ ] Hardware requirements documented
- [ ] Licensing considerations documented
- [ ] Decision recorded in architecture docs

## Technical Details

Options to evaluate:

- openWakeWord (open source, trainable)
- Porcupine (Picovoice, commercial)
- Other open-source alternatives

Considerations:

- Custom wake-word training capability
- Resource usage on target hardware
- Latency requirements
- Integration complexity

## Dependencies

- TICKET-004 (architecture) - helpful but not required
- Hardware availability for testing

## Related Files

- `docs/WAKE_WORD_EVALUATION.md` (to be created)
- `ARCHITECTURE.md`

Reviewed-on: #3
parent 4a0bfa773f
commit 3b8b8e7d35
@@ -78,8 +78,8 @@ The system consists of 5 parallel tracks:
 - **Languages**: Python (backend services), TypeScript/JavaScript (clients)
 - **LLM Servers**: Ollama, vLLM, or llama.cpp
 - **ASR**: faster-whisper or Whisper.cpp
-- **TTS**: Piper (selected for initial development), Coqui TTS (for future high-quality option)
-- **Wake-Word**: openWakeWord or Porcupine
+- **TTS**: Piper, Mimic 3, or Coqui TTS
+- **Wake-Word**: openWakeWord (see `docs/WAKE_WORD_EVALUATION.md` for details)
 - **Protocols**: MCP (Model Context Protocol), WebSocket, HTTP/gRPC
 - **Storage**: SQLite (memory, sessions), Markdown files (tasks, notes)
 - **Infrastructure**: Docker, systemd, Linux
27
docs/WAKE_WORD_EVALUATION.md
Normal file
@@ -0,0 +1,27 @@
# Wake-Word Engine Evaluation

This document outlines the evaluation of wake-word engines for the Atlas project, as described in TICKET-005.

## Comparison Matrix

| Feature                        | openWakeWord                                                      | Porcupine (Picovoice)                                                       |
| ------------------------------ | ----------------------------------------------------------------- | --------------------------------------------------------------------------- |
| **Licensing**                  | Apache 2.0 (Free for commercial use)                              | Commercial license required for most use cases, with a limited free tier.   |
| **Custom Wake-Word**           | Yes, supports training custom wake-words.                         | Yes, via the Picovoice Console, but limited in the free tier.               |
| **Hardware Compatibility**     | Runs on Linux, Raspberry Pi, etc. Models might be large for MCUs. | Wide platform support, including constrained hardware and microcontrollers. |
| **Performance/Resource Usage** | Good performance, can run on a single core of a Raspberry Pi 3.   | Highly optimized for low-resource environments.                             |
| **Accuracy**                   | Good accuracy, but some users report mixed results.               | Generally considered very accurate and reliable.                            |
| **Language Support**           | Primarily English.                                                | Supports multiple languages.                                                |
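
The Performance/Resource Usage row could be checked on the target hardware by timing per-frame inference against the duration of the audio processed. The following is a minimal sketch, not part of the evaluation itself. It assumes 16 kHz audio split into 80 ms frames (1280 samples), and `predict` is a hypothetical stand-in for whichever engine's per-frame call is being measured; `dummy_predict` is only a placeholder workload.

```python
import time
from typing import Callable, Sequence

SAMPLE_RATE_HZ = 16_000  # assumed input sample rate
FRAME_SAMPLES = 1_280    # assumed frame size: 80 ms at 16 kHz


def real_time_factor(predict: Callable[[Sequence[int]], object],
                     samples: Sequence[int]) -> float:
    """Return processing time divided by audio duration.

    A value below 1.0 means the engine keeps up with live audio.
    """
    n_frames = len(samples) // FRAME_SAMPLES
    if n_frames == 0:
        raise ValueError("need at least one full frame of audio")
    start = time.perf_counter()
    for i in range(n_frames):
        # Feed one frame at a time, as a streaming detector would receive it.
        predict(samples[i * FRAME_SAMPLES:(i + 1) * FRAME_SAMPLES])
    elapsed = time.perf_counter() - start
    audio_seconds = n_frames * FRAME_SAMPLES / SAMPLE_RATE_HZ
    return elapsed / audio_seconds


def dummy_predict(frame: Sequence[int]) -> float:
    # Hypothetical placeholder; swap in the real engine's per-frame call.
    return sum(frame) / len(frame)


silence = [0] * (SAMPLE_RATE_HZ * 2)  # two seconds of silent audio
rtf = real_time_factor(dummy_predict, silence)
print(f"real-time factor: {rtf:.4f}")
```

Running the same harness with each candidate engine on the Pi or NUC would give a comparable number for the matrix; any trailing partial frame is ignored.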

## Recommendation

Based on the comparison, **openWakeWord** is the recommended wake-word engine for the Atlas project.

**Rationale:**

- **Licensing:** The Apache 2.0 license allows for free commercial use, which is a significant advantage for the project.
- **Custom Wake-Word:** The ability to train a custom "Hey Atlas" wake-word is a key requirement, and openWakeWord provides this capability without the restrictions of a commercial license.
- **Hardware:** The target hardware (Linux box/Pi/NUC) is more than capable of running openWakeWord.
- **Performance:** While Porcupine may have a slight edge in performance on very constrained devices, openWakeWord's performance is sufficient for our needs.

The main risk with openWakeWord is the potential for lower accuracy compared to a commercial solution like Porcupine. However, given the open-source nature of the project, we can fine-tune the model and contribute improvements if needed. This aligns well with the project's overall philosophy.
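
The accuracy risk could be tracked with two numbers that correspond to the ticket's false positive/negative characteristics: the false-reject rate on clips that contain the wake-word, and false accepts per hour on audio that does not. The sketch below is one possible way to compute them from per-frame detection scores. The function names, the 0.5 threshold, and the 12-frame refractory window are assumptions for illustration, and the sample scores are made up rather than measured.

```python
from dataclasses import dataclass
from typing import Iterable, Sequence


@dataclass
class AccuracyReport:
    false_reject_rate: float       # missed wake-words / wake-words spoken
    false_accepts_per_hour: float  # spurious triggers per hour of negative audio


def count_triggers(scores: Iterable[float], threshold: float,
                   refractory_frames: int) -> int:
    """Count threshold crossings, skipping frames inside the refractory window."""
    triggers = 0
    cooldown = 0
    for score in scores:
        if cooldown > 0:
            cooldown -= 1
        elif score >= threshold:
            triggers += 1
            cooldown = refractory_frames
    return triggers


def evaluate(positive_clips: Sequence[Sequence[float]],
             negative_scores: Sequence[float],
             negative_hours: float,
             threshold: float = 0.5,
             refractory_frames: int = 12) -> AccuracyReport:
    """positive_clips: per-clip score lists, each with one spoken wake-word.
    negative_scores: per-frame scores over audio with no wake-word."""
    if not positive_clips or negative_hours <= 0:
        raise ValueError("need positive clips and a positive negative-audio duration")
    # A clip is missed when no frame in it reaches the threshold.
    missed = sum(1 for clip in positive_clips
                 if max(clip, default=0.0) < threshold)
    false_accepts = count_triggers(negative_scores, threshold, refractory_frames)
    return AccuracyReport(missed / len(positive_clips),
                          false_accepts / negative_hours)


# Illustrative, made-up scores (not measurements from either engine).
positive_clips = [[0.1, 0.9, 0.2], [0.2, 0.3, 0.1], [0.0, 0.7, 0.6], [0.6, 0.1, 0.0]]
negative_scores = [0.0, 0.8, 0.9, 0.1] + [0.0] * 20 + [0.7]
report = evaluate(positive_clips, negative_scores, negative_hours=0.5)
print(report)
```

Re-running this on the same recordings after fine-tuning the model would show whether a change actually reduced misses or spurious triggers.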