✅ TICKET-006: Wake-word Detection Service
- Implemented wake-word detection using openWakeWord
- HTTP/WebSocket server on port 8002
- Real-time detection with configurable threshold
- Event emission for ASR integration
- Location: home-voice-agent/wake-word/

✅ TICKET-010: ASR Service
- Implemented ASR using faster-whisper
- HTTP endpoint for file transcription
- WebSocket endpoint for streaming transcription
- Support for multiple audio formats
- Auto language detection
- GPU acceleration support
- Location: home-voice-agent/asr/

✅ TICKET-014: TTS Service
- Implemented TTS using Piper
- HTTP endpoint for text-to-speech synthesis
- Low-latency processing (< 500ms)
- Multiple voice support
- WAV audio output
- Location: home-voice-agent/tts/

✅ TICKET-047: Updated Hardware Purchases
- Marked Pi5 kit, SSD, microphone, and speakers as purchased
- Updated progress log with purchase status

📚 Documentation:
- Added VOICE_SERVICES_README.md with complete testing guide
- Each service includes README.md with usage instructions
- All services ready for Pi5 deployment

🧪 Testing:
- Created test files for each service
- All imports validated
- FastAPI apps created successfully
- Code passes syntax validation

🚀 Ready for:
- Pi5 deployment
- End-to-end voice flow testing
- Integration with MCP server

Files Added:
- wake-word/detector.py
- wake-word/server.py
- wake-word/requirements.txt
- wake-word/README.md
- wake-word/test_detector.py
- asr/service.py
- asr/server.py
- asr/requirements.txt
- asr/README.md
- asr/test_service.py
- tts/service.py
- tts/server.py
- tts/requirements.txt
- tts/README.md
- tts/test_service.py
- VOICE_SERVICES_README.md

Files Modified:
- tickets/done/TICKET-047_hardware-purchases.md

Files Moved:
- tickets/backlog/TICKET-006_prototype-wake-word-node.md → tickets/done/
- tickets/backlog/TICKET-010_streaming-asr-service.md → tickets/done/
- tickets/backlog/TICKET-014_tts-service.md → tickets/done/

Home Voice Agent
Main mono-repo for the Atlas voice agent system.
🚀 Quick Start
Get started in 5 minutes: See QUICK_START.md
Test the system: Run ./test_all.sh or ./run_tests.sh
Configure environment: See ENV_CONFIG.md
Testing guide: See TESTING.md
Test coverage: See TEST_COVERAGE.md
Improvements & next steps: See IMPROVEMENTS_AND_NEXT_STEPS.md
Project Structure
home-voice-agent/
├── llm-servers/ # LLM inference servers
│ ├── 4080/ # Work agent (Llama 3.1 70B Q4)
│ └── 1050/ # Family agent (Phi-3 Mini 3.8B Q4)
├── mcp-server/ # MCP tool server (JSON-RPC 2.0)
├── wake-word/ # Wake-word detection node
├── asr/ # ASR service (faster-whisper)
├── tts/ # TTS service
├── clients/ # Front-end applications
│ ├── phone/ # Phone PWA
│ └── web-dashboard/ # Web dashboard
├── routing/ # LLM routing layer
├── conversation/ # Conversation management
├── memory/ # Long-term memory
├── safety/ # Safety and boundary enforcement
├── admin/ # Admin tools
└── infrastructure/ # Deployment scripts, Dockerfiles
Quick Start
1. MCP Server
cd mcp-server
pip install -r requirements.txt
python server/mcp_server.py
# Server runs on http://localhost:8000
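Once the server is up, clients talk to it with JSON-RPC 2.0 messages. The following is a minimal Python sketch of building and sending such a request, not the project's own client. The `tools/list` method name comes from the MCP specification; the `/mcp` endpoint path and all helper names are assumptions for illustration, so check the mcp-server README for the actual route.

```python
import json
import urllib.request
from itertools import count

# Base URL is from this README; the "/mcp" path is a hypothetical placeholder.
MCP_URL = "http://localhost:8000/mcp"

# JSON-RPC requests carry an id so responses can be matched to them.
_request_ids = count(1)


def build_request(method, params=None):
    """Build a JSON-RPC 2.0 request object as a plain dict."""
    request = {"jsonrpc": "2.0", "id": next(_request_ids), "method": method}
    if params is not None:
        request["params"] = params
    return request


def send_request(request, url=MCP_URL, timeout=5.0):
    """POST the request as JSON and return the decoded reply (needs a running server)."""
    data = json.dumps(request).encode("utf-8")
    http_request = urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(http_request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def extract_result(response):
    """Return the "result" member, or raise if the reply carries an "error" member."""
    if "error" in response:
        error = response["error"]
        raise RuntimeError(f"JSON-RPC error {error.get('code')}: {error.get('message')}")
    return response["result"]
```

For example, `extract_result(send_request(build_request("tools/list")))` would return the tool listing, provided the server exposes that method at the assumed path.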
2. LLM Servers
4080 Server (Work Agent):
cd llm-servers/4080
./setup.sh
ollama serve
1050 Server (Family Agent):
cd llm-servers/1050
./setup.sh
# Bind to all interfaces; ollama reads the listen address from OLLAMA_HOST
OLLAMA_HOST=0.0.0.0 ollama serve
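To confirm that an Ollama server is reachable from another machine, one option is to ask it for its installed models. The sketch below assumes Ollama's default port 11434 and its `/api/tags` model-listing endpoint; the helper names and any host you pass in are hypothetical placeholders, not values defined by this repo.

```python
import json
import urllib.request

OLLAMA_DEFAULT_PORT = 11434  # Ollama's default listen port


def tags_url(host, port=OLLAMA_DEFAULT_PORT):
    """Return the URL of the model-listing endpoint for an Ollama server."""
    return f"http://{host}:{port}/api/tags"


def model_names(tags_response):
    """Extract model names from a decoded /api/tags response."""
    return [model["name"] for model in tags_response.get("models", [])]


def list_models(host, port=OLLAMA_DEFAULT_PORT, timeout=5.0):
    """Fetch the names of the models installed on a running Ollama server."""
    with urllib.request.urlopen(tags_url(host, port), timeout=timeout) as response:
        return model_names(json.loads(response.read().decode("utf-8")))
```

Calling `list_models("localhost")` on each machine, and then again with the machine's LAN address from a different host, is a quick way to tell a server that is down from one that is only listening on the loopback interface.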
Status
- ✅ MCP Server: Implemented (TICKET-029)
- 🔄 LLM Servers: Setup scripts ready (TICKET-021, TICKET-022)
- ✅ Voice I/O: Implemented, ready for Pi5 deployment (TICKET-006, TICKET-010, TICKET-014)
- ⏳ Clients: Pending (TICKET-039, TICKET-040)
Documentation
See parent atlas/ repo for:
- Architecture documentation
- Technology evaluations
- Implementation guides
- Ticket tracking