# LLM Routing Layer

Routes LLM requests to the appropriate agent (work or family) based on identity, origin, or explicit specification.

## Features

- **Automatic Routing**: Routes based on client type, origin, or explicit agent type
- **Health Checks**: Verify LLM server availability
- **Request Handling**: Make requests to routed servers
- **Fallback**: Defaults to family agent for safety

## Usage

```python
from routing.router import get_router

router = get_router()

# Route a request
routing = router.route_request(
    agent_type="family",   # Explicit
    # or
    client_type="phone",   # Based on client
    # or
    origin="10.0.1.100"    # Based on origin
)

# Make request
response = router.make_request(
    routing=routing,
    messages=[
        {"role": "user", "content": "What time is it?"}
    ],
    tools=[...]  # Optional tool definitions
)

# Health check
is_healthy = router.health_check("work")
```
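
The usage above does not show what `make_request` sends to the routed server. As a rough sketch only, and assuming the servers are Ollama instances (port 11434 and the model tags suggest this, but the README does not say so), a non-streaming chat request body could be assembled as below. The helper name `build_chat_payload` is hypothetical and not part of `routing.router`.

```python
from typing import Any, Dict, List, Optional


def build_chat_payload(
    model: str,
    messages: List[Dict[str, Any]],
    tools: Optional[List[Dict[str, Any]]] = None,
) -> Dict[str, Any]:
    """Assemble a JSON body for a chat request (hypothetical helper).

    Assumption: the target server accepts an Ollama-style chat body with
    "model", "messages", "stream", and optional "tools" fields.
    """
    payload: Dict[str, Any] = {
        "model": model,
        "messages": messages,
        "stream": False,  # ask for a single response rather than chunks
    }
    if tools:
        # Tool definitions are optional, so only include them when given.
        payload["tools"] = tools
    return payload


payload = build_chat_payload(
    "phi3:mini-q4_0",
    [{"role": "user", "content": "What time is it?"}],
)
```

Leaving `tools` out of the body when none are supplied keeps the request minimal for callers that do not use tool calling.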

## Routing Logic

1. **Explicit Agent Type**: If `agent_type` is specified, use it
2. **Client Type**: Route based on client type (work/desktop → work, phone/tablet → family)
3. **Origin/IP**: Route based on network origin (if configured)
4. **Default**: Family agent (safer default)
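
The precedence above can be sketched as a single function. This is an illustrative reading of the four rules, not the actual `route_request` implementation: `resolve_agent`, `CLIENT_TYPE_TO_AGENT`, and the `origin_map` parameter are hypothetical names, and treating an unrecognized `agent_type` as "not specified" is an assumption.

```python
from typing import Dict, Optional

# Mapping taken from the client-type rule in the list above.
CLIENT_TYPE_TO_AGENT: Dict[str, str] = {
    "work": "work",
    "desktop": "work",
    "phone": "family",
    "tablet": "family",
}

KNOWN_AGENTS = ("work", "family")
DEFAULT_AGENT = "family"  # the safer default


def resolve_agent(
    agent_type: Optional[str] = None,
    client_type: Optional[str] = None,
    origin: Optional[str] = None,
    origin_map: Optional[Dict[str, str]] = None,
) -> str:
    """Pick an agent name by checking the rules in priority order."""
    # 1. An explicit agent type wins (assumption: unknown values are ignored).
    if agent_type in KNOWN_AGENTS:
        return agent_type
    # 2. Otherwise map the client type, if it is one we recognize.
    if client_type is not None:
        mapped = CLIENT_TYPE_TO_AGENT.get(client_type.lower())
        if mapped is not None:
            return mapped
    # 3. Otherwise consult the origin table, which only applies if configured.
    if origin is not None and origin_map:
        mapped = origin_map.get(origin)
        if mapped is not None:
            return mapped
    # 4. Fall back to the family agent.
    return DEFAULT_AGENT
```

Checking the rules in a fixed order means a request that matches nothing, or carries only unrecognized values, still ends up at the family agent.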

## Configuration

### Work Agent (4080)

- **URL**: http://10.0.30.63:11434
- **Model**: llama3.1:8b (configurable)
- **Timeout**: 300 seconds

### Family Agent (1050)

- **URL**: http://localhost:11434 (placeholder)
- **Model**: phi3:mini-q4_0
- **Timeout**: 60 seconds
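
One way to hold these settings in code is a small immutable record per agent. The sketch below only restates the values listed above. The `AgentConfig` class, its field names, and the `AGENTS` table are hypothetical and may not match how the router actually stores its configuration.

```python
from dataclasses import dataclass, replace
from typing import Dict


@dataclass(frozen=True)
class AgentConfig:
    """Connection settings for one LLM server (hypothetical structure)."""

    base_url: str
    model: str
    timeout_seconds: int


# Values copied from the Configuration section above.
AGENTS: Dict[str, AgentConfig] = {
    "work": AgentConfig(
        base_url="http://10.0.30.63:11434",
        model="llama3.1:8b",
        timeout_seconds=300,
    ),
    "family": AgentConfig(
        base_url="http://localhost:11434",  # placeholder per the README
        model="phi3:mini-q4_0",
        timeout_seconds=60,
    ),
}

# Because the dataclass is frozen, an override produces a new record
# instead of mutating the shared default.
quick_work = replace(AGENTS["work"], timeout_seconds=120)
```

Keeping the records frozen avoids one caller's override leaking into another caller's requests.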

## Future Enhancements

- Load balancing for multiple instances
- Request queuing
- Rate limiting per agent
- Metrics and logging
- Automatic failover