atlas/tickets/NEXT_STEPS.md
ilia bdbf09a9ac feat: Implement voice I/O services (TICKET-006, TICKET-010, TICKET-014)
 TICKET-006: Wake-word Detection Service
- Implemented wake-word detection using openWakeWord
- HTTP/WebSocket server on port 8002
- Real-time detection with configurable threshold
- Event emission for ASR integration
- Location: home-voice-agent/wake-word/

 TICKET-010: ASR Service
- Implemented ASR using faster-whisper
- HTTP endpoint for file transcription
- WebSocket endpoint for streaming transcription
- Support for multiple audio formats
- Auto language detection
- GPU acceleration support
- Location: home-voice-agent/asr/

 TICKET-014: TTS Service
- Implemented TTS using Piper
- HTTP endpoint for text-to-speech synthesis
- Low-latency processing (< 500ms)
- Multiple voice support
- WAV audio output
- Location: home-voice-agent/tts/

 TICKET-047: Updated Hardware Purchases
- Marked Pi5 kit, SSD, microphone, and speakers as purchased
- Updated progress log with purchase status

📚 Documentation:
- Added VOICE_SERVICES_README.md with complete testing guide
- Each service includes README.md with usage instructions
- All services ready for Pi5 deployment

🧪 Testing:
- Created test files for each service
- All imports validated
- FastAPI apps created successfully
- Code passes syntax validation

🚀 Ready for:
- Pi5 deployment
- End-to-end voice flow testing
- Integration with MCP server

Files Added:
- wake-word/detector.py
- wake-word/server.py
- wake-word/requirements.txt
- wake-word/README.md
- wake-word/test_detector.py
- asr/service.py
- asr/server.py
- asr/requirements.txt
- asr/README.md
- asr/test_service.py
- tts/service.py
- tts/server.py
- tts/requirements.txt
- tts/README.md
- tts/test_service.py
- VOICE_SERVICES_README.md

Files Modified:
- tickets/done/TICKET-047_hardware-purchases.md

Files Moved:
- tickets/backlog/TICKET-006_prototype-wake-word-node.md → tickets/done/
- tickets/backlog/TICKET-010_streaming-asr-service.md → tickets/done/
- tickets/backlog/TICKET-014_tts-service.md → tickets/done/
2026-01-12 22:22:38 -05:00


Next Steps - Vibe Kanban Recommendations

Completed Work

Foundation (Done):

  • TICKET-001: Project Setup
  • TICKET-002: Define Project Repos and Structure
  • TICKET-003: Document Privacy Policy and Safety Constraints
  • TICKET-004: High-Level Architecture Document

Completed (Voice I/O Track):

  • TICKET-005: Evaluate and Select Wake-Word Engine → Done
  • TICKET-009: Select ASR Engine and Target Hardware → Done - Selected: faster-whisper
  • TICKET-013: Evaluate TTS Options → Done

Completed (LLM Track):

  • TICKET-017: Survey Candidate Open-Weight Models → Done
  • TICKET-018: LLM Capacity Assessment → Done
  • TICKET-019: Select Work Agent Model (4080) → Done - Selected: Llama 3.1 70B Q4
  • TICKET-020: Select Family Agent Model (1050) → Done - Selected: Phi-3 Mini 3.8B Q4

Completed (Tools/MCP Track):

  • TICKET-028: Learn and Encode MCP Concepts → Done - MCP architecture documented
  • TICKET-029: Implement Minimal MCP Server → Done - 18 tools running
  • TICKET-030: Integrate MCP with LLM Host → Done - Adapter complete and tested
  • TICKET-032: Time/Date Tools → Done - 4 tools implemented
  • TICKET-031: Weather Tool → Done - OpenWeatherMap API integrated
  • TICKET-033: Timers and Reminders → Done - 4 tools implemented
  • TICKET-034: Home Tasks (Kanban) → Done - 3 tools implemented
  • TICKET-035: Notes & Files Tools → Done - 5 tools implemented

Completed (LLM Infrastructure Track):

  • TICKET-021: Stand Up 4080 LLM Service → Done - Connected to http://10.0.30.63:11434
  • TICKET-025: System Prompts → Done - Family and work agent prompts created
  • TICKET-026: Tool-Calling Policy → Done - Policy documented
  • TICKET-027: Multi-turn Conversation Handling → Done - Session manager implemented
  • TICKET-023: LLM Routing Layer → Done - Router implemented
  • TICKET-024: LLM Logging & Metrics → Done - Logging and metrics implemented
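
Port 11434 in the TICKET-021 endpoint is the default port used by Ollama, so the sketch below assumes an Ollama-compatible `/api/generate` endpoint; this document does not name the serving software. The model tag `llama3.1:70b` and the helper name `build_generate_request` are illustrative assumptions, not names from the project.

```python
import json
import urllib.request

# Endpoint taken from TICKET-021. The /api/generate path is an assumption
# (Ollama-compatible server); adjust it if the service exposes another API.
LLM_4080_URL = "http://10.0.30.63:11434"


def build_generate_request(prompt: str, model: str = "llama3.1:70b") -> urllib.request.Request:
    """Build (but do not send) a non-streaming generate request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url=f"{LLM_4080_URL}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Building the request separately from sending it keeps the helper testable without network access. To send it, pass the result to `urllib.request.urlopen(req, timeout=60)` and decode the JSON body.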

Completed (Safety/Memory Track):

  • TICKET-044: Boundary Enforcement → Done - Path/tool/network boundaries
  • TICKET-045: Confirmation Flows → Done - Risk classification and tokens
  • TICKET-041: Long-Term Memory Design → Done - Memory schema and storage implemented
  • TICKET-043: Conversation Summarization & Pruning → Done - Summarization and retention implemented
  • TICKET-042: Memory Implementation → Done - 4 memory tools added to MCP server

Completed (Voice I/O Track):

  • TICKET-011: Define ASR API Contract → Done - API contract documented

Completed (Clients/UI Track):

  • TICKET-040: Web LAN Dashboard → Done - Dashboard API and web interface implemented
  • TICKET-039: Phone-Friendly Client (PWA) → Done - PWA with text input, conversation persistence, error handling

Completed (Planning & Evaluation):

  • TICKET-047: Hardware & Purchases → Done - Purchase plan created ($125-250 MVP)

🎉 Milestone 1 Complete! All evaluation and planning tasks are done.
🚀 Milestone 2 Started! MCP foundation complete - 3 implementation tickets done.

MCP Foundation Complete! Ready for LLM servers and voice I/O.

Priority 1: Core Infrastructure (Start Here)

LLM Infrastructure Track: 4080 COMPLETE

  • TICKET-021: Stand Up 4080 LLM Service → Done
  • TICKET-022: Stand Up 1050 LLM Service (Phi-3 Mini 3.8B Q4)
    • Why Now: Can run in parallel with 4080 setup
    • Time: 3-4 hours
    • Blocks: Family agent features

Tools/MCP Track: COMPLETE

  • TICKET-029: Implement Minimal MCP Server → Done
    • 18 tools running: echo, weather, 4 time/date, 4 timer/reminder, 3 tasks, 5 notes
    • Server tested and operational
  • TICKET-030: Integrate MCP with LLM Host → Done
    • Adapter complete, all tests passing
    • Ready for LLM server integration
  • TICKET-032: Time/Date Tools → Done
    • All 4 tools implemented and working
  • TICKET-033: Timers and Reminders → Done
    • All 4 tools implemented and working
  • TICKET-034: Home Tasks (Kanban) → Done
    • All 3 tools implemented and working
  • TICKET-035: Notes & Files Tools → Done
    • All 5 tools implemented and working
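
The count of 18 follows directly from the per-category numbers listed under TICKET-029. A quick tally, with the dictionary name chosen here only for illustration:

```python
# Tool counts per category, as listed under TICKET-029.
TOOL_COUNTS = {
    "echo": 1,
    "weather": 1,
    "time/date": 4,
    "timer/reminder": 4,
    "tasks": 3,
    "notes": 5,
}

total_tools = sum(TOOL_COUNTS.values())
print(total_tools)  # 18
```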

Priority 2: More Tools (After LLM Servers)

Tools/MCP Track: ALL CORE TOOLS COMPLETE

  • TICKET-031: Weather Tool → Done
  • TICKET-033: Timers and Reminders → Done
  • TICKET-034: Home Tasks (Kanban) → Done
  • TICKET-035: Notes & Files Tools → Done

Priority 3: Voice I/O Services (Can start in parallel)

Voice I/O Track

  • TICKET-006: Prototype Local Wake-Word Node
    • Why Now: Independent of other services
    • Time: 4-6 hours
    • Blocks: End-to-end voice flow
    • Note: Requires hardware (microphone)
  • TICKET-010: Implement Streaming Audio Capture → ASR Service
    • Why Now: ASR engine selected (faster-whisper)
    • Time: 6-8 hours
    • Blocks: Voice input pipeline
  • TICKET-014: Build TTS Service
    • Why Now: TTS evaluation complete
    • Time: 4-6 hours
    • Blocks: Voice output pipeline
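
The three tickets above chain into a single voice turn: wake-word detection triggers capture, ASR produces text, and TTS turns the reply back into audio. The sketch below shows that flow with each stage injected as a plain callable. The function name `run_voice_turn` and the stage signatures are hypothetical, not the project's actual interfaces, and wake-word detection is assumed to have already fired and produced the captured audio.

```python
from typing import Callable


def run_voice_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    respond: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> bytes:
    """Run one voice turn: captured audio -> text -> reply text -> audio."""
    text = transcribe(audio)   # ASR stage (TICKET-010)
    reply = respond(text)      # LLM stage, supplied by the caller
    return synthesize(reply)   # TTS stage (TICKET-014)
```

Passing the stages in as callables lets each service be stubbed during testing and swapped for a real HTTP client once the services are deployed.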

Immediate Next Steps (This Week)

Option A: Infrastructure First (Recommended)

  1. TICKET-021 (4080 LLM Server) - Start here
    • Core infrastructure, enables downstream work
    • Can test with simple prompts immediately
    • MCP adapter ready to integrate
  2. TICKET-022 (1050 LLM Server) - In parallel
    • Similar setup, can reuse patterns from 021
  3. TICKET-031 (Weather Tool) - After LLM servers
    • Replace stub with real API
    • Test end-to-end tool calling

Option B: Voice First (If Hardware Ready)

  1. TICKET-006 (Wake-Word Prototype) - If you have hardware
    • Fun, tangible progress
    • Independent of other services
  2. TICKET-010 (ASR Service) - After wake-word
    • Completes voice input pipeline
  3. TICKET-014 (TTS Service) - In parallel
    • Completes voice output pipeline

Parallel Work Strategy

  • High energy: LLM server setup (021, 022) - technical, foundational
  • Medium energy: Voice services (006, 010, 014) - hardware interaction
  • Low energy: MCP server (029) - well-documented, structured work
  • Mix it up: Switch between tracks to stay engaged!

📋 Milestone Progress

Milestone 1 - Survey & Architecture: COMPLETE

  • Foundation (001-004)
  • Voice I/O evaluations: Wake-word (005), ASR (009), TTS (013)
  • LLM evaluations: Model survey (017), Capacity (018), Selections (019, 020)
  • MCP concepts (028)
  • Hardware planning (047)

🚀 Milestone 2 - Voice Chat + Weather + Tasks MVP: IN PROGRESS (47.4% Complete)

  • Status: MCP foundation complete! 18 tools running, LLM server connected
  • Completed:
    • MCP Server (029) - 18 tools running
    • MCP Adapter (030) - Tested and working
    • Time/Date Tools (032) - 4 tools implemented
    • 4080 LLM Server (021) - Connected and tested
    • Weather Tool (031) - OpenWeatherMap API integrated
    • Timers and Reminders (033) - 4 tools implemented
    • Home Tasks (034) - 3 tools implemented
    • Notes & Files (035) - 5 tools implemented
    • Phone PWA (039) - Enhanced with text input, persistence, error handling
  • Focus areas:
    • Voice I/O services (006, 010, 014) - Can start now
    • LLM servers (022) - Recommended next (021 is already done)
    • More tools (031, 033, 034) - Done, see the completed list above
  • Goal: End-to-end voice conversation with basic tools
  • Next: TICKET-022 (1050 LLM Server)
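
The 47.4% figure is consistent with the 9 completed tickets listed above out of a total of 19. The total of 19 is inferred from the percentage and is not stated in this document, so treat it as an assumption. A minimal sketch of the calculation:

```python
def percent_complete(done: int, total: int) -> float:
    """Return completion as a percentage rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * done / total, 1)


# Ticket numbers from the Milestone 2 "Completed" list.
completed = ["029", "030", "032", "021", "031", "033", "034", "035", "039"]
ASSUMED_TOTAL = 19  # inferred from 47.4%, not given in the source

print(percent_complete(len(completed), ASSUMED_TOTAL))  # 47.4
```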

💡 Vibe Kanban Tips

  1. Tag by Track: Voice I/O, LLM Infra, Tools/MCP, Project Setup
  2. Tag by Type: Research, Implementation, Testing
  3. Tag by Energy Level:
    • High energy: Deep research (TICKET-017, TICKET-005)
    • Medium energy: Documentation (TICKET-028, TICKET-018)
    • Low energy: Planning (TICKET-047)
  4. Work in Sprints: Do 1-2 hours on each, rotate based on interest
  5. Document as you go: Each ticket produces a doc - update ARCHITECTURE.md

⚠️ Notes

  • All Milestone 1 tickets are complete! 🎉
  • TICKET-022 (1050 LLM server) - No blockers, can start immediately; TICKET-021 (4080) is already done
  • TICKET-029 (MCP Server) - Done, 18 tools running
  • Voice I/O (006, 010, 014) - Can proceed in parallel with LLM work
  • TICKET-030 (MCP-LLM Integration) - Done; it required both TICKET-029 and TICKET-021 to be complete
  • All implementation tickets can be worked on in parallel across tracks
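
The one explicit prerequisite in these notes, TICKET-030 depending on TICKET-029 and TICKET-021, can be checked mechanically. The sketch below is illustrative only: the `DEPENDS_ON` mapping and `is_ready` helper are hypothetical names, and tickets without an entry are assumed to have no prerequisites.

```python
# Prerequisites stated in the notes; tickets without an entry are
# assumed to have none.
DEPENDS_ON = {"TICKET-030": {"TICKET-029", "TICKET-021"}}


def is_ready(ticket: str, done: set[str]) -> bool:
    """A ticket is ready when every prerequisite is in the done set."""
    return DEPENDS_ON.get(ticket, set()) <= done
```

For example, `is_ready("TICKET-030", {"TICKET-029"})` is false until TICKET-021 is also in the done set.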

Best path to MVP:

  1. Start with LLM Infrastructure (021, 022)

    • Sets up core capabilities
    • Can test immediately with simple prompts
    • Enables MCP integration work
  2. Build MCP Foundation (029, 030, 032) - COMPLETE

    • MCP server running with 18 tools
    • Adapter tested and working
    • Ready for LLM integration
  3. Add Voice I/O (006, 010, 014)

    • Can work in parallel with LLM/MCP
    • Completes end-to-end voice pipeline
    • More fun/tangible progress
  4. Add First Tools (031, 032, 034)

    • Weather, time, tasks
    • Makes the system useful
    • Can test end-to-end
  5. Build Client (039, 040)

    • Phone PWA and web dashboard
    • Makes system accessible
    • Final piece for MVP

This gets you to a working MVP faster! 🚀