- Enhanced `ARCHITECTURE.md` with details on LLM models for work (Llama 3.1 70B Q4) and family agents (Phi-3 Mini 3.8B Q4). - Introduced new documents: - `ASR_EVALUATION.md` for ASR engine evaluation and selection. - `HARDWARE.md` outlining hardware requirements and purchase plans. - `IMPLEMENTATION_GUIDE.md` for Milestone 2 implementation steps. - `LLM_CAPACITY.md` assessing VRAM and context window limits. - `LLM_MODEL_SURVEY.md` surveying open-weight LLM models. - `LLM_USAGE_AND_COSTS.md` detailing LLM usage and operational costs. - `MCP_ARCHITECTURE.md` describing the Model Context Protocol architecture. - `MCP_IMPLEMENTATION_SUMMARY.md` summarizing MCP implementation status. These updates provide comprehensive guidance for the next phases of development and ensure clarity in project documentation.
7.4 KiB
Next Steps - Vibe Kanban Recommendations
✅ Completed Work
Foundation (Done):
- ✅ TICKET-001: Project Setup
- ✅ TICKET-002: Define Project Repos and Structure
- ✅ TICKET-003: Document Privacy Policy and Safety Constraints
- ✅ TICKET-004: High-Level Architecture Document
Completed (Voice I/O Track):
- ✅ TICKET-005: Evaluate and Select Wake-Word Engine → Done
- ✅ TICKET-009: Select ASR Engine and Target Hardware → Done - Selected: faster-whisper
- ✅ TICKET-013: Evaluate TTS Options → Done
Completed (LLM Track):
- ✅ TICKET-017: Survey Candidate Open-Weight Models → Done
- ✅ TICKET-018: LLM Capacity Assessment → Done
- ✅ TICKET-019: Select Work Agent Model (4080) → Done - Selected: Llama 3.1 70B Q4
- ✅ TICKET-020: Select Family Agent Model (1050) → Done - Selected: Phi-3 Mini 3.8B Q4
Completed (Tools/MCP Track):
- ✅ TICKET-028: Learn and Encode MCP Concepts → Done - MCP architecture documented
- ✅ TICKET-029: Implement Minimal MCP Server → Done - 6 tools running
- ✅ TICKET-030: Integrate MCP with LLM Host → Done - Adapter complete and tested
- ✅ TICKET-032: Time/Date Tools → Done - 4 tools implemented
Completed (Planning & Evaluation):
- ✅ TICKET-047: Hardware & Purchases → Done - Purchase plan created ($125-250 MVP)
🎉 Milestone 1 Complete! All evaluation and planning tasks are done.
🚀 Milestone 2 Started! MCP foundation complete - 3 implementation tickets done.
🎯 Recommended Next Steps
MCP Foundation Complete! ✅ Ready for LLM servers and voice I/O.
Priority 1: Core Infrastructure (Start Here)
LLM Infrastructure Track ⭐ Recommended First
- TICKET-021: Stand Up 4080 LLM Service (Llama 3.1 70B Q4)
- Why Now: Core infrastructure - enables all LLM-dependent work
- Time: 4-6 hours
- Blocks: MCP integration, system prompts, tool calling
- TICKET-022: Stand Up 1050 LLM Service (Phi-3 Mini 3.8B Q4)
- Why Now: Can run in parallel with 4080 setup
- Time: 3-4 hours
- Blocks: Family agent features
Tools/MCP Track ✅ COMPLETE
- ✅ TICKET-029: Implement Minimal MCP Server → Done
- 6 tools running: echo, weather (stub), 4 time/date tools
- Server tested and operational
- ✅ TICKET-030: Integrate MCP with LLM Host → Done
- Adapter complete, all tests passing
- Ready for LLM server integration
- ✅ TICKET-032: Time/Date Tools → Done
- All 4 tools implemented and working
Priority 2: More Tools (After LLM Servers)
Tools/MCP Track
- TICKET-031: Weather Tool (Real API)
- Why Now: Replace stub with actual weather API
- Time: 2-3 hours
- Blocks: None (can do now, but better after LLM integration)
- TICKET-033: Timers and Reminders
- Why Now: Useful tool for daily use
- Time: 4-6 hours
- Blocks: Timer service implementation
- TICKET-034: Home Tasks (Kanban)
- Why Now: Core productivity tool
- Time: 6-8 hours
- Blocks: Task management system
Priority 3: Voice I/O Services (Can start in parallel)
Voice I/O Track
- TICKET-006: Prototype Local Wake-Word Node
- Why Now: Independent of other services
- Time: 4-6 hours
- Blocks: End-to-end voice flow
- Note: Requires hardware (microphone)
- TICKET-010: Implement Streaming Audio Capture → ASR Service
- Why Now: ASR engine selected (faster-whisper)
- Time: 6-8 hours
- Blocks: Voice input pipeline
- TICKET-014: Build TTS Service
- Why Now: TTS evaluation complete
- Time: 4-6 hours
- Blocks: Voice output pipeline
🚀 Recommended Vibe Kanban Setup
Immediate Next Steps (This Week)
Option A: Infrastructure First (Recommended)
- TICKET-021 (4080 LLM Server) - Start here ⭐
- Core infrastructure, enables downstream work
- Can test with simple prompts immediately
- MCP adapter ready to integrate
- TICKET-022 (1050 LLM Server) - In parallel
- Similar setup, can reuse patterns from 021
- TICKET-031 (Weather Tool) - After LLM servers
- Replace stub with real API
- Test end-to-end tool calling
Option B: Voice First (If Hardware Ready)
- TICKET-006 (Wake-Word Prototype) - If you have hardware
- Fun, tangible progress
- Independent of other services
- TICKET-010 (ASR Service) - After wake-word
- Completes voice input pipeline
- TICKET-014 (TTS Service) - In parallel
- Completes voice output pipeline
Parallel Work Strategy
- High energy: LLM server setup (021, 022) - technical, foundational
- Medium energy: Voice services (006, 010, 014) - hardware interaction
- Low energy: MCP server (029) - well-documented, structured work
- Mix it up: Switch between tracks to stay engaged!
📋 Milestone Progress
✅ Milestone 1 - Survey & Architecture: COMPLETE
- ✅ Foundation (001-004)
- ✅ Voice I/O evaluations: Wake-word (005), ASR (009), TTS (013)
- ✅ LLM evaluations: Model survey (017), Capacity (018), Selections (019, 020)
- ✅ MCP concepts (028)
- ✅ Hardware planning (047)
🚀 Milestone 2 - Voice Chat + Weather + Tasks MVP: IN PROGRESS (15.8% Complete)
- Status: MCP foundation complete! Ready for LLM servers and voice I/O
- Completed:
- ✅ MCP Server (029) - 6 tools running
- ✅ MCP Adapter (030) - Tested and working
- ✅ Time/Date Tools (032) - 4 tools implemented
- Focus areas:
- Voice I/O services (006, 010, 014) - Can start now
- LLM servers (021, 022) - Recommended next
- More tools (031, 033, 034) - After LLM servers
- Goal: End-to-end voice conversation with basic tools
- Next: TICKET-021 (4080 LLM Server), TICKET-022 (1050 LLM Server)
💡 Vibe Kanban Tips
- Tag by Track: Voice I/O, LLM Infra, Tools/MCP, Project Setup
- Tag by Type: Research, Implementation, Testing
- Tag by Energy Level:
- High energy: Deep research (TICKET-017, TICKET-005)
- Medium energy: Documentation (TICKET-028, TICKET-018)
- Low energy: Planning (TICKET-047)
- Work in Sprints: Do 1-2 hours on each, rotate based on interest
- Document as you go: Each ticket produces a doc - update ARCHITECTURE.md
⚠️ Notes
- All Milestone 1 tickets are complete! 🎉
- TICKET-021 & TICKET-022 (LLM servers) - No blockers, can start immediately
- TICKET-029 (MCP Server) - Can start now, MCP concepts are documented
- Voice I/O (006, 010, 014) - Can proceed in parallel with LLM work
- TICKET-030 (MCP-LLM Integration) - Needs both TICKET-029 and TICKET-021 complete
- All implementation tickets can be worked on in parallel across tracks
🎯 Recommended Starting Point
Best path to MVP:
-
Start with LLM Infrastructure (021, 022)
- Sets up core capabilities
- Can test immediately with simple prompts
- Enables MCP integration work
-
✅ Build MCP Foundation (029, 030, 032) - COMPLETE
- MCP server running with 6 tools
- Adapter tested and working
- Ready for LLM integration
-
Add Voice I/O (006, 010, 014)
- Can work in parallel with LLM/MCP
- Completes end-to-end voice pipeline
- More fun/tangible progress
-
Add First Tools (031, 032, 034)
- Weather, time, tasks
- Makes the system useful
- Can test end-to-end
-
Build Client (039, 040)
- Phone PWA and web dashboard
- Makes system accessible
- Final piece for MVP
This gets you to a working MVP faster! 🚀