- Enhanced `ARCHITECTURE.md` with details on LLM models for work (Llama 3.1 70B Q4) and family agents (Phi-3 Mini 3.8B Q4).
- Introduced new documents:
  - `ASR_EVALUATION.md` for ASR engine evaluation and selection.
  - `HARDWARE.md` outlining hardware requirements and purchase plans.
  - `IMPLEMENTATION_GUIDE.md` for Milestone 2 implementation steps.
  - `LLM_CAPACITY.md` assessing VRAM and context window limits.
  - `LLM_MODEL_SURVEY.md` surveying open-weight LLM models.
  - `LLM_USAGE_AND_COSTS.md` detailing LLM usage and operational costs.
  - `MCP_ARCHITECTURE.md` describing the Model Context Protocol architecture.
  - `MCP_IMPLEMENTATION_SUMMARY.md` summarizing MCP implementation status.

These updates provide comprehensive guidance for the next phases of development and ensure clarity in project documentation.
# Next Steps - Vibe Kanban Recommendations

## ✅ Completed Work

**Foundation (Done):**
- ✅ TICKET-001: Project Setup
- ✅ TICKET-002: Define Project Repos and Structure
- ✅ TICKET-003: Document Privacy Policy and Safety Constraints
- ✅ TICKET-004: High-Level Architecture Document

**Completed (Voice I/O Track):**
- ✅ TICKET-005: Evaluate and Select Wake-Word Engine → **Done**
- ✅ TICKET-009: Select ASR Engine and Target Hardware → **Done** - Selected: faster-whisper
- ✅ TICKET-013: Evaluate TTS Options → **Done**

**Completed (LLM Track):**
- ✅ TICKET-017: Survey Candidate Open-Weight Models → **Done**
- ✅ TICKET-018: LLM Capacity Assessment → **Done**
- ✅ TICKET-019: Select Work Agent Model (4080) → **Done** - Selected: Llama 3.1 70B Q4
- ✅ TICKET-020: Select Family Agent Model (1050) → **Done** - Selected: Phi-3 Mini 3.8B Q4

**Completed (Tools/MCP Track):**
- ✅ TICKET-028: Learn and Encode MCP Concepts → **Done** - MCP architecture documented
- ✅ TICKET-029: Implement Minimal MCP Server → **Done** - 6 tools running
- ✅ TICKET-030: Integrate MCP with LLM Host → **Done** - Adapter complete and tested
- ✅ TICKET-032: Time/Date Tools → **Done** - 4 tools implemented

**Completed (Planning & Evaluation):**
- ✅ TICKET-047: Hardware & Purchases → **Done** - Purchase plan created ($125-250 MVP)

**🎉 Milestone 1 Complete!** All evaluation and planning tasks are done.
**🚀 Milestone 2 Started!** MCP foundation complete - 3 implementation tickets done.

## 🎯 Recommended Next Steps

**MCP Foundation Complete!** ✅ Ready for LLM servers and voice I/O.

### Priority 1: Core Infrastructure (Start Here)

#### LLM Infrastructure Track ⭐ **Recommended First**
- **TICKET-021**: Stand Up 4080 LLM Service (Llama 3.1 70B Q4)
  - **Why Now**: Core infrastructure - enables all LLM-dependent work
  - **Time**: 4-6 hours
  - **Blocks**: MCP integration, system prompts, tool calling
- **TICKET-022**: Stand Up 1050 LLM Service (Phi-3 Mini 3.8B Q4)
  - **Why Now**: Can run in parallel with 4080 setup
  - **Time**: 3-4 hours
  - **Blocks**: Family agent features

#### Tools/MCP Track ✅ **COMPLETE**
- ✅ **TICKET-029**: Implement Minimal MCP Server → **Done**
  - 6 tools running: echo, weather (stub), 4 time/date tools
  - Server tested and operational
- ✅ **TICKET-030**: Integrate MCP with LLM Host → **Done**
  - Adapter complete, all tests passing
  - Ready for LLM server integration
- ✅ **TICKET-032**: Time/Date Tools → **Done**
  - All 4 tools implemented and working
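
For orientation, here is a minimal sketch of what a call to one of these tools could look like on the wire. MCP messages are JSON-RPC 2.0, and tools are invoked with the `tools/call` method. The `echo` tool name comes from the list above; the `text` argument key and the `build_tool_call` helper are hypothetical, and the transport (stdio or HTTP) is not assumed here.

```python
import json
from itertools import count

# Monotonic request ids, as JSON-RPC expects each request to carry an id.
_request_ids = count(1)


def build_tool_call(name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request that asks an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# "text" is an assumed argument name; check the server's tool schema.
request = build_tool_call("echo", {"text": "hello"})
payload = json.dumps(request)
print(payload)
```

The actual argument names for each tool should be read from the server's `tools/list` response rather than guessed.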

### Priority 2: More Tools (After LLM Servers)

#### Tools/MCP Track
- **TICKET-031**: Weather Tool (Real API)
  - **Why Now**: Replace stub with actual weather API
  - **Time**: 2-3 hours
  - **Blocks**: None (can do now, but better after LLM integration)
- **TICKET-033**: Timers and Reminders
  - **Why Now**: Useful tool for daily use
  - **Time**: 4-6 hours
  - **Blocks**: Timer service implementation
- **TICKET-034**: Home Tasks (Kanban)
  - **Why Now**: Core productivity tool
  - **Time**: 6-8 hours
  - **Blocks**: Task management system

### Priority 3: Voice I/O Services (Can start in parallel)

#### Voice I/O Track
- **TICKET-006**: Prototype Local Wake-Word Node
  - **Why Now**: Independent of other services
  - **Time**: 4-6 hours
  - **Blocks**: End-to-end voice flow
  - **Note**: Requires hardware (microphone)
- **TICKET-010**: Implement Streaming Audio Capture → ASR Service
  - **Why Now**: ASR engine selected (faster-whisper)
  - **Time**: 6-8 hours
  - **Blocks**: Voice input pipeline
- **TICKET-014**: Build TTS Service
  - **Why Now**: TTS evaluation complete
  - **Time**: 4-6 hours
  - **Blocks**: Voice output pipeline

## 🚀 Recommended Vibe Kanban Setup

### Immediate Next Steps (This Week)

**Option A: Infrastructure First (Recommended)**
1. **TICKET-021** (4080 LLM Server) - Start here ⭐
   - Core infrastructure, enables downstream work
   - Can test with simple prompts immediately
   - MCP adapter ready to integrate
2. **TICKET-022** (1050 LLM Server) - In parallel
   - Similar setup, can reuse patterns from 021
3. **TICKET-031** (Weather Tool) - After LLM servers
   - Replace stub with real API
   - Test end-to-end tool calling

**Option B: Voice First (If Hardware Ready)**
1. **TICKET-006** (Wake-Word Prototype) - If you have hardware
   - Fun, tangible progress
   - Independent of other services
2. **TICKET-010** (ASR Service) - After wake-word
   - Completes voice input pipeline
3. **TICKET-014** (TTS Service) - In parallel
   - Completes voice output pipeline

### Parallel Work Strategy
- **High energy**: LLM server setup (021, 022) - technical, foundational
- **Medium energy**: Voice services (006, 010, 014) - hardware interaction
- **Low energy**: MCP tools (031, 033, 034) - structured work on top of the finished MCP server (029)
- **Mix it up**: Switch between tracks to stay engaged!

## 📋 Milestone Progress

**✅ Milestone 1 - Survey & Architecture: COMPLETE**
- ✅ Foundation (001-004)
- ✅ Voice I/O evaluations: Wake-word (005), ASR (009), TTS (013)
- ✅ LLM evaluations: Model survey (017), Capacity (018), Selections (019, 020)
- ✅ MCP concepts (028)
- ✅ Hardware planning (047)

**🚀 Milestone 2 - Voice Chat + Weather + Tasks MVP: IN PROGRESS (15.8% Complete)**
- **Status**: MCP foundation complete! Ready for LLM servers and voice I/O
- **Completed**:
  - ✅ MCP Server (029) - 6 tools running
  - ✅ MCP Adapter (030) - Tested and working
  - ✅ Time/Date Tools (032) - 4 tools implemented
- **Focus areas**:
  - Voice I/O services (006, 010, 014) - Can start now
  - LLM servers (021, 022) - **Recommended next**
  - More tools (031, 033, 034) - After LLM servers
- **Goal**: End-to-end voice conversation with basic tools
- **Next**: TICKET-021 (4080 LLM Server), TICKET-022 (1050 LLM Server)
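
The completion figure above can be reproduced with a small sketch like the following. The three completed tickets come from the list above; the total of 19 Milestone 2 tickets is an assumption, inferred only because it is the ticket count for which 3 completed tickets round to 15.8%. The `percent_complete` helper is illustrative, not part of the project.

```python
def percent_complete(done: int, total: int) -> float:
    """Return the completed share as a percentage, rounded to one decimal."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * done / total, 1)


completed = ["TICKET-029", "TICKET-030", "TICKET-032"]

# Assumption: the milestone's total ticket count is not stated in this
# document; 19 is inferred from the reported 15.8%.
ASSUMED_TOTAL_TICKETS = 19

progress = percent_complete(len(completed), ASSUMED_TOTAL_TICKETS)
print(f"Milestone 2: {progress}% complete")
```

Recomputing the figure from the ticket list whenever a ticket closes keeps the heading from drifting out of date.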

## 💡 Vibe Kanban Tips

1. **Tag by Track**: Voice I/O, LLM Infra, Tools/MCP, Project Setup
2. **Tag by Type**: Research, Implementation, Testing
3. **Tag by Energy Level**:
   - High energy: Deep research (TICKET-017, TICKET-005)
   - Medium energy: Documentation (TICKET-028, TICKET-018)
   - Low energy: Planning (TICKET-047)
4. **Work in Sprints**: Do 1-2 hours on each, rotate based on interest
5. **Document as you go**: Each ticket produces a doc - update ARCHITECTURE.md

## ⚠️ Notes

- **All Milestone 1 tickets are complete!** 🎉
- **TICKET-021 & TICKET-022** (LLM servers) - No blockers, can start immediately
- **TICKET-029** (MCP Server) - Complete, 6 tools running
- **Voice I/O** (006, 010, 014) - Can proceed in parallel with LLM work
- **TICKET-030** (MCP-LLM Integration) - Adapter complete and tested; wiring it to a live model needs TICKET-021
- Remaining implementation tickets can be worked on in parallel across tracks, subject to the ordering noted in the priorities above
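
One way to turn these notes into a work order is to group the open tickets into waves that can run in parallel. The sketch below uses Python's standard `graphlib` module. The dependency map is one reading of this document's ordering hints ("after LLM servers", "after wake-word"), treated as hard dependencies purely for illustration; the `parallel_waves` helper is hypothetical.

```python
from graphlib import TopologicalSorter

# ticket -> tickets that should be finished first (an assumed reading of
# the ordering hints in this document, not an authoritative dependency list)
DEPENDENCIES = {
    "TICKET-021": set(),
    "TICKET-022": set(),
    "TICKET-006": set(),
    "TICKET-014": set(),
    "TICKET-010": {"TICKET-006"},
    "TICKET-031": {"TICKET-021", "TICKET-022"},
    "TICKET-033": {"TICKET-021", "TICKET-022"},
    "TICKET-034": {"TICKET-021", "TICKET-022"},
}


def parallel_waves(graph: dict) -> list:
    """Group tickets into waves; every ticket in a wave can start together."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


for number, wave in enumerate(parallel_waves(DEPENDENCIES), start=1):
    print(f"Wave {number}: {', '.join(wave)}")
```

Under these assumptions the first wave holds the LLM servers plus the independent voice tickets, and the second wave holds the ASR service and the remaining tools.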

## 🎯 Recommended Starting Point

**Best path to MVP:**

1. **Start with LLM Infrastructure** (021, 022)
   - Sets up core capabilities
   - Can test immediately with simple prompts
   - Enables MCP integration work

2. ✅ **Build MCP Foundation** (029, 030, 032) - **COMPLETE**
   - MCP server running with 6 tools
   - Adapter tested and working
   - Ready for LLM integration

3. **Add Voice I/O** (006, 010, 014)
   - Can work in parallel with LLM/MCP
   - Completes end-to-end voice pipeline
   - More fun/tangible progress

4. **Add First Tools** (031, 034; time tools in 032 are already done)
   - Weather, time, tasks
   - Makes the system useful
   - Can test end-to-end

5. **Build Client** (039, 040)
   - Phone PWA and web dashboard
   - Makes system accessible
   - Final piece for MVP

**This gets you to a working MVP faster!** 🚀