- Enhanced `ARCHITECTURE.md` with details on the LLM models for the work agent (Llama 3.1 70B Q4) and the family agent (Phi-3 Mini 3.8B Q4).
- Introduced new documents:
  - `ASR_EVALUATION.md` for ASR engine evaluation and selection.
  - `HARDWARE.md` outlining hardware requirements and purchase plans.
  - `IMPLEMENTATION_GUIDE.md` for Milestone 2 implementation steps.
  - `LLM_CAPACITY.md` assessing VRAM and context window limits.
  - `LLM_MODEL_SURVEY.md` surveying open-weight LLM models.
  - `LLM_USAGE_AND_COSTS.md` detailing LLM usage and operational costs.
  - `MCP_ARCHITECTURE.md` describing the Model Context Protocol architecture.
  - `MCP_IMPLEMENTATION_SUMMARY.md` summarizing MCP implementation status.

These updates provide guidance for the next phases of development and keep the project documentation clear.
Ticket: Survey Candidate Open-Weight Models
Ticket Information
- ID: TICKET-017
- Title: Survey Candidate Open-Weight Models
- Type: Research
- Priority: High
- Status: In Progress
- Track: LLM Infra
- Milestone: Milestone 1 - Survey & Architecture
- Created: 2024-01-XX
Description
Survey and evaluate open-weight LLM models:
- 8-14B and 30B quantized options for RTX 4080 (Q4-Q6 variants)
- Small models for GTX 1050 (family agent)
- Evaluate coding/research capabilities for work agent
- Evaluate instruction-following for family agent
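One way to make the instruction-following evaluation concrete is to count how often a model's reply is a well-formed tool call. The sketch below is only an illustration: the tool names, the argument schema, and the JSON reply shape (`name` plus `arguments`) are all hypothetical and are not taken from the project.

```python
import json

# Hypothetical tool specs for illustration only: tool name -> {argument: type}.
TOOLS = {
    "set_timer": {"minutes": int},
    "get_weather": {"city": str},
}


def is_valid_tool_call(reply: str) -> bool:
    """Return True if reply is JSON naming a known tool with exactly its arguments."""
    try:
        call = json.loads(reply)
    except json.JSONDecodeError:
        return False
    if not isinstance(call, dict):
        return False
    name = call.get("name")
    args = call.get("arguments")
    if not isinstance(name, str) or name not in TOOLS or not isinstance(args, dict):
        return False
    spec = TOOLS[name]
    if set(args) != set(spec):
        return False
    # Strict type check so that True/False is not accepted where an int is expected.
    return all(type(args[key]) is expected for key, expected in spec.items())


# Made-up replies standing in for model output.
replies = [
    '{"name": "set_timer", "arguments": {"minutes": 10}}',
    "Sure! I set a timer for you.",
    '{"name": "get_weather", "arguments": {"town": "Oslo"}}',
]
pass_rate = sum(is_valid_tool_call(r) for r in replies) / len(replies)
print(f"valid tool calls: {pass_rate:.0%}")
```

Running the same prompts through each candidate and comparing pass rates would give one simple, repeatable score for the comparison matrix.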
Acceptance Criteria
- Model comparison matrix created
- 4080 model candidates identified (70B quantized, 33B alternatives)
- 1050 model candidates identified (3.8B, 1.5B, 1.1B options)
- Evaluation criteria documented
- Recommendations documented
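A minimal sketch of how the comparison matrix could be kept as data and rendered as a table. The field names and the `to_markdown` helper are assumptions made for illustration; the rows use only the models named in this ticket's recommendations, and evaluation scores are left out because the ticket does not record any.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    params_b: float  # parameter count in billions
    quant: str       # quantization level, e.g. "Q4"
    gpu: str         # target card
    role: str        # agent and rank


CANDIDATES = [
    Candidate("Llama 3.1 70B", 70.0, "Q4", "4080", "work, primary"),
    Candidate("DeepSeek Coder 33B", 33.0, "Q4", "4080", "work, alternative"),
    Candidate("Phi-3 Mini 3.8B", 3.8, "Q4", "1050", "family, primary"),
    Candidate("Qwen2.5 1.5B", 1.5, "Q4", "1050", "family, alternative"),
]


def to_markdown(rows):
    """Render the candidates as a Markdown table, one row per model."""
    lines = [
        "| Model | Params (B) | Quant | GPU | Role |",
        "|---|---|---|---|---|",
    ]
    for c in rows:
        lines.append(
            f"| {c.name} | {c.params_b:g} | {c.quant} | {c.gpu} | {c.role} |"
        )
    return "\n".join(lines)


print(to_markdown(CANDIDATES))
```

Columns for the documented evaluation criteria can be added as extra fields once those criteria are fixed.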
Technical Details
Models to evaluate:
- 4080: Llama 3 8B/70B, Mistral 7B, Qwen, etc.
- 1050: TinyLlama, Phi-2, smaller quantized models
- Quantization: Q4, Q5, Q6, Q8
- Function calling support required
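When shortlisting models per quantization level, a rough weight-memory estimate helps. The sketch below assumes the nominal bit width for each level (4, 5, 6, 8 bits per weight). Real quantized formats store extra scale data, and the KV cache and activations need memory on top, so treat these figures as lower bounds. The function name and the parameter counts in the loop are illustrative.

```python
# Nominal bits per weight; an assumption, actual file formats use slightly more.
NOMINAL_BITS = {"Q4": 4, "Q5": 5, "Q6": 6, "Q8": 8}


def weight_gib(params_billion: float, quant: str) -> float:
    """Lower-bound memory for the weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * NOMINAL_BITS[quant] / 8
    return total_bytes / 2**30


# Parameter counts (in billions) matching sizes mentioned in this ticket.
for params in (1.1, 3.8, 8, 33, 70):
    row = ", ".join(
        f"{quant}: {weight_gib(params, quant):.1f} GiB" for quant in NOMINAL_BITS
    )
    print(f"{params:>5}B  {row}")
```

Comparing each figure against the target card's VRAM gives a quick first filter before any benchmarking.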
Dependencies
- TICKET-004 (architecture) - helpful context
Related Files
- docs/LLM_MODEL_SURVEY.md (to be created)
- ARCHITECTURE.md
Notes
Can start in parallel with the wake-word and client work. Relies on the high-level architecture doc (TICKET-004) for context.
Progress Log
- 2024-01-XX - Survey document created with comprehensive model analysis
- 2024-01-XX - Recommendations finalized:
  - Work Agent (4080): Llama 3.1 70B Q4 (primary), DeepSeek Coder 33B Q4 (alternative)
  - Family Agent (1050): Phi-3 Mini 3.8B Q4 (primary), Qwen2.5 1.5B Q4 (alternative)