# Ticket: Survey Candidate Open-Weight Models

## Ticket Information

- **ID**: TICKET-017
- **Title**: Survey Candidate Open-Weight Models
- **Type**: Research
- **Priority**: High
- **Status**: In Progress
- **Track**: LLM Infra
- **Milestone**: Milestone 1 - Survey & Architecture
- **Created**: 2024-01-XX

## Description

Survey and evaluate open-weight LLMs:

- 8-14B and 30B quantized options for the RTX 4080 (Q4-Q6 variants)
- Small models for the GTX 1050 (family agent)
- Evaluate coding/research capabilities for the work agent
- Evaluate instruction-following for the family agent

## Acceptance Criteria

- [x] Model comparison matrix created
- [x] 4080 model candidates identified (70B quantized, 33B alternatives)
- [x] 1050 model candidates identified (3.8B, 1.5B, 1.1B options)
- [x] Evaluation criteria documented
- [x] Recommendations documented

## Technical Details

Models to evaluate:

- 4080: Llama 3 8B/70B, Mistral 7B, Qwen, etc.
- 1050: TinyLlama, Phi-2, smaller quantized models
- Quantization: Q4, Q5, Q6, Q8
- Function calling support required

## Dependencies

- TICKET-004 (architecture) - helpful context

## Related Files

- `docs/LLM_MODEL_SURVEY.md` (to be created)
- `ARCHITECTURE.md`

## Notes

Can start in parallel with wake-word and clients. Depends on high-level architecture doc.

## Progress Log

- 2024-01-XX - Survey document created with comprehensive model analysis
- 2024-01-XX - Recommendations finalized:
  - Work Agent (4080): Llama 3.1 70B Q4 (primary), DeepSeek Coder 33B Q4 (alternative)
  - Family Agent (1050): Phi-3 Mini 3.8B Q4 (primary), Qwen2.5 1.5B Q4 (alternative)
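
## Appendix: Rough Memory Estimate (sketch)

When comparing the quantization levels and parameter counts listed in this ticket, a weights-only size estimate can serve as a quick first filter. The sketch below is illustrative only and is not taken from the survey document: the `Candidate` type, the `fits` helper, the bits-per-weight figure, the headroom value, and the 2 GiB budget for the 1050 are all assumptions to replace with measured numbers. It ignores KV cache, activations, and runtime overhead beyond a flat headroom.

```python
from dataclasses import dataclass

GIB = 2**30  # bytes per GiB


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one model/quantization pair."""
    name: str
    params_billion: float
    bits_per_weight: float  # assumed average; real quant formats vary


def estimate_weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Weights-only size: parameters * bits per weight / 8, in GiB."""
    return params_billion * 1e9 * bits_per_weight / 8 / GIB


def fits(candidate: Candidate, vram_gib: float, headroom_gib: float = 1.5) -> bool:
    """True if the weights plus a flat headroom fit in the VRAM budget.

    headroom_gib is an assumed allowance for KV cache and runtime overhead.
    """
    needed = estimate_weight_gib(candidate.params_billion, candidate.bits_per_weight)
    return needed + headroom_gib <= vram_gib


if __name__ == "__main__":
    # VRAM budgets: 16 GiB for the 4080; 2 GiB for the 1050 is an assumption
    # (adjust to the actual card).
    budgets = {"4080": 16.0, "1050": 2.0}
    # ~4.5 bits per weight is an assumed average for a Q4-class quantization.
    candidates = {
        "4080": [
            Candidate("Llama 3.1 70B Q4", 70, 4.5),
            Candidate("DeepSeek Coder 33B Q4", 33, 4.5),
        ],
        "1050": [
            Candidate("Phi-3 Mini 3.8B Q4", 3.8, 4.5),
            Candidate("Qwen2.5 1.5B Q4", 1.5, 4.5),
        ],
    }
    for gpu, models in candidates.items():
        for m in models:
            size = estimate_weight_gib(m.params_billion, m.bits_per_weight)
            verdict = "fits" if fits(m, budgets[gpu]) else "exceeds budget"
            print(f"{gpu}: {m.name}: ~{size:.1f} GiB weights, {verdict}")
```

A candidate whose estimate exceeds the budget is not necessarily ruled out, since some runtimes can offload part of the model to system RAM at a cost in speed, so treat the result as a prompt for measurement and not as a verdict.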