atlas/tickets/done/TICKET-017_llm-model-survey.md

# Ticket: Survey Candidate Open-Weight Models

## Ticket Information

- **ID**: TICKET-017
- **Title**: Survey Candidate Open-Weight Models
- **Type**: Research
- **Priority**: High
- **Status**: Done
- **Track**: LLM Infra
- **Milestone**: Milestone 1 - Survey & Architecture
- **Created**: 2024-01-XX

## Description

Survey and evaluate open-weight LLMs:
- 8-14B and 30B quantized options for RTX 4080 (Q4-Q6 variants)
- Small models for GTX 1050 (family agent)
- Evaluate coding/research capabilities for work agent
- Evaluate instruction-following for family agent

## Acceptance Criteria

- [x] Model comparison matrix created
- [x] 4080 model candidates identified (70B quantized, 33B alternatives)
- [x] 1050 model candidates identified (3.8B, 1.5B, 1.1B options)
- [x] Evaluation criteria documented
- [x] Recommendations documented

## Technical Details

Models to evaluate:
- 4080: Llama 3 8B/70B, Mistral 7B, Qwen, etc.
- 1050: TinyLlama, Phi-2, smaller quantized models
- Quantization: Q4, Q5, Q6, Q8
- Function calling support required
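
When comparing the quantization levels above, a rough memory estimate helps decide which candidates are worth testing on each card. The sketch below is illustrative only: the bits-per-weight figures, the VRAM budgets for the 4080 and 1050, the fixed overhead allowance, and the helper names `weight_gib` and `fits` are all assumptions, not values taken from the survey.

```python
# Rough VRAM estimate for quantized model weights (illustrative sketch only).

# Assumed effective bits per weight; real quantized files add per-block
# scales and metadata, so actual sizes vary by format and tool.
BITS_PER_WEIGHT = {"Q4": 4.5, "Q5": 5.5, "Q6": 6.5, "Q8": 8.5}

# Assumed VRAM budgets in GiB; adjust to the actual cards.
VRAM_GIB = {"4080": 16.0, "1050": 2.0}


def weight_gib(params_billion: float, quant: str) -> float:
    """Approximate size of the quantized weights in GiB."""
    total_bytes = params_billion * 1e9 * BITS_PER_WEIGHT[quant] / 8
    return total_bytes / 2**30


def fits(params_billion: float, quant: str, vram_gib: float,
         overhead_gib: float = 1.5) -> bool:
    """True if weights plus an assumed allowance for KV cache and
    runtime buffers stay within the given VRAM budget."""
    return weight_gib(params_billion, quant) + overhead_gib <= vram_gib


if __name__ == "__main__":
    for name, params in [("8B", 8.0), ("70B", 70.0)]:
        for quant in BITS_PER_WEIGHT:
            size = weight_gib(params, quant)
            ok = fits(params, quant, VRAM_GIB["4080"])
            verdict = "fits" if ok else "needs offload"
            print(f"{name} {quant}: ~{size:.1f} GiB weights, {verdict} on 4080")
```

A candidate whose estimate exceeds the VRAM budget is not necessarily ruled out, since many runtimes can offload some layers to system RAM at a cost in throughput. The estimate is only a first filter before measuring real load sizes and token rates.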

## Dependencies

- TICKET-004 (architecture) - helpful context

## Related Files

- `docs/LLM_MODEL_SURVEY.md` (survey document, created)
- `ARCHITECTURE.md`

## Notes

Can start in parallel with the wake-word and client work. The high-level architecture doc (TICKET-004) provides useful context.

## Progress Log

- 2024-01-XX - Survey document created with comprehensive model analysis
- 2024-01-XX - Recommendations finalized:
  - Work Agent (4080): Llama 3.1 70B Q4 (primary), DeepSeek Coder 33B Q4 (alternative)
  - Family Agent (1050): Phi-3 Mini 3.8B Q4 (primary), Qwen2.5 1.5B Q4 (alternative)