atlas/tickets/backlog/TICKET-019_select-work-agent-model.md
ilia 4b9ffb5ddf docs: Update architecture and add new documentation for LLM and MCP
- Enhanced `ARCHITECTURE.md` with details on LLM models for work (Llama 3.1 70B Q4) and family agents (Phi-3 Mini 3.8B Q4).
- Introduced new documents:
  - `ASR_EVALUATION.md` for ASR engine evaluation and selection.
  - `HARDWARE.md` outlining hardware requirements and purchase plans.
  - `IMPLEMENTATION_GUIDE.md` for Milestone 2 implementation steps.
  - `LLM_CAPACITY.md` assessing VRAM and context window limits.
  - `LLM_MODEL_SURVEY.md` surveying open-weight LLM models.
  - `LLM_USAGE_AND_COSTS.md` detailing LLM usage and operational costs.
  - `MCP_ARCHITECTURE.md` describing the Model Context Protocol architecture.
  - `MCP_IMPLEMENTATION_SUMMARY.md` summarizing MCP implementation status.

These updates provide comprehensive guidance for the next phases of development and ensure clarity in project documentation.
2026-01-05 23:44:16 -05:00


# Ticket: Select Work Agent Model (4080)
## Ticket Information
- **ID**: TICKET-019
- **Title**: Select Work Agent Model for 4080
- **Type**: Research
- **Priority**: High
- **Status**: Done
- **Track**: LLM Infra
- **Milestone**: Milestone 1 - Survey & Architecture
- **Created**: 2024-01-XX
## Description
Select the LLM for the work agent on the 4080:
- Coding/research-optimized model
- Not used by family agent
- Suitable for 16GB VRAM with quantization
- Good function-calling support
## Acceptance Criteria
- [x] Work agent model selected: **Llama 3.1 70B Q4**
- [x] Quantization level chosen: **Q4 (4-bit)**
- [x] Rationale documented (see `docs/MODEL_SELECTION.md`)
- [x] Model file location specified
- [x] Performance characteristics documented
## Technical Details
Selection criteria:
- Coding capabilities (CodeLlama, DeepSeek Coder, etc.)
- Research/analysis capabilities
- Function calling support
- Context window size
- Quantization: Q4-Q6 for 16GB VRAM
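
As a rough sanity check on the quantization criterion, weights-only memory can be estimated from parameter count and bits per weight. The sketch below is an approximation under stated assumptions, not a measurement: the function names and the 2 GB overhead allowance are hypothetical, real quantized files typically use somewhat more than the nominal bit width, and KV cache grows with context length. The parameter counts (70B and 3.8B) are the ones named for the work and family agent models.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights only, in decimal GB.

    params_billion * 1e9 weights * bits / 8 bits per byte / 1e9 bytes per GB
    simplifies to params_billion * bits_per_weight / 8.
    """
    return params_billion * bits_per_weight / 8


def fits_in_vram(
    params_billion: float,
    bits_per_weight: float,
    vram_gb: float = 16.0,      # VRAM budget stated in this ticket
    overhead_gb: float = 2.0,   # ASSUMPTION: allowance for KV cache and runtime buffers
) -> bool:
    """Return True if weights plus the assumed overhead fit within vram_gb."""
    return weight_memory_gb(params_billion, bits_per_weight) + overhead_gb <= vram_gb


if __name__ == "__main__":
    # Nominal bit widths for the Q4-Q6 range; actual formats vary.
    for label, params in [("70B", 70.0), ("3.8B", 3.8)]:
        for bits in (4, 6):
            size = weight_memory_gb(params, bits)
            verdict = "fits" if fits_in_vram(params, bits) else "does not fit"
            print(f"{label} @ {bits}-bit: ~{size:.1f} GB weights, {verdict} in 16 GB")
```

By this estimate a 70B model at a nominal 4 bits needs about 35 GB for weights alone, which is more than 16GB, so running it on a single 16GB card would presumably depend on CPU offload or a similar arrangement. The capacity assessment in TICKET-018 is the place to confirm the real numbers.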
## Dependencies
- TICKET-017 (model survey)
- TICKET-018 (capacity assessment)
## Related Files
- `docs/MODEL_SELECTION.md` (selection rationale)
## Notes
The work agent model is separate from the family agent model and can be selected independently.
## Progress Log
- 2024-01-XX - Model selected: Llama 3.1 70B Q4
- 2024-01-XX - Rationale documented in `docs/MODEL_SELECTION.md`
- 2024-01-XX - Based on TICKET-017 (survey) and TICKET-018 (capacity assessment)