ilia/atlas

ilia 4b9ffb5ddf docs: Update architecture and add new documentation for LLM and MCP

- Enhanced `ARCHITECTURE.md` with details on LLM models for work (Llama 3.1 70B Q4) and family agents (Phi-3 Mini 3.8B Q4).
- Introduced new documents:
  - `ASR_EVALUATION.md` for ASR engine evaluation and selection.
  - `HARDWARE.md` outlining hardware requirements and purchase plans.
  - `IMPLEMENTATION_GUIDE.md` for Milestone 2 implementation steps.
  - `LLM_CAPACITY.md` assessing VRAM and context window limits.
  - `LLM_MODEL_SURVEY.md` surveying open-weight LLM models.
  - `LLM_USAGE_AND_COSTS.md` detailing LLM usage and operational costs.
  - `MCP_ARCHITECTURE.md` describing the Model Context Protocol architecture.
  - `MCP_IMPLEMENTATION_SUMMARY.md` summarizing MCP implementation status.

These updates provide comprehensive guidance for the next phases of development and ensure clarity in project documentation.

2026-01-05 23:44:16 -05:00

1.6 KiB


Ticket: Survey Candidate Open-Weight Models

Ticket Information

ID: TICKET-017
Title: Survey Candidate Open-Weight Models
Type: Research
Priority: High
Status: In Progress
Track: LLM Infra
Milestone: Milestone 1 - Survey & Architecture
Created: 2024-01-XX

Description

Survey and evaluate open-weight LLM models:

8-14B and 30B quantized options for RTX 4080 (Q4-Q6 variants)
Small models for GTX 1050 (family agent)
Evaluate coding/research capabilities for work agent
Evaluate instruction-following for family agent
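
As a rough first filter for the quantized options listed above, the size of the weights can be estimated as parameter count times bits per weight. The sketch below is a back-of-envelope estimate, not a measurement. It assumes nominal bit widths (real quantized formats carry extra per-block metadata, so actual files are somewhat larger) and ignores KV cache and runtime overhead. The names `NOMINAL_BITS` and `weight_gib` are illustrative and not part of the project.

```python
# Nominal bits per weight for each quantization level (assumption:
# real quantized formats add per-block metadata, so files are somewhat larger).
NOMINAL_BITS = {"Q4": 4, "Q5": 5, "Q6": 6, "Q8": 8}


def weight_gib(params_billions: float, quant: str) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billions * 1e9 * NOMINAL_BITS[quant] / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    # Parameter counts taken from the size range named in this ticket.
    for size in (8, 14, 30):
        for quant in ("Q4", "Q5", "Q6"):
            print(f"{size}B {quant}: ~{weight_gib(size, quant):.1f} GiB weights")
```

Comparing these figures against the VRAM available on each card gives a quick shortlist. The context window and runtime overhead still need to be budgeted separately.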

Acceptance Criteria

Model comparison matrix created
4080 model candidates identified (70B quantized, 33B alternatives)
1050 model candidates identified (3.8B, 1.5B, 1.1B options)
Evaluation criteria documented
Recommendations documented
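
One possible shape for the comparison matrix is a small list of records rendered as a table. The sketch below is only an illustration: the `Candidate` fields and the `to_markdown` helper are hypothetical, the rows use the model names recorded in this ticket's progress log, and function-calling support is left as "TBD" because the ticket does not record evaluation results.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    name: str
    gpu: str    # target GPU as named in this ticket
    role: str   # "primary" or "alternative"
    quant: str
    function_calling: Optional[bool] = None  # None means not yet evaluated


# Rows taken from the recommendations in this ticket's progress log.
CANDIDATES = [
    Candidate("Llama 3.1 70B", "4080", "primary", "Q4"),
    Candidate("DeepSeek Coder 33B", "4080", "alternative", "Q4"),
    Candidate("Phi-3 Mini 3.8B", "1050", "primary", "Q4"),
    Candidate("Qwen2.5 1.5B", "1050", "alternative", "Q4"),
]


def to_markdown(rows: list[Candidate]) -> str:
    """Render the candidates as a Markdown table."""
    lines = [
        "| Model | GPU | Role | Quant | Function calling |",
        "|---|---|---|---|---|",
    ]
    for r in rows:
        if r.function_calling is None:
            fc = "TBD"
        else:
            fc = "yes" if r.function_calling else "no"
        lines.append(f"| {r.name} | {r.gpu} | {r.role} | {r.quant} | {fc} |")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_markdown(CANDIDATES))
```

Keeping the matrix as data makes it easy to add columns for the evaluation criteria once they are documented.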

Technical Details

Models to evaluate:

4080: Llama 3 8B/70B, Mistral 7B, Qwen, etc.
1050: TinyLlama, Phi-2, smaller quantized models
Quantization: Q4, Q5, Q6, Q8
Function calling support required
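
Because function calling is a hard requirement, one way to screen candidates is to check whether a model's raw reply parses as a well-formed tool call. The sketch below assumes a simple JSON convention of the form `{"name": ..., "arguments": {...}}`. The actual format depends on the model and serving stack, and `TOOLS`, `get_weather`, and `parse_tool_call` are hypothetical names used only for illustration.

```python
import json

# Hypothetical tool registry: tool name -> required argument names.
TOOLS = {"get_weather": {"city"}}


def parse_tool_call(reply: str):
    """Return (name, arguments) if reply is a valid tool call, else None."""
    try:
        call = json.loads(reply)
    except json.JSONDecodeError:
        return None  # reply was free text, not JSON
    if not isinstance(call, dict):
        return None
    name = call.get("name")
    args = call.get("arguments")
    if not isinstance(name, str) or name not in TOOLS:
        return None  # unknown or missing tool name
    if not isinstance(args, dict):
        return None
    if not TOOLS[name].issubset(args):
        return None  # a required argument is missing
    return name, args


if __name__ == "__main__":
    good = '{"name": "get_weather", "arguments": {"city": "Boston"}}'
    print(parse_tool_call(good))
    print(parse_tool_call("Sure, the weather is nice."))
```

Running a fixed set of prompts through each candidate and counting how many replies pass such a check would give a simple, comparable score per model.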

Dependencies

TICKET-004 (architecture) - helpful context

Related Files

docs/LLM_MODEL_SURVEY.md (to be created)
ARCHITECTURE.md

Notes

Can start in parallel with the wake-word and client work. Depends on the high-level architecture doc.

Progress Log

2024-01-XX - Survey document created with comprehensive model analysis
2024-01-XX - Recommendations finalized:
- Work Agent (4080): Llama 3.1 70B Q4 (primary), DeepSeek Coder 33B Q4 (alternative)
- Family Agent (1050): Phi-3 Mini 3.8B Q4 (primary), Qwen2.5 1.5B Q4 (alternative)
