# Ticket: LLM Capacity Assessment

## Ticket Information

- **ID**: TICKET-018
- **Title**: LLM Capacity Assessment
- **Type**: Research
- **Priority**: High
- **Status**: Backlog
- **Track**: LLM Infra
- **Milestone**: Milestone 1 - Survey & Architecture
- **Created**: 2024-01-XX

## Description

Determine maximum context and parameter size:

- Assess 16GB VRAM capacity (13B-24B comfortable with quantization)
- Determine max context window for 4080
- Assess 1050 capacity (smaller models, limited context)
- Document memory requirements

## Acceptance Criteria

- [ ] VRAM capacity documented for 4080
- [ ] VRAM capacity documented for 1050
- [ ] Max context window determined
- [ ] Model size limits documented
- [ ] Memory requirements in architecture docs

## Technical Details

Assessment should cover:

- 4080: 16GB VRAM, Q4/Q5 quantization
- 1050: 4GB VRAM, very small models
- Context window: 4K, 8K, 16K, 32K options
- Batch size and concurrency limits

## Dependencies

- TICKET-017 (model survey)

## Related Files

- `docs/LLM_CAPACITY.md` (to be created)
- `ARCHITECTURE.md`

## Notes

Critical for model selection. Should be done early.
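
As a starting point for the assessment, the memory budget can be approximated as quantized weights plus KV cache plus a fixed runtime overhead. The sketch below is a rough estimator, not a measurement. Everything in it is an assumption chosen for illustration: the `ModelShape` values (a hypothetical 13B model with 40 layers, 40 KV heads, and a head dimension of 128), the effective 4.5 bits per weight for Q4-style quantization, the 16-bit KV cache, and the 1 GiB overhead. Real figures depend on the runtime and the models picked in TICKET-017 and should be confirmed on the actual cards.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

GIB = 1024 ** 3


@dataclass(frozen=True)
class ModelShape:
    """Hypothetical model dimensions; replace with values from the model survey."""
    params_billions: float
    n_layers: int
    n_kv_heads: int
    head_dim: int


def weight_bytes(shape: ModelShape, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in bytes."""
    return shape.params_billions * 1e9 * bits_per_weight / 8


def kv_cache_bytes(shape: ModelShape, context_tokens: int,
                   batch_size: int = 1, bytes_per_element: int = 2) -> int:
    """KV cache size: one K and one V vector per layer, per KV head, per token."""
    return (2 * shape.n_layers * shape.n_kv_heads * shape.head_dim
            * context_tokens * batch_size * bytes_per_element)


def max_context(shape: ModelShape, vram_gib: float, bits_per_weight: float,
                options: Sequence[int] = (4096, 8192, 16384, 32768),
                batch_size: int = 1, overhead_gib: float = 1.0) -> Optional[int]:
    """Largest context option whose estimated footprint fits in VRAM, else None."""
    budget = vram_gib * GIB
    fixed = weight_bytes(shape, bits_per_weight) + overhead_gib * GIB
    best = None
    for ctx in sorted(options):
        if fixed + kv_cache_bytes(shape, ctx, batch_size) <= budget:
            best = ctx
    return best


# Assumed shape for illustration only (not taken from the ticket).
example_13b = ModelShape(params_billions=13, n_layers=40, n_kv_heads=40, head_dim=128)

for name, vram in (("4080", 16), ("1050", 4)):
    ctx = max_context(example_13b, vram_gib=vram, bits_per_weight=4.5)
    print(f"{name} ({vram} GiB): largest fitting context option = {ctx}")
```

Raising `batch_size` models concurrent requests, since each sequence needs its own KV cache. Models that use grouped-query attention have fewer KV heads than attention heads, which shrinks the cache considerably, so `n_kv_heads` should be taken from each candidate model's configuration rather than assumed.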