atlas/tickets/backlog/TICKET-018_llm-capacity-assessment.md

# Ticket: LLM Capacity Assessment
## Ticket Information
- **ID**: TICKET-018
- **Title**: LLM Capacity Assessment
- **Type**: Research
- **Priority**: High
- **Status**: Backlog
- **Track**: LLM Infra
- **Milestone**: Milestone 1 - Survey & Architecture
- **Created**: 2024-01-XX
## Description
Determine the maximum context window and model parameter count that the available GPUs can support:
- Assess 4080 capacity with 16GB VRAM (13B-24B models expected to fit comfortably with quantization)
- Determine the max context window for the 4080
- Assess 1050 capacity (smaller models, limited context)
- Document memory requirements
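
As a starting point for the parameter-size question, a rough weights-only estimate is parameter count times bits per weight. The sketch below is illustrative, not a measurement: the bits-per-weight figures and the 2 GiB headroom reserved for KV cache and runtime overhead are assumptions, and `weight_gib` and `fits` are hypothetical helper names.

```python
GIB = 2**30

# Assumed effective bits per weight, including quantization scales.
# Illustrative values only; real model files vary by quantization scheme.
BITS_PER_WEIGHT = {"Q4": 4.5, "Q5": 5.5, "FP16": 16.0}


def weight_gib(params_billions: float, quant: str) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    bits = BITS_PER_WEIGHT[quant]
    return params_billions * 1e9 * bits / 8 / GIB


def fits(params_billions: float, quant: str, vram_gib: float,
         reserve_gib: float = 2.0) -> bool:
    """True if the weights fit after reserving headroom (assumed 2 GiB)."""
    return weight_gib(params_billions, quant) <= vram_gib - reserve_gib


if __name__ == "__main__":
    for size in (7, 13, 24):
        for quant in ("Q4", "Q5"):
            verdict = "fits" if fits(size, quant, vram_gib=16) else "too large"
            print(f"{size}B {quant}: {weight_gib(size, quant):.1f} GiB -> {verdict}")
```

Estimates like this only narrow the candidate list; the documented limits should come from loading real models on each card.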
## Acceptance Criteria
- [ ] VRAM capacity documented for 4080
- [ ] VRAM capacity documented for 1050
- [ ] Max context window determined
- [ ] Model size limits documented
- [ ] Memory requirements recorded in architecture docs
## Technical Details
Assessment should cover:
- 4080: 16GB VRAM, Q4/Q5 quantization
- 1050: 4GB VRAM, very small models
- Context window: 4K, 8K, 16K, 32K options
- Batch size and concurrency limits
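
The context window and concurrency items interact through the KV cache, which grows linearly with both context length and the number of concurrent sequences. The following sketch uses the standard per-token estimate (keys plus values, per layer, per KV head). The model shape in `EXAMPLE` is hypothetical and chosen only for illustration, and an FP16 cache (2 bytes per element) is assumed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelShape:
    n_layers: int      # number of transformer layers
    n_kv_heads: int    # number of key/value heads (fewer than query heads under GQA)
    head_dim: int      # dimension of each attention head


def kv_cache_bytes(shape: ModelShape, context_tokens: int,
                   n_sequences: int = 1, bytes_per_elem: int = 2) -> int:
    """KV cache size in bytes: 2 tensors (K and V) per layer, per token."""
    per_token = 2 * shape.n_layers * shape.n_kv_heads * shape.head_dim * bytes_per_elem
    return per_token * context_tokens * n_sequences


# Hypothetical shape, not taken from any specific model.
EXAMPLE = ModelShape(n_layers=40, n_kv_heads=8, head_dim=128)

if __name__ == "__main__":
    for ctx in (4096, 8192, 16384, 32768):
        gib = kv_cache_bytes(EXAMPLE, ctx) / 2**30
        print(f"{ctx:>6} tokens: {gib:.3f} GiB per sequence")
```

For this assumed shape, a single 32K sequence needs about 5 GiB of cache, the same as four concurrent 8K sequences, so context length and concurrency limits should be assessed together rather than separately.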
## Dependencies
- TICKET-017 (model survey)
## Related Files
- `docs/LLM_CAPACITY.md` (to be created)
- `ARCHITECTURE.md`
## Notes
Critical for model selection. Should be done early.