- Added .cursorrules for project guidelines and context
- Created README.md for project overview and goals
- Established ARCHITECTURE.md for architectural documentation
- Set up tickets directory with initial ticket management files
- Included .gitignore to manage ignored files and directories

This commit lays the foundation for the Atlas project, ensuring a clear structure for development and collaboration.
Ticket: LLM Capacity Assessment
Ticket Information
- ID: TICKET-018
- Title: LLM Capacity Assessment
- Type: Research
- Priority: High
- Status: Backlog
- Track: LLM Infra
- Milestone: Milestone 1 - Survey & Architecture
- Created: 2024-01-XX
Description
Determine maximum context and parameter size:
- Assess 4080 capacity (16GB VRAM; 13B-24B models should fit comfortably with quantization)
- Determine max context window for the 4080
- Assess 1050 capacity (4GB VRAM; smaller models, limited context)
- Document memory requirements
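As a starting point for the assessment, the weight footprint can be estimated from parameter count and bits per weight. This is a rough sketch, not a measurement: the effective bits per weight (about 4.5 for Q4, about 5.5 for Q5) and the `reserve_gib` headroom for KV cache and runtime overhead are assumptions, and the function names are hypothetical.

```python
GIB = 1024 ** 3


def weight_bytes(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in bytes."""
    return params_billion * 1e9 * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float,
         vram_gib: float, reserve_gib: float = 2.0) -> bool:
    """True if the weights fit in VRAM after reserving headroom.

    reserve_gib is an assumed allowance for KV cache, activations,
    and runtime overhead; it should be replaced by measured values.
    """
    budget_gib = vram_gib - reserve_gib
    return weight_bytes(params_billion, bits_per_weight) / GIB <= budget_gib


if __name__ == "__main__":
    # Assumed effective bits per weight for Q4 and Q5 style quantization.
    for params in (7, 13, 24):
        for label, bits in (("Q4", 4.5), ("Q5", 5.5)):
            size_gib = weight_bytes(params, bits) / GIB
            ok = fits(params, bits, vram_gib=16)
            print(f"{params}B {label}: {size_gib:.2f} GiB weights, fits 16GB: {ok}")
```

Under these assumptions a 24B model fits at Q4 but not at Q5 on 16GB, which is the kind of boundary the assessment should confirm with real measurements.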
Acceptance Criteria
- VRAM capacity documented for 4080
- VRAM capacity documented for 1050
- Max context window determined
- Model size limits documented
- Memory requirements in architecture docs
Technical Details
Assessment should cover:
- 4080: 16GB VRAM, Q4/Q5 quantization
- 1050: 4GB VRAM, very small models
- Context window: 4K, 8K, 16K, 32K options
- Batch size and concurrency limits
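The context window and concurrency items above interact through the KV cache, which grows linearly with both. A minimal sketch of that estimate follows, assuming an fp16 cache and a hypothetical model shape (40 layers, 40 KV heads, head dimension 128); real values should come from the candidate models in the survey, and the function names are illustrative.

```python
GIB = 1024 ** 3


def kv_bytes_per_token(n_layers: int, n_kv_heads: int, head_dim: int,
                       bytes_per_elem: int = 2) -> int:
    """Bytes of KV cache per token: keys and values for every layer."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem


def kv_cache_bytes(context_len: int, n_layers: int, n_kv_heads: int,
                   head_dim: int, bytes_per_elem: int = 2,
                   concurrency: int = 1) -> int:
    """Total KV cache for `concurrency` sequences of `context_len` tokens."""
    per_token = kv_bytes_per_token(n_layers, n_kv_heads, head_dim, bytes_per_elem)
    return per_token * context_len * concurrency


def max_context(budget_bytes: int, n_layers: int, n_kv_heads: int,
                head_dim: int, bytes_per_elem: int = 2,
                concurrency: int = 1) -> int:
    """Largest context length whose KV cache fits in the given budget."""
    per_token = kv_bytes_per_token(n_layers, n_kv_heads, head_dim, bytes_per_elem)
    return budget_bytes // (per_token * concurrency)


if __name__ == "__main__":
    shape = dict(n_layers=40, n_kv_heads=40, head_dim=128)  # assumed shape
    for ctx in (4096, 8192, 16384, 32768):
        print(f"{ctx} tokens: {kv_cache_bytes(ctx, **shape) / GIB:.3f} GiB")
    # Example: VRAM left for the cache after weights is assumed to be 8 GiB.
    print("max context in 8 GiB:", max_context(8 * GIB, **shape))
```

Models that use grouped-query attention have fewer KV heads and a correspondingly smaller cache, so the per-model shape matters as much as the parameter count.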
Dependencies
- TICKET-017 (model survey)
Related Files
- docs/LLM_CAPACITY.md (to be created)
- ARCHITECTURE.md
Notes
Critical for model selection. Should be done early.