Ticket: Stand Up 4080 LLM Service
Ticket Information
- ID: TICKET-021
- Title: Stand Up 4080 LLM Service
- Type: Feature
- Priority: High
- Status: Backlog
- Track: LLM Infra
- Milestone: Milestone 2 - Voice Chat MVP
- Created: 2024-01-XX
Description
Set up an LLM service on the 4080:
- Use an Ollama-, vLLM-, or llama.cpp-based server
- Expose an HTTP/gRPC API
- Support function calling / tool use
- Load the selected work-agent model
- Configure for optimal performance
Acceptance Criteria
- LLM server running on the 4080
- HTTP/gRPC endpoint exposed
- Work-agent model loaded
- Function-calling support working
- Basic health check endpoint
- Performance acceptable for interactive voice chat
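The health-check criterion can start as a trivial reachability probe. A minimal sketch (the base URL is a placeholder; 11434 is only correct if Ollama is the chosen server):

```python
import urllib.request

# Placeholder base URL; adjust to wherever the LLM server actually listens.
LLM_BASE_URL = "http://localhost:11434"

def check_health(base_url: str = LLM_BASE_URL, timeout: float = 2.0) -> bool:
    """Return True if the LLM server answers with HTTP 200 at its base URL."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Connection refused, DNS failure, timeout, etc.
        return False
```

A richer check could later query a model-info endpoint, but reachability is enough to satisfy the criterion above.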
Technical Details
Server options:
- Ollama: Easy setup, good tool support
- vLLM: High throughput, batching
- llama.cpp: Lightweight, efficient
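If Ollama is chosen, initial setup is roughly the following (the model name is a placeholder; the real work-agent model comes from TICKET-019):

```shell
# Install Ollama (Linux installer; see ollama.com for other platforms)
curl -fsSL https://ollama.com/install.sh | sh

# Pull a candidate model (placeholder tag; replace per TICKET-019)
ollama pull llama3.1:8b

# Start the server; listens on localhost:11434 and exposes an
# OpenAI-compatible HTTP API under /v1 by default
ollama serve
```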
Requirements:
- HTTP API for simple requests
- gRPC for streaming (optional)
- Function calling format (OpenAI-compatible)
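The OpenAI-compatible function-calling format referenced above looks roughly like this. Both the model name and the example tool are illustrative placeholders, not part of the ticket:

```python
import json

def build_chat_request(user_message: str) -> dict:
    """Build an OpenAI-compatible chat request advertising one tool.

    The model name and the get_weather tool are hypothetical examples;
    the real work-agent model is selected in TICKET-019.
    """
    return {
        "model": "work-agent",  # placeholder model name
        "messages": [{"role": "user", "content": user_message}],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_weather",  # hypothetical example tool
                    "description": "Get the current weather for a city",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
    }

if __name__ == "__main__":
    # POST this JSON to the server's /v1/chat/completions endpoint.
    print(json.dumps(build_chat_request("What's the weather in Oslo?"), indent=2))
```

A server that accepts this payload and can return `tool_calls` in its response satisfies the function-calling requirement regardless of which backend (Ollama, vLLM, llama.cpp) is chosen.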
Dependencies
- TICKET-019 (work agent model selection)
- TICKET-004 (architecture)
Related Files
- home-voice-agent/llm-servers/4080/ (to be created)
Notes
Independent of MCP/tool design; only a common API is needed. Can proceed once model selection (TICKET-019) is done.