# Ticket: Stand Up 1050 LLM Service
## Ticket Information
- **ID**: TICKET-022
- **Title**: Stand Up 1050 LLM Service
- **Type**: Feature
- **Priority**: High
- **Status**: Backlog
- **Track**: LLM Infra
- **Milestone**: Milestone 2 - Voice Chat MVP
- **Created**: 2024-01-XX
## Description
Set up an LLM service on the 1050 machine:
- Smaller model, lower concurrency
- Persistent process managed via systemd/docker
- Expose HTTP/gRPC API
- Support function-calling/tool use
- Load selected family agent model
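Both llama.cpp's `llama-server` and Ollama expose an OpenAI-compatible `/v1/chat/completions` endpoint, so function-calling requests can follow the OpenAI tool-call shape. A minimal sketch of such a request payload is below; the model tag `family-agent` and the `get_weather` tool are placeholders, not decided names:

```python
import json

def build_chat_request(user_text: str) -> dict:
    """Build an OpenAI-style chat request with one tool definition.

    Assumes the 1050 server exposes an OpenAI-compatible
    /v1/chat/completions endpoint. Model tag and tool are placeholders.
    """
    return {
        "model": "family-agent",  # placeholder model tag
        "messages": [{"role": "user", "content": user_text}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool
                "description": "Look up the current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
    }

# Serialize for an HTTP POST body.
payload = json.dumps(build_chat_request("Will it rain today?"))
```

If the server honors the tool definition, a response may contain a `tool_calls` entry instead of plain text, which the agent runtime then executes and feeds back.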
## Acceptance Criteria
- [ ] LLM server running on 1050
- [ ] HTTP/gRPC endpoint exposed
- [ ] Family agent model loaded
- [ ] Function-calling support working
- [ ] Systemd/docker service configured
- [ ] Auto-restart on failure
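The "server running" and "endpoint exposed" criteria can be verified with a simple health probe. A sketch, assuming the server exposes a `/health` route (llama.cpp's `llama-server` provides one; the base URL is an assumption):

```python
import urllib.request
import urllib.error

def llm_healthy(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if the LLM server answers its health endpoint.

    Assumes a /health route as provided by llama-server; adjust the
    path if Ollama or another backend is used.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout: server is down.
        return False
```

This could run as a cron or systemd timer job to alert when the always-on agent goes down.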
## Technical Details
Server setup:
- Use llama.cpp or Ollama (lightweight)
- Systemd service for auto-start
- Docker option for isolation
- Lower concurrency (1-2 requests)
- Optimized for latency
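For the systemd route, a unit along these lines would cover auto-start and auto-restart. This is a sketch only: the binary path, model path, user, and flags are assumptions to be adjusted once TICKET-020 picks the model (`--parallel 2` caps llama-server at two concurrent request slots, matching the low-concurrency goal):

```ini
# /etc/systemd/system/llm-1050.service (sketch; paths/flags are placeholders)
[Unit]
Description=Family agent LLM server (1050)
After=network-online.target
Wants=network-online.target

[Service]
ExecStart=/usr/local/bin/llama-server \
    -m /opt/models/family-agent.gguf \
    --host 0.0.0.0 --port 8080 \
    --parallel 2
Restart=on-failure
RestartSec=5
User=llm

[Install]
WantedBy=multi-user.target
```

Enable with `systemctl enable --now llm-1050`; the Docker option would instead rely on `--restart unless-stopped` for the same failure-recovery behavior.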
## Dependencies
- TICKET-020 (family agent model selection)
- TICKET-004 (architecture)
## Related Files
- `home-voice-agent/llm-servers/1050/` (to be created)
## Notes
Optimized for an always-on family agent; lower resource usage than the 4080 server.