# 1050 LLM Server (Family Agent)

LLM server for the family agent, running Phi-3 Mini 3.8B (Q4) on a GTX 1050.

## Setup

### Using Ollama (Recommended)

```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Download the model (the default phi3:mini tag is Q4-quantized)
ollama pull phi3:mini

# Start the server, listening on all interfaces
# (ollama serve has no --host flag; the bind address is set via OLLAMA_HOST)
OLLAMA_HOST=0.0.0.0 ollama serve
# Runs on http://<1050-ip>:11434
```

## Configuration

- **Model**: Phi-3 Mini 3.8B Q4
- **Context Window**: 8K tokens (practical limit)
- **VRAM Usage**: ~2.5GB
- **Concurrency**: 1-2 requests max

## API

Ollama's native chat endpoint is `/api/chat` (an OpenAI-compatible endpoint is also available at `/v1/chat/completions`):

```bash
curl http://<1050-ip>:11434/api/chat -d '{
  "model": "phi3:mini",
  "messages": [
    {"role": "user", "content": "Hello"}
  ],
  "stream": false
}'
```

## Systemd Service

See `ollama-1050.service` for the systemd configuration.
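For reference, a minimal sketch of what `ollama-1050.service` might look like — this assumes the standard installer layout (`/usr/local/bin/ollama` and a dedicated `ollama` user), which may differ from the actual unit file in this repo:

```ini
[Unit]
Description=Ollama LLM server (family agent, GTX 1050)
After=network-online.target

[Service]
ExecStart=/usr/local/bin/ollama serve
# Bind to all interfaces so other hosts can reach port 11434
Environment=OLLAMA_HOST=0.0.0.0
User=ollama
Restart=always

[Install]
WantedBy=multi-user.target
```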
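With `"stream": false`, `/api/chat` returns a single JSON object containing the assistant message plus timing metadata. A quick way to pull out just the reply is with `jq`; the sketch below uses a captured sample response (the `content` value is illustrative) so it runs without a live server:

```bash
# Sample non-streaming /api/chat response (abridged; values illustrative)
RESPONSE='{"model":"phi3:mini","message":{"role":"assistant","content":"Hello! How can I help?"},"done":true,"eval_count":12}'

# Extract just the assistant reply (requires jq)
echo "$RESPONSE" | jq -r '.message.content'

# Against a live server, pipe curl straight into jq:
# curl -s http://<1050-ip>:11434/api/chat -d '{...}' | jq -r '.message.content'
```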