# Setting Up Llama Models with AirLLM

This guide will help you configure nanobot to use Llama models with AirLLM.

## Quick Setup

Run the setup script:

```bash
python3 setup_llama_airllm.py
```

The script will:

1. Create/update your `~/.nanobot/config.json` file
2. Configure Llama-3.2-3B-Instruct as the default model
3. Guide you through getting a Hugging Face token

## Manual Setup

### Step 1: Get a Hugging Face Token

Llama models are "gated" (they require license acceptance), so you need a Hugging Face token:

1. Go to: https://huggingface.co/settings/tokens
2. Click **"New token"**
3. Give it a name (e.g., "nanobot")
4. Select **"Read"** permission
5. Click **"Generate token"**
6. **Copy the token** (starts with `hf_...`)

### Step 2: Accept the Llama License

1. Go to: https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct
2. Click **"Agree and access repository"**
3. Accept the license terms

### Step 3: Configure nanobot

Edit `~/.nanobot/config.json`:

```json
{
  "providers": {
    "airllm": {
      "apiKey": "meta-llama/Llama-3.2-3B-Instruct",
      "extraHeaders": {
        "hf_token": "hf_YOUR_TOKEN_HERE"
      }
    }
  },
  "agents": {
    "defaults": {
      "model": "meta-llama/Llama-3.2-3B-Instruct"
    }
  }
}
```

Replace `hf_YOUR_TOKEN_HERE` with your actual Hugging Face token.

### Step 4: Test It

```bash
nanobot agent -m "Hello, what is 2+5?"
```

## Recommended Llama Models

### Small Models (Faster, Less Memory)

- **Llama-3.2-3B-Instruct** (recommended: fast, minimal memory)
  - Model: `meta-llama/Llama-3.2-3B-Instruct`
  - Best for limited GPU memory
- **Llama-3.1-8B-Instruct**
  - Model: `meta-llama/Llama-3.1-8B-Instruct`
  - Good balance of performance and speed

## Why Llama with AirLLM?
- **Excellent AirLLM Compatibility**: Llama models work very well with AirLLM's chunking mechanism
- **Proven Stability**: Llama models have been tested extensively with AirLLM
- **Good Performance**: Llama models deliver high-quality output while running efficiently under AirLLM

## Troubleshooting

### "Model not found" error

- Make sure you've accepted the Llama license on Hugging Face
- Verify that your HF token has read permission
- Check that the token is correctly set in `extraHeaders.hf_token`

### "Out of memory" error

- Try a smaller model (Llama-3.2-3B-Instruct)
- Use compression: set `apiBase` to `"4bit"` or `"8bit"` in the airllm config

### Still having issues?

- Check that the config file is valid JSON
- Verify file permissions: `chmod 600 ~/.nanobot/config.json`
- Check the logs for detailed error messages

## Config File Location

- **Path**: `~/.nanobot/config.json`
- **Permissions**: should be `600` (read/write for owner only)
- **Backup**: always back up before editing!
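The troubleshooting checks above (valid JSON, `600` permissions, a real token in `extraHeaders.hf_token`) can be automated. Below is a minimal sketch; the `check_config` helper and its messages are illustrative, not part of nanobot itself, and it only assumes the config layout shown in Step 3:

```python
import json
import stat
from pathlib import Path

CONFIG_PATH = Path.home() / ".nanobot" / "config.json"

def check_config(path: Path = CONFIG_PATH) -> list[str]:
    """Return a list of problems found with the nanobot config file."""
    problems = []
    if not path.exists():
        return [f"{path} does not exist"]
    # Permissions should be 600 (read/write for owner only)
    mode = stat.S_IMODE(path.stat().st_mode)
    if mode != 0o600:
        problems.append(f"permissions are {oct(mode)}, expected 0o600")
    # The file must be valid JSON
    try:
        config = json.loads(path.read_text())
    except json.JSONDecodeError as exc:
        problems.append(f"invalid JSON: {exc}")
        return problems
    # The HF token should be set and not left as the placeholder
    token = (
        config.get("providers", {})
        .get("airllm", {})
        .get("extraHeaders", {})
        .get("hf_token", "")
    )
    if not token.startswith("hf_") or token == "hf_YOUR_TOKEN_HERE":
        problems.append("extraHeaders.hf_token looks unset or is still the placeholder")
    return problems

if __name__ == "__main__":
    for problem in check_config() or ["config looks OK"]:
        print(problem)
```

Running it prints each problem found (or "config looks OK"), which covers the three most common failure modes from the Troubleshooting section in one pass.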