# Setting Up Llama Models with AirLLM

This guide will help you configure nanobot to use Llama models with AirLLM.

## Quick Setup

Run the setup script:

```bash
python3 setup_llama_airllm.py
```

The script will:

1. Create/update your `~/.nanobot/config.json` file
2. Configure Llama-3.2-3B-Instruct as the default model
3. Guide you through getting a Hugging Face token

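Under the hood, a setup script like this typically just merges the provider settings into the JSON config. The sketch below is a hypothetical reconstruction based on the config layout described under Manual Setup, not the actual source of `setup_llama_airllm.py`; the `write_airllm_config` helper is our own name.

```python
import json
from pathlib import Path

MODEL = "meta-llama/Llama-3.2-3B-Instruct"

def write_airllm_config(config_path: Path, hf_token: str) -> None:
    """Merge AirLLM provider settings and a default model into the config.

    Hypothetical sketch of what setup_llama_airllm.py might do; existing
    keys in the file are preserved, the airllm provider entry is replaced.
    """
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text())

    # Provider block: model name goes in apiKey, HF token in extraHeaders
    config.setdefault("providers", {})["airllm"] = {
        "apiKey": MODEL,
        "extraHeaders": {"hf_token": hf_token},
    }
    # Make the Llama model the default for agents
    config.setdefault("agents", {}).setdefault("defaults", {})["model"] = MODEL

    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
```

Called as `write_airllm_config(Path.home() / ".nanobot" / "config.json", "hf_...")`, this would produce a file like the one shown in Step 3 below.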
## Manual Setup

### Step 1: Get a Hugging Face Token

Llama models are "gated" (require license acceptance), so you need a Hugging Face token:

1. Go to: https://huggingface.co/settings/tokens
2. Click **"New token"**
3. Give it a name (e.g., "nanobot")
4. Select **"Read"** permission
5. Click **"Generate token"**
6. **Copy the token** (starts with `hf_...`)

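A quick shape check on the copied token can catch truncation or copy/paste mistakes before the token ends up in the config. This helper is ours (not part of nanobot or Hugging Face tooling) and only checks the `hf_` prefix noted above; it does not verify the token against the HF API.

```python
def looks_like_hf_token(token: str) -> bool:
    """Cheap sanity check: HF tokens start with "hf_" and contain no
    whitespace. Does NOT confirm the token is actually valid."""
    token = token.strip()
    return (
        token.startswith("hf_")
        and len(token) > len("hf_")
        and not any(c.isspace() for c in token)
    )
```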
### Step 2: Accept Llama License

1. Go to: https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct
2. Click **"Agree and access repository"**
3. Accept the license terms

### Step 3: Configure nanobot

Edit `~/.nanobot/config.json`:

```json
{
  "providers": {
    "airllm": {
      "apiKey": "meta-llama/Llama-3.2-3B-Instruct",
      "extraHeaders": {
        "hf_token": "hf_YOUR_TOKEN_HERE"
      }
    }
  },
  "agents": {
    "defaults": {
      "model": "meta-llama/Llama-3.2-3B-Instruct"
    }
  }
}
```

Replace `hf_YOUR_TOKEN_HERE` with your actual Hugging Face token.

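Before running nanobot, you can sanity-check the edited file. A sketch of such a check, assuming exactly the layout shown above (the `check_nanobot_config` name is ours, not a nanobot API):

```python
import json
from pathlib import Path

def check_nanobot_config(path: Path) -> list:
    """Return a list of problems found in a nanobot config file;
    an empty list means the basic structure above looks right."""
    try:
        config = json.loads(path.read_text())
    except FileNotFoundError:
        return [f"{path} does not exist"]
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]

    problems = []
    airllm = config.get("providers", {}).get("airllm", {})
    token = airllm.get("extraHeaders", {}).get("hf_token", "")
    if not token.startswith("hf_"):
        problems.append("extraHeaders.hf_token is missing or malformed")
    elif token == "hf_YOUR_TOKEN_HERE":
        problems.append("hf_token is still the placeholder value")
    if not config.get("agents", {}).get("defaults", {}).get("model"):
        problems.append("agents.defaults.model is not set")
    return problems
```

Run it against `Path.home() / ".nanobot" / "config.json"` and fix anything it reports before moving on to Step 4.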
### Step 4: Test It

```bash
nanobot agent -m "Hello, what is 2+5?"
```

## Recommended Llama Models
### Small Models (Faster, Less Memory)

- **Llama-3.2-3B-Instruct** (recommended: fast, minimal memory)
  - Model: `meta-llama/Llama-3.2-3B-Instruct`
  - Best for limited GPU memory
- **Llama-3.1-8B-Instruct**
  - Model: `meta-llama/Llama-3.1-8B-Instruct`
  - Good balance of performance and speed

## Why Llama with AirLLM?

- **Excellent AirLLM Compatibility**: Llama models work well with AirLLM's chunking mechanism
- **Proven Stability**: Llama models have been tested extensively with AirLLM
- **Good Performance**: Llama models deliver excellent output quality while running efficiently under AirLLM

## Troubleshooting
### "Model not found" error

- Make sure you've accepted the Llama license on Hugging Face
- Verify your HF token has read permissions
- Check that the token is correctly set in `extraHeaders.hf_token`

### "Out of memory" error

- Try a smaller model (Llama-3.2-3B-Instruct)
- Use compression: set `apiBase` to `"4bit"` or `"8bit"` in the airllm config

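Assuming the `apiBase` convention just described, the `airllm` provider block with 4-bit compression enabled would look like this (the token value is a placeholder):

```json
{
  "providers": {
    "airllm": {
      "apiKey": "meta-llama/Llama-3.2-3B-Instruct",
      "apiBase": "4bit",
      "extraHeaders": {
        "hf_token": "hf_YOUR_TOKEN_HERE"
      }
    }
  }
}
```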
### Still having issues?

- Check that the config file is valid JSON
- Verify file permissions: `chmod 600 ~/.nanobot/config.json`
- Check the logs for detailed error messages

## Config File Location

- **Path**: `~/.nanobot/config.json`
- **Permissions**: Should be `600` (read/write for owner only)
- **Backup**: Always back up the file before editing!
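
To verify the recommended mode from a script (portable across the GNU and BSD variants of `stat`; the `config_mode` helper name is ours):

```python
import os
import stat
from pathlib import Path

def config_mode(path: Path) -> str:
    """Return a file's permission bits as an octal string, e.g. "600"."""
    return format(stat.S_IMODE(os.stat(path).st_mode), "o")
```

After the `chmod` above, `config_mode(Path.home() / ".nanobot" / "config.json")` should return `"600"`.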