# Setting Up Llama Models with AirLLM
This guide will help you configure nanobot to use Llama models with AirLLM.
## Quick Setup
Run the setup script:
```bash
python3 setup_llama_airllm.py
```
The script will:
1. Create/update your `~/.nanobot/config.json` file
2. Configure Llama-3.2-3B-Instruct as the default model
3. Guide you through getting a Hugging Face token
## Manual Setup
### Step 1: Get a Hugging Face Token
Llama models are "gated" (require license acceptance), so you need a Hugging Face token:
1. Go to: https://huggingface.co/settings/tokens
2. Click **"New token"**
3. Give it a name (e.g., "nanobot")
4. Select **"Read"** permission
5. Click **"Generate token"**
6. **Copy the token** (starts with `hf_...`)
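Before using the token, you can sanity-check its shape. This is a minimal sketch: the `hf_` prefix is the documented form of Hugging Face user access tokens, but the rest of the pattern is an assumption — the only authoritative check is whether the Hub actually accepts the token.

```python
import re

def looks_like_hf_token(token: str) -> bool:
    """Format-only check: HF user access tokens start with 'hf_'.
    The character-class after the prefix is an assumption, not a spec."""
    return bool(re.fullmatch(r"hf_[A-Za-z0-9]+", token))

print(looks_like_hf_token("hf_abc123XYZ"))  # True
print(looks_like_hf_token("not-a-token"))   # False
```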
### Step 2: Accept Llama License
1. Go to: https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct
2. Click **"Agree and access repository"**
3. Accept the license terms
### Step 3: Configure nanobot
Edit `~/.nanobot/config.json`:
```json
{
  "providers": {
    "airllm": {
      "apiKey": "meta-llama/Llama-3.2-3B-Instruct",
      "extraHeaders": {
        "hf_token": "hf_YOUR_TOKEN_HERE"
      }
    }
  },
  "agents": {
    "defaults": {
      "model": "meta-llama/Llama-3.2-3B-Instruct"
    }
  }
}
```
Replace `hf_YOUR_TOKEN_HERE` with your actual Hugging Face token.
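After editing, a quick check can catch the two most common mistakes: invalid JSON and a leftover placeholder token. This is a minimal sketch assuming the config layout shown above; `check_airllm_config` is a hypothetical helper, not part of nanobot.

```python
import json
import os

def check_airllm_config(cfg: dict) -> str:
    """Validate the airllm section of a nanobot config (layout as in this guide).
    Returns the default model name on success."""
    token = cfg["providers"]["airllm"]["extraHeaders"]["hf_token"]
    if token == "hf_YOUR_TOKEN_HERE" or not token.startswith("hf_"):
        raise ValueError("replace hf_YOUR_TOKEN_HERE with your real Hugging Face token")
    return cfg["agents"]["defaults"]["model"]

# Usage: load your real config and check it
# with open(os.path.expanduser("~/.nanobot/config.json")) as f:
#     print(check_airllm_config(json.load(f)))
```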
### Step 4: Test It
```bash
nanobot agent -m "Hello, what is 2+5?"
```
## Recommended Llama Models
### Small Models (Faster, Less Memory)
- **Llama-3.2-3B-Instruct** (Recommended - fast, minimal memory)
- Model: `meta-llama/Llama-3.2-3B-Instruct`
- Best for limited GPU memory
- **Llama-3.1-8B-Instruct**
  - Model: `meta-llama/Llama-3.1-8B-Instruct`
  - Good balance of quality and speed; needs more memory than the 3B model
## Why Llama with AirLLM?
- **Excellent AirLLM Compatibility**: Llama models work very well with AirLLM's layer-by-layer loading mechanism
- **Proven Stability**: the Llama family has been tested extensively with AirLLM
- **Good Performance**: strong output quality while running efficiently under AirLLM
## Troubleshooting
### "Model not found" error
- Make sure you've accepted the Llama license on Hugging Face
- Verify your HF token has read permissions
- Check that the token is correctly set in `extraHeaders.hf_token`
### "Out of memory" error
- Try a smaller model (Llama-3.2-3B-Instruct)
- Use compression: set `apiBase` to `"4bit"` or `"8bit"` in the airllm config
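Following the `apiBase` convention described above, a 4-bit configuration would look like this (token placeholder as before):

```json
{
  "providers": {
    "airllm": {
      "apiKey": "meta-llama/Llama-3.2-3B-Instruct",
      "apiBase": "4bit",
      "extraHeaders": {
        "hf_token": "hf_YOUR_TOKEN_HERE"
      }
    }
  }
}
```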
### Still having issues?
- Check that the config file is valid JSON (e.g. `python3 -m json.tool ~/.nanobot/config.json`)
- Verify file permissions: `chmod 600 ~/.nanobot/config.json`
- Check logs for detailed error messages
## Config File Location
- **Path**: `~/.nanobot/config.json`
- **Permissions**: Should be `600` (read/write for owner only)
- **Backup**: Always back up before editing (e.g. `cp ~/.nanobot/config.json ~/.nanobot/config.json.bak`)