# Setting Up Llama Models with AirLLM

This guide will help you configure nanobot to use Llama models with AirLLM.

## Quick Setup

Run the setup script:

```bash
python3 setup_llama_airllm.py
```

The script will:

1. Create/update your `~/.nanobot/config.json` file
2. Configure Llama-3.2-3B-Instruct as the default model
3. Guide you through getting a Hugging Face token

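Under the hood, a setup script like this typically just merges the provider settings into the JSON config. The sketch below is a hypothetical reconstruction based on the config layout described under Manual Setup, not the actual source of `setup_llama_airllm.py`; the `write_airllm_config` helper is our own name.

```python
import json
from pathlib import Path

MODEL = "meta-llama/Llama-3.2-3B-Instruct"

def write_airllm_config(config_path: Path, hf_token: str) -> None:
    """Merge AirLLM provider settings and a default model into the config.

    Hypothetical sketch of what setup_llama_airllm.py might do; existing
    keys in the file are preserved, the airllm provider entry is replaced.
    """
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text())

    # Provider block: model name goes in apiKey, HF token in extraHeaders
    config.setdefault("providers", {})["airllm"] = {
        "apiKey": MODEL,
        "extraHeaders": {"hf_token": hf_token},
    }
    # Make the Llama model the default for agents
    config.setdefault("agents", {}).setdefault("defaults", {})["model"] = MODEL

    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
```

Called as `write_airllm_config(Path.home() / ".nanobot" / "config.json", "hf_...")`, this would produce a file like the one shown in Step 3 below.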
## Manual Setup

### Step 1: Get a Hugging Face Token

Llama models are "gated" (require license acceptance), so you need a Hugging Face token:

1. Go to: https://huggingface.co/settings/tokens
2. Click **"New token"**
3. Give it a name (e.g., "nanobot")
4. Select **"Read"** permission
5. Click **"Generate token"**
6. **Copy the token** (starts with `hf_...`)

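A quick shape check on the copied token can catch truncation or copy/paste mistakes before the token ends up in the config. This helper is ours (not part of nanobot or Hugging Face tooling) and only checks the `hf_` prefix noted above; it does not verify the token against the HF API.

```python
def looks_like_hf_token(token: str) -> bool:
    """Cheap sanity check: HF tokens start with "hf_" and contain no
    whitespace. Does NOT confirm the token is actually valid."""
    token = token.strip()
    return (
        token.startswith("hf_")
        and len(token) > len("hf_")
        and not any(c.isspace() for c in token)
    )
```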
### Step 2: Accept Llama License

1. Go to: https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct
2. Click **"Agree and access repository"**
3. Accept the license terms

### Step 3: Configure nanobot

Edit `~/.nanobot/config.json`:

```json
{
  "providers": {
    "airllm": {
      "apiKey": "meta-llama/Llama-3.2-3B-Instruct",
      "extraHeaders": {
        "hf_token": "hf_YOUR_TOKEN_HERE"
      }
    }
  },
  "agents": {
    "defaults": {
      "model": "meta-llama/Llama-3.2-3B-Instruct"
    }
  }
}
```

Replace `hf_YOUR_TOKEN_HERE` with your actual Hugging Face token.

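Before running nanobot, you can sanity-check the edited file. A sketch of such a check, assuming exactly the layout shown above (the `check_nanobot_config` name is ours, not a nanobot API):

```python
import json
from pathlib import Path

def check_nanobot_config(path: Path) -> list:
    """Return a list of problems found in a nanobot config file;
    an empty list means the basic structure above looks right."""
    try:
        config = json.loads(path.read_text())
    except FileNotFoundError:
        return [f"{path} does not exist"]
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]

    problems = []
    airllm = config.get("providers", {}).get("airllm", {})
    token = airllm.get("extraHeaders", {}).get("hf_token", "")
    if not token.startswith("hf_"):
        problems.append("extraHeaders.hf_token is missing or malformed")
    elif token == "hf_YOUR_TOKEN_HERE":
        problems.append("hf_token is still the placeholder value")
    if not config.get("agents", {}).get("defaults", {}).get("model"):
        problems.append("agents.defaults.model is not set")
    return problems
```

Run it against `Path.home() / ".nanobot" / "config.json"` and fix anything it reports before moving on to Step 4.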
### Step 4: Test It

```bash
nanobot agent -m "Hello, what is 2+5?"
```

## Recommended Llama Models
### Small Models (Faster, Less Memory)

- **Llama-3.2-3B-Instruct** (recommended: fast, minimal memory)
  - Model: `meta-llama/Llama-3.2-3B-Instruct`
  - Best for limited GPU memory
- **Llama-3.1-8B-Instruct**
  - Model: `meta-llama/Llama-3.1-8B-Instruct`
  - Good balance of performance and speed

## Why Llama with AirLLM?

- **Excellent AirLLM Compatibility**: Llama models work well with AirLLM's chunking mechanism
- **Proven Stability**: Llama models have been tested extensively with AirLLM
- **Good Performance**: Llama models deliver excellent output quality while running efficiently under AirLLM

## Troubleshooting
### "Model not found" error

- Make sure you've accepted the Llama license on Hugging Face
- Verify your HF token has read permissions
- Check that the token is correctly set in `extraHeaders.hf_token`

### "Out of memory" error

- Try a smaller model (Llama-3.2-3B-Instruct)
- Use compression: set `apiBase` to `"4bit"` or `"8bit"` in the airllm config

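Assuming the `apiBase` convention just described, the `airllm` provider block with 4-bit compression enabled would look like this (the token value is a placeholder):

```json
{
  "providers": {
    "airllm": {
      "apiKey": "meta-llama/Llama-3.2-3B-Instruct",
      "apiBase": "4bit",
      "extraHeaders": {
        "hf_token": "hf_YOUR_TOKEN_HERE"
      }
    }
  }
}
```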
### Still having issues?

- Check that the config file is valid JSON
- Verify file permissions: `chmod 600 ~/.nanobot/config.json`
- Check the logs for detailed error messages

## Config File Location

- **Path**: `~/.nanobot/config.json`
- **Permissions**: Should be `600` (read/write for owner only)
- **Backup**: Always back up the file before editing!
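
To verify the recommended mode from a script (portable across the GNU and BSD variants of `stat`; the `config_mode` helper name is ours):

```python
import os
import stat
from pathlib import Path

def config_mode(path: Path) -> str:
    """Return a file's permission bits as an octal string, e.g. "600"."""
    return format(stat.S_IMODE(os.stat(path).st_mode), "o")
```

After the `chmod` above, `config_mode(Path.home() / ".nanobot" / "config.json")` should return `"600"`.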