Setting Up Llama Models with AirLLM
This guide will help you configure nanobot to use Llama models with AirLLM.
Quick Setup
Run the setup script:
python3 setup_llama_airllm.py
The script will:
- Create/update your `~/.nanobot/config.json` file
- Configure Llama-3.2-3B-Instruct as the default model
- Guide you through getting a Hugging Face token
Manual Setup
Step 1: Get a Hugging Face Token
Llama models are "gated" (require license acceptance), so you need a Hugging Face token:
- Go to: https://huggingface.co/settings/tokens
- Click "New token"
- Give it a name (e.g., "nanobot")
- Select "Read" permission
- Click "Generate token"
- Copy the token (starts with `hf_...`)
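Because tokens from that page share a fixed prefix, a quick sanity check before pasting one into the config can catch copy/paste mistakes. A minimal sketch (the helper name is illustrative, not part of nanobot):

```python
def looks_like_hf_token(token: str) -> bool:
    # Hugging Face access tokens start with the "hf_" prefix
    return token.startswith("hf_") and len(token) > len("hf_")

print(looks_like_hf_token("hf_abc123"))  # True
print(looks_like_hf_token("abc123"))     # False
```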
Step 2: Accept Llama License
- Go to: https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct
- Click "Agree and access repository"
- Accept the license terms
Step 3: Configure nanobot
Edit ~/.nanobot/config.json:
{
"providers": {
"airllm": {
"apiKey": "meta-llama/Llama-3.2-3B-Instruct",
"extraHeaders": {
"hf_token": "hf_YOUR_TOKEN_HERE"
}
}
},
"agents": {
"defaults": {
"model": "meta-llama/Llama-3.2-3B-Instruct"
}
}
}
Replace hf_YOUR_TOKEN_HERE with your actual Hugging Face token.
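To catch typos before running the agent, the file can be sanity-checked with a few lines of Python. The key names below come from the example config above; the script itself is a sketch, not part of nanobot:

```python
import json

# Example config in the shape shown above (token value is a placeholder)
raw = """
{
  "providers": {
    "airllm": {
      "apiKey": "meta-llama/Llama-3.2-3B-Instruct",
      "extraHeaders": {"hf_token": "hf_YOUR_TOKEN_HERE"}
    }
  },
  "agents": {
    "defaults": {"model": "meta-llama/Llama-3.2-3B-Instruct"}
  }
}
"""

config = json.loads(raw)  # raises json.JSONDecodeError on malformed JSON
airllm = config["providers"]["airllm"]
assert airllm["extraHeaders"]["hf_token"].startswith("hf_"), "token missing or malformed"
# In this setup the default model should match the airllm apiKey
assert config["agents"]["defaults"]["model"] == airllm["apiKey"]
print("config looks valid")
```

Point `raw` at the contents of `~/.nanobot/config.json` to check your real file.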
Step 4: Test It
nanobot agent -m "Hello, what is 2+5?"
Recommended Llama Models
Small Models (Faster, Less Memory)
- Llama-3.2-3B-Instruct (Recommended: fast, minimal memory)
  - Model: `meta-llama/Llama-3.2-3B-Instruct`
  - Best for limited GPU memory
- Llama-3.1-8B-Instruct (Good balance of performance and speed)
  - Model: `meta-llama/Llama-3.1-8B-Instruct`
Why Llama with AirLLM?
- Excellent AirLLM Compatibility: Llama models work very well with AirLLM's chunking mechanism
- Proven Stability: Llama models have been tested extensively with AirLLM
- Good Performance: Llama models provide excellent quality while working efficiently with AirLLM
Troubleshooting
"Model not found" error
- Make sure you've accepted the Llama license on Hugging Face
- Verify your HF token has read permissions
- Check that the token is correctly set in `extraHeaders.hf_token`
"Out of memory" error
- Try a smaller model (Llama-3.2-3B-Instruct)
- Use compression: set `apiBase` to `"4bit"` or `"8bit"` in the airllm config
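Applied to the config from Step 3, that compression setting would look like the fragment below (field placement per this guide; repurposing `apiBase` to carry the compression mode is specific to this provider wiring):

```json
{
  "providers": {
    "airllm": {
      "apiKey": "meta-llama/Llama-3.2-3B-Instruct",
      "apiBase": "4bit",
      "extraHeaders": {
        "hf_token": "hf_YOUR_TOKEN_HERE"
      }
    }
  }
}
```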
Still having issues?
- Check the config file format is valid JSON
- Verify file permissions: `chmod 600 ~/.nanobot/config.json`
- Check logs for detailed error messages
Config File Location
- Path: `~/.nanobot/config.json`
- Permissions: Should be `600` (read/write for owner only)
- Backup: Always back up before editing!
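The `600` permissions can also be confirmed programmatically. The sketch below demonstrates the check on a throwaway file so it is safe to run anywhere; substitute `~/.nanobot/config.json` on your own machine:

```python
import os
import stat
import tempfile

# Create a throwaway stand-in for ~/.nanobot/config.json
path = os.path.join(tempfile.mkdtemp(), "config.json")
with open(path, "w") as f:
    f.write("{}")

os.chmod(path, 0o600)  # equivalent to: chmod 600 <file>

# S_IMODE extracts just the permission bits from the full mode
mode = stat.S_IMODE(os.stat(path).st_mode)
print(f"{mode:o}")  # 600 -> read/write for owner only
```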