Setting Up Llama Models with AirLLM

This guide will help you configure nanobot to use Llama models with AirLLM.

Quick Setup

Run the setup script:

python3 setup_llama_airllm.py

The script will:

  1. Create/update your ~/.nanobot/config.json file
  2. Configure Llama-3.2-3B-Instruct as the default model
  3. Guide you through getting a Hugging Face token
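The three steps above boil down to writing a small JSON file. The sketch below is a hypothetical approximation of what setup_llama_airllm.py does (the function name and the non-interactive flow are assumptions for illustration; the real script also walks you through getting a token):

```python
# Hypothetical sketch of the setup script's core work; the real
# setup_llama_airllm.py may differ (interactive prompts, validation, etc.).
import json
from pathlib import Path

MODEL = "meta-llama/Llama-3.2-3B-Instruct"

def write_config(config_dir: Path, hf_token: str) -> Path:
    """Create or update the nanobot config with AirLLM + Llama defaults."""
    config_dir.mkdir(parents=True, exist_ok=True)
    path = config_dir / "config.json"
    # Preserve any existing settings instead of clobbering the file.
    config = json.loads(path.read_text()) if path.exists() else {}
    config.setdefault("providers", {})["airllm"] = {
        "apiKey": MODEL,
        "extraHeaders": {"hf_token": hf_token},
    }
    config.setdefault("agents", {}).setdefault("defaults", {})["model"] = MODEL
    path.write_text(json.dumps(config, indent=2))
    path.chmod(0o600)  # owner read/write only
    return path
```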

Manual Setup

Step 1: Get a Hugging Face Token

Llama models are "gated" (require license acceptance), so you need a Hugging Face token:

  1. Go to: https://huggingface.co/settings/tokens
  2. Click "New token"
  3. Give it a name (e.g., "nanobot")
  4. Select "Read" permission
  5. Click "Generate token"
  6. Copy the token (starts with hf_...)
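While you work through the remaining steps, you can stash the token in an environment variable and sanity-check its format (HF_TOKEN is only a convention used by Hugging Face tooling; nanobot itself reads the token from config.json):

```shell
# Keep the token handy; real Hugging Face tokens start with "hf_".
export HF_TOKEN="hf_YOUR_TOKEN_HERE"
case "$HF_TOKEN" in
  hf_*) echo "token format looks OK" ;;
  *)    echo "that does not look like a Hugging Face token" ;;
esac
```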

Step 2: Accept Llama License

  1. Go to: https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct
  2. Click "Agree and access repository"
  3. Accept the license terms

Step 3: Configure nanobot

Edit ~/.nanobot/config.json:

{
  "providers": {
    "airllm": {
      "apiKey": "meta-llama/Llama-3.2-3B-Instruct",
      "extraHeaders": {
        "hf_token": "hf_YOUR_TOKEN_HERE"
      }
    }
  },
  "agents": {
    "defaults": {
      "model": "meta-llama/Llama-3.2-3B-Instruct"
    }
  }
}

Replace hf_YOUR_TOKEN_HERE with your actual Hugging Face token.
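A quick way to confirm the file matches what this guide expects is to load it back and check the two fields that matter. The check_config helper below is just for illustration, not part of nanobot:

```python
# Illustrative helper: verify the config has the fields set in this guide.
import json
from pathlib import Path

def check_config(path: Path) -> None:
    cfg = json.loads(path.read_text())
    token = cfg["providers"]["airllm"]["extraHeaders"]["hf_token"]
    assert token.startswith("hf_"), "hf_token should start with hf_"
    model = cfg["agents"]["defaults"]["model"]
    assert model == "meta-llama/Llama-3.2-3B-Instruct", f"unexpected model: {model}"

# Example usage:
# check_config(Path.home() / ".nanobot" / "config.json")
```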

Step 4: Test It

nanobot agent -m "Hello, what is 2+5?"

Small Models (Faster, Less Memory)

  • Llama-3.2-3B-Instruct (Recommended - fast, minimal memory)

    • Model: meta-llama/Llama-3.2-3B-Instruct
    • Best for limited GPU memory
  • Llama-3.1-8B-Instruct (Higher quality, needs more memory)

    • Model: meta-llama/Llama-3.1-8B-Instruct
    • Good balance of quality and speed if you have the memory for it

Why Llama with AirLLM?

  • Excellent AirLLM Compatibility: Llama models work very well with AirLLM's chunking mechanism
  • Proven Stability: the Llama family has been tested extensively with AirLLM
  • Good Performance: strong output quality at a comparatively low memory cost

Troubleshooting

"Model not found" error

  • Make sure you've accepted the Llama license on Hugging Face
  • Verify your HF token has read permissions
  • Check that the token is correctly set in extraHeaders.hf_token

"Out of memory" error

  • Try a smaller model (Llama-3.2-3B-Instruct)
  • Use compression: set apiBase to "4bit" or "8bit" in the airllm config
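Following the compression tip above, the provider block would look like this (per this guide's convention, the airllm provider reads the compression level from apiBase; leave apiBase out to disable compression):

```json
{
  "providers": {
    "airllm": {
      "apiKey": "meta-llama/Llama-3.2-3B-Instruct",
      "apiBase": "4bit",
      "extraHeaders": {
        "hf_token": "hf_YOUR_TOKEN_HERE"
      }
    }
  }
}
```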

Still having issues?

  • Check the config file format is valid JSON
  • Verify file permissions: chmod 600 ~/.nanobot/config.json
  • Check logs for detailed error messages
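The first two checks above can be done from a shell: Python's built-in json.tool exits non-zero on malformed JSON, and chmod tightens the permissions:

```shell
# Validate the config and fix its permissions in one go.
python3 -m json.tool ~/.nanobot/config.json > /dev/null \
  && echo "config is valid JSON" \
  || echo "config is NOT valid JSON"
[ -f ~/.nanobot/config.json ] && chmod 600 ~/.nanobot/config.json
```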

Config File Location

  • Path: ~/.nanobot/config.json
  • Permissions: Should be 600 (read/write for owner only)
  • Backup: Always back up before editing!