# Why Hugging Face? Can We Avoid It?

## Short Answer

**You don't HAVE to use Hugging Face**, but it's the easiest way. Here's why it's commonly used and what alternatives exist.

## Why Hugging Face Is Used

### 1. **Model Distribution Platform**

- The Hugging Face Hub is where most open-source models (Llama, etc.) are hosted
- When you specify `"meta-llama/Llama-3.1-8B-Instruct"`, AirLLM automatically downloads it from Hugging Face
- It's the standard repository that everyone uses

### 2. **Gated Models (Like Llama)**

- Llama models are "gated" - they require:
  - Accepting Meta's license terms
  - A Hugging Face account
  - A token to authenticate
- This is **Meta's requirement**, not Hugging Face's - the token proves you've accepted the license

### 3. **Convenience**

- Automatic downloads
- Version management
- Easy model discovery

## Alternatives: How to Avoid Hugging Face

### Option 1: Use Local Model Files (No HF Token Needed!)

If you already have the model downloaded locally, you can use it directly.

**1. Download the model manually** (one-time, using `huggingface-cli` or `git` with git-lfs):

```bash
# Using huggingface-cli (still needs a token, but only once)
huggingface-cli download meta-llama/Llama-3.1-8B-Instruct --local-dir ~/models/llama-3.1-8b

# Or using git (requires git-lfs; `git lfs clone` is deprecated)
git clone https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct ~/models/llama-3.1-8b
```

**2. Use the local path in your config**:

```json
{
  "providers": {
    "airllm": {
      "apiKey": "/home/youruser/models/llama-3.1-8b"
    }
  },
  "agents": {
    "defaults": {
      "model": "/home/youruser/models/llama-3.1-8b"
    }
  }
}
```

**Note**: AirLLM's `AutoModel.from_pretrained()` accepts local paths - just use the full path instead of the model ID.

### Option 2: Use Ollama (No HF at All!)

Ollama manages models for you and doesn't require Hugging Face.

**1. Install Ollama**: https://ollama.ai

**2. Pull a model**:

```bash
ollama pull llama3.1:8b
```

**3. 
Configure nanobot**:

```json
{
  "providers": {
    "ollama": {
      "apiKey": "dummy",
      "apiBase": "http://localhost:11434/v1"
    }
  },
  "agents": {
    "defaults": {
      "model": "llama3.1:8b"
    }
  }
}
```

### Option 3: Use vLLM (Local Server)

**1. Download the model once** (with or without an HF token):

```bash
# With an HF token
huggingface-cli download meta-llama/Llama-3.1-8B-Instruct --local-dir ~/models/llama-3.1-8b

# Or manually download from other sources
```

**2. Start the vLLM server**:

```bash
vllm serve ~/models/llama-3.1-8b --port 8000
```

**3. Configure nanobot**:

```json
{
  "providers": {
    "vllm": {
      "apiKey": "dummy",
      "apiBase": "http://localhost:8000/v1"
    }
  },
  "agents": {
    "defaults": {
      "model": "llama-3.1-8b"
    }
  }
}
```

## Why You Might Still Need an HF Token

Even if you want to avoid Hugging Face long-term, you might need it **once** to:

- Download the model initially
- Accept the license for gated models (Llama)

After that, you can use the local files and never touch Hugging Face again.

## Recommendation

**For Llama models specifically:**

1. **Get an HF token once** (5 minutes) - just to download the model and accept the license
2. **Download the model locally** - use `huggingface-cli` or `git` with git-lfs
3. **Use the local path** - configure nanobot to point at the local directory
4. **Never need HF again** - the model runs completely offline

This gives you:

- ✅ No ongoing dependency on Hugging Face
- ✅ Faster startup (no downloads)
- ✅ Works offline
- ✅ Full control

## Summary

- **Hugging Face is required** for: downloading models initially and accessing gated models
- **Hugging Face is NOT required** for: running models after download, using local files, or using Ollama/vLLM
- **Best approach**: download once with an HF token, then use the local files forever
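
The Ollama and vLLM setups above work because both expose an OpenAI-compatible HTTP API, so once the model is local, talking to it is plain HTTP with no Hugging Face involvement. A minimal sketch in Python (the base URL and model name are assumptions matching the Ollama example above; swap in `http://localhost:8000/v1` for vLLM):

```python
import json
import urllib.request

def build_payload(prompt, model):
    # OpenAI-style chat completion body; the same shape works for
    # Ollama and vLLM, since both implement the /v1/chat/completions route
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}

def chat(prompt, base="http://localhost:11434/v1", model="llama3.1:8b"):
    """POST a chat request to a local OpenAI-compatible server.

    No Hugging Face token is involved at this point - the model
    files are already on disk and the server runs fully offline.
    """
    req = urllib.request.Request(
        base + "/chat/completions",
        data=json.dumps(build_payload(prompt, model)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]
```

Because the API shape is shared, the same client code covers all three options: only the base URL and model name change between AirLLM-behind-a-proxy, Ollama, and vLLM.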