
# Why Hugging Face? Can We Avoid It?
## Short Answer
**You don't HAVE to use Hugging Face**, but it's the easiest way. Here's why it's commonly used and what alternatives exist.
## Why Hugging Face is Used
### 1. **Model Distribution Platform**
- Hugging Face Hub is where most open-source models (Llama, etc.) are hosted
- When you specify `"meta-llama/Llama-3.1-8B-Instruct"`, AirLLM automatically downloads it from Hugging Face
- It's the standard repository that everyone uses
### 2. **Gated Models (Like Llama)**
- Llama models are "gated" - they require:
  - Accepting Meta's license terms
  - A Hugging Face account
  - A token to authenticate
- This is **Meta's requirement**, not Hugging Face's
- The token proves you've accepted the license
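Mechanically, that token is just an HTTP credential: tools like `huggingface-cli` attach it to each download request as a bearer header. A minimal sketch (the `auth_header` helper and the token value are illustrative, not part of nanobot):

```python
def auth_header(token: str) -> dict:
    # Hugging Face Hub authenticates downloads with a Bearer token;
    # gated repos reject requests that arrive without one.
    return {"Authorization": f"Bearer {token}"}

headers = auth_header("hf_XXXXXXXX")  # placeholder token value
```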
### 3. **Convenience**
- Automatic downloads
- Version management
- Easy model discovery
## Alternatives: How to Avoid Hugging Face
### Option 1: Use Local Model Files (No HF Token Needed!)
If you already have the model downloaded locally, you can use it directly:
**1. Download the model manually** (one-time, can use `git lfs` or `huggingface-cli`):
```bash
# Using huggingface-cli (still needs token, but only once)
huggingface-cli download meta-llama/Llama-3.1-8B-Instruct --local-dir ~/models/llama-3.1-8b
# Or using git (requires git-lfs installed; `git lfs clone` is deprecated)
git clone https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct ~/models/llama-3.1-8b
```
**2. Use local path in config**:
```json
{
  "providers": {
    "airllm": {
      "apiKey": "/home/youruser/models/llama-3.1-8b"
    }
  },
  "agents": {
    "defaults": {
      "model": "/home/youruser/models/llama-3.1-8b"
    }
  }
}
```
**Note**: AirLLM's `AutoModel.from_pretrained()` accepts local paths! Just use the full path instead of the model ID.
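The path-vs-ID rule can be sketched as a tiny helper (the `resolve_model` name is hypothetical; only the rule itself, that an existing directory is used directly while anything else is treated as a Hub model ID, comes from the note above):

```python
import os

def resolve_model(ref: str) -> str:
    """Classify a model reference the way the loader treats it:
    an existing directory is loaded straight from disk, anything
    else is assumed to be a Hugging Face Hub model ID."""
    return "local" if os.path.isdir(os.path.expanduser(ref)) else "hub"
```

For example, `resolve_model("/home/youruser/models/llama-3.1-8b")` returns `"local"` once the download above exists, while `resolve_model("meta-llama/Llama-3.1-8B-Instruct")` falls back to `"hub"`.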
### Option 2: Use Ollama (No HF at All!)
Ollama manages models for you and doesn't require Hugging Face:
**1. Install Ollama**: https://ollama.ai
**2. Pull a model**:
```bash
ollama pull llama3.1:8b
```
**3. Configure nanobot**:
```json
{
  "providers": {
    "ollama": {
      "apiKey": "dummy",
      "apiBase": "http://localhost:11434/v1"
    }
  },
  "agents": {
    "defaults": {
      "model": "llama3.1:8b"
    }
  }
}
```
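Because Ollama exposes an OpenAI-compatible API, the traffic generated under this config is an ordinary `/v1/chat/completions` POST. A sketch of the request shape (the `build_chat_request` helper is illustrative; only the endpoint and the ignored `"dummy"` key come from the config above):

```python
import json

def build_chat_request(api_base: str, model: str, prompt: str):
    """Assemble the URL and JSON body of an OpenAI-style chat request.
    Ollama ignores the Bearer token, so the config's "dummy" key works."""
    url = f"{api_base.rstrip('/')}/chat/completions"
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })
    return url, body

url, body = build_chat_request("http://localhost:11434/v1", "llama3.1:8b", "Hello")
# url is "http://localhost:11434/v1/chat/completions"
```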
### Option 3: Use vLLM (Local Server)
**1. Download model once** (with or without HF token):
```bash
# With HF token
huggingface-cli download meta-llama/Llama-3.1-8B-Instruct --local-dir ~/models/llama-3.1-8b
# Or manually download from other sources
```
**2. Start vLLM server**:
```bash
# --served-model-name sets the model ID clients will use in requests
vllm serve ~/models/llama-3.1-8b --served-model-name llama-3.1-8b --port 8000
```
**3. Configure nanobot**:
```json
{
  "providers": {
    "vllm": {
      "apiKey": "dummy",
      "apiBase": "http://localhost:8000/v1"
    }
  },
  "agents": {
    "defaults": {
      "model": "llama-3.1-8b"
    }
  }
}
```
## Why You Might Still Need HF Token
Even if you want to avoid Hugging Face long-term, you might need it **once** to:
- Download the model initially
- Accept the license for gated models (Llama)
After that, you can use the local files and never touch Hugging Face again!
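If you want a hard guarantee that nothing phones home afterwards, `huggingface_hub` and `transformers` both honor documented offline switches; set them before either library is imported and every lookup resolves against local files only:

```python
import os

# Documented environment switches: with these set, huggingface_hub and
# transformers refuse network access and resolve models from the local
# cache or explicit paths only. Set them before importing either library.
os.environ["HF_HUB_OFFLINE"] = "1"
os.environ["TRANSFORMERS_OFFLINE"] = "1"
```

The same effect from the shell: `export HF_HUB_OFFLINE=1 TRANSFORMERS_OFFLINE=1`.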
## Recommendation
**For Llama models specifically:**
1. **Get HF token once** (5 minutes) - just to download and accept license
2. **Download model locally** - use `huggingface-cli` or `git lfs`
3. **Use local path** - configure nanobot to use the local directory
4. **Never need HF again** - the model runs completely offline
This gives you:
- ✅ No ongoing dependency on Hugging Face
- ✅ Faster startup (no downloads)
- ✅ Works offline
- ✅ Full control
## Summary
- **Hugging Face is required** for: Downloading models initially, accessing gated models
- **Hugging Face is NOT required** for: Running models after download, using local files, using Ollama/vLLM
- **Best approach**: Download once with HF token, then use local files forever