# Why Hugging Face? Can We Avoid It?

## Short Answer

**You don't HAVE to use Hugging Face**, but it's the easiest way. Here's why it's commonly used and what alternatives exist.

## Why Hugging Face is Used

### 1. **Model Distribution Platform**

- Hugging Face Hub is where most open-source models (Llama, etc.) are hosted
- When you specify `"meta-llama/Llama-3.1-8B-Instruct"`, AirLLM automatically downloads it from Hugging Face
- It's the standard repository that everyone uses

### 2. **Gated Models (Like Llama)**

- Llama models are "gated" - they require:
  - Accepting Meta's license terms
  - A Hugging Face account
  - A token to authenticate
- This is **Meta's requirement**, not Hugging Face's
- The token proves you've accepted the license

### 3. **Convenience**

- Automatic downloads
- Version management
- Easy model discovery

## Alternatives: How to Avoid Hugging Face

### Option 1: Use Local Model Files (No HF Token Needed!)

If you already have the model downloaded locally, you can use it directly:

**1. Download the model manually** (one-time, can use `git lfs` or `huggingface-cli`):

```bash
# Using huggingface-cli (still needs a token for gated models, but only once)
huggingface-cli download meta-llama/Llama-3.1-8B-Instruct --local-dir ~/models/llama-3.1-8b

# Or using git lfs
git lfs clone https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct ~/models/llama-3.1-8b
```
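
Before pointing the config at the directory, it can help to sanity-check that the download is complete. A minimal sketch, assuming the usual transformers checkpoint layout (`config.json` plus safetensors or `.bin` weight files); `looks_like_model_dir` is a hypothetical helper, not part of AirLLM or nanobot:

```python
from pathlib import Path

def looks_like_model_dir(path: str) -> bool:
    """Rough completeness check for a transformers-style checkpoint directory:
    a config.json plus at least one weight file (safetensors or pytorch .bin)."""
    p = Path(path).expanduser()
    if not (p / "config.json").is_file():
        return False
    return any(p.glob("*.safetensors")) or any(p.glob("pytorch_model*.bin"))
```

An interrupted `huggingface-cli download` can leave a directory that exists but has no weights, which this catches early.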
**2. Use local path in config**:

```json
{
  "providers": {
    "airllm": {
      "apiKey": "/home/youruser/models/llama-3.1-8b"
    }
  },
  "agents": {
    "defaults": {
      "model": "/home/youruser/models/llama-3.1-8b"
    }
  }
}
```

**Note**: AirLLM's `AutoModel.from_pretrained()` accepts local paths! Just use the full path instead of the model ID.
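
The decision being made here - local directory vs. Hub repo ID - can be sketched with a hypothetical helper (not part of AirLLM's API):

```python
import os

def resolve_model(ref: str) -> tuple[str, str]:
    """Classify a model reference: an existing directory is used as-is,
    anything else is treated as a Hub repo ID to download."""
    expanded = os.path.expanduser(ref)
    if os.path.isdir(expanded):
        return ("local", os.path.abspath(expanded))
    return ("hub", ref)
```

So `resolve_model("/home/youruser/models/llama-3.1-8b")` short-circuits the download entirely, while `resolve_model("meta-llama/Llama-3.1-8B-Instruct")` would still go through the Hub.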

### Option 2: Use Ollama (No HF at All!)

Ollama manages models for you and doesn't require Hugging Face:

**1. Install Ollama**: https://ollama.ai

**2. Pull a model**:

```bash
ollama pull llama3.1:8b
```

**3. Configure nanobot**:

```json
{
  "providers": {
    "ollama": {
      "apiKey": "dummy",
      "apiBase": "http://localhost:11434/v1"
    }
  },
  "agents": {
    "defaults": {
      "model": "llama3.1:8b"
    }
  }
}
```
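
The `apiBase` above is an OpenAI-compatible endpoint, so any OpenAI-style client can talk to it. A stdlib-only sketch, with the URL and model name taken from the config above (the dummy key is just a placeholder - Ollama does no authentication):

```python
import json
import urllib.request

def build_chat_request(api_base: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a POST request in the OpenAI chat-completions format."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{api_base}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer dummy",
        },
    )

# With a running Ollama server:
# with urllib.request.urlopen(build_chat_request(
#         "http://localhost:11434/v1", "llama3.1:8b", "Hello!")) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```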

### Option 3: Use vLLM (Local Server)

**1. Download model once** (with or without HF token):

```bash
# With HF token
huggingface-cli download meta-llama/Llama-3.1-8B-Instruct --local-dir ~/models/llama-3.1-8b

# Or manually download from other sources
```

**2. Start vLLM server**:

```bash
# --served-model-name makes the model ID match the nanobot config below;
# without it, vLLM registers the model under the path you passed
vllm serve ~/models/llama-3.1-8b --port 8000 --served-model-name llama-3.1-8b
```

**3. Configure nanobot**:

```json
{
  "providers": {
    "vllm": {
      "apiKey": "dummy",
      "apiBase": "http://localhost:8000/v1"
    }
  },
  "agents": {
    "defaults": {
      "model": "llama-3.1-8b"
    }
  }
}
```
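
If the `model` name in the config doesn't match what the server registered, the OpenAI-compatible `/v1/models` endpoint reports the exact ID being served. A stdlib sketch, with the endpoint taken from the config above:

```python
import json
import urllib.request

def list_models_request(api_base: str) -> urllib.request.Request:
    """Build a GET request for the server's model list (/v1/models)."""
    return urllib.request.Request(
        f"{api_base}/models",
        headers={"Authorization": "Bearer dummy"},
    )

# With a running vLLM server:
# with urllib.request.urlopen(list_models_request("http://localhost:8000/v1")) as resp:
#     print([m["id"] for m in json.load(resp)["data"]])
```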

## Why You Might Still Need an HF Token

Even if you want to avoid Hugging Face long-term, you might need it **once** to:

- Download the model initially
- Accept the license for gated models (Llama)

After that, you can use the local files and never touch Hugging Face again!

## Recommendation

**For Llama models specifically:**

1. **Get HF token once** (5 minutes) - just to download and accept the license
2. **Download model locally** - use `huggingface-cli` or `git lfs`
3. **Use local path** - configure nanobot to use the local directory
4. **Never need HF again** - the model runs completely offline

This gives you:

- ✅ No ongoing dependency on Hugging Face
- ✅ Faster startup (no downloads)
- ✅ Works offline
- ✅ Full control

## Summary
- **Hugging Face is required** for: Downloading models initially, accessing gated models
- **Hugging Face is NOT required** for: Running models after download, using local files, using Ollama/vLLM
- **Best approach**: Download once with HF token, then use local files forever