# Deployment Guide

## Deployment Options

POTE can be deployed in several ways depending on your needs:

1. **Local Development** (SQLite) - What you have now ✅
2. **Single Server** (PostgreSQL + cron jobs)
3. **Docker** (Containerized, easy to move)
4. **Cloud** (AWS/GCP/Azure with managed DB)

---

## Option 1: Local Development (Current Setup) ✅

**You're already running this!**
```bash
# Setup (done)
make install
source venv/bin/activate
make migrate

# Ingest data
python scripts/ingest_from_fixtures.py                  # Offline
python scripts/fetch_congressional_trades.py --days 30  # With internet

# Query
python
>>> from pote.db import SessionLocal
>>> from pote.db.models import Official
>>> with SessionLocal() as session:
...     officials = session.query(Official).all()
...     print(f"Total officials: {len(officials)}")
```
**Pros**: Simple, fast, no costs
**Cons**: Local only, SQLite limitations for heavy queries

---

## Option 2: Single Server with PostgreSQL

### Setup PostgreSQL
```bash
# Install PostgreSQL (Ubuntu/Debian)
sudo apt update
sudo apt install postgresql postgresql-contrib

# Create database
sudo -u postgres psql
postgres=# CREATE DATABASE pote;
postgres=# CREATE USER poteuser WITH PASSWORD 'your_secure_password';
postgres=# GRANT ALL PRIVILEGES ON DATABASE pote TO poteuser;
postgres=# \q
```
### Update Configuration

```bash
# Edit .env
DATABASE_URL=postgresql://poteuser:your_secure_password@localhost:5432/pote

# Run migrations
source venv/bin/activate
make migrate
```
### Schedule Regular Ingestion
```bash
# Add to crontab: crontab -e

# Fetch trades daily at 6 AM
0 6 * * * cd /path/to/pote && /path/to/pote/venv/bin/python scripts/fetch_congressional_trades.py --days 7 >> /var/log/pote/trades.log 2>&1

# Enrich securities weekly on Sunday at 3 AM
0 3 * * 0 cd /path/to/pote && /path/to/pote/venv/bin/python scripts/enrich_securities.py >> /var/log/pote/enrich.log 2>&1

# Fetch prices for all tickers daily at 7 AM
0 7 * * * cd /path/to/pote && /path/to/pote/venv/bin/python scripts/update_all_prices.py >> /var/log/pote/prices.log 2>&1
```
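If an ingestion run ever outlasts its schedule interval, cron will happily start a second overlapping copy. A common guard is `flock(1)`; a sketch of the trades job wrapped that way (the lockfile path is an assumption):

```bash
# Skip the run if the previous one still holds the lock (sketch)
0 6 * * * flock -n /tmp/pote-trades.lock -c 'cd /path/to/pote && venv/bin/python scripts/fetch_congressional_trades.py --days 7' >> /var/log/pote/trades.log 2>&1
```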
**Pros**: Production-ready, full SQL features, scheduled jobs
**Cons**: Requires server management, PostgreSQL setup

---

## Option 3: Docker Deployment

### Create Dockerfile
```dockerfile
# Dockerfile
FROM python:3.11-slim

WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y \
    gcc \
    postgresql-client \
    && rm -rf /var/lib/apt/lists/*

# Copy project files
COPY pyproject.toml .
COPY src/ src/
COPY alembic/ alembic/
COPY alembic.ini .
COPY scripts/ scripts/

# Install Python dependencies
RUN pip install --no-cache-dir -e .

# Run migrations on startup, then do an initial ingest
CMD ["sh", "-c", "alembic upgrade head && python scripts/fetch_congressional_trades.py --days 30"]
```

### Docker Compose Setup
```yaml
# docker-compose.yml
version: '3.8'

services:
  db:
    image: postgres:15
    environment:
      POSTGRES_DB: pote
      POSTGRES_USER: poteuser
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    volumes:
      - postgres_data:/var/lib/postgresql/data
    ports:
      - "5432:5432"

  pote:
    build: .
    environment:
      DATABASE_URL: postgresql://poteuser:${POSTGRES_PASSWORD}@db:5432/pote
      QUIVERQUANT_API_KEY: ${QUIVERQUANT_API_KEY}
      FMP_API_KEY: ${FMP_API_KEY}
    depends_on:
      - db
    volumes:
      - ./logs:/app/logs

  # Optional: FastAPI backend (Phase 3)
  api:
    build: .
    command: uvicorn pote.api.main:app --host 0.0.0.0 --port 8000
    environment:
      DATABASE_URL: postgresql://poteuser:${POSTGRES_PASSWORD}@db:5432/pote
    depends_on:
      - db
    ports:
      - "8000:8000"

volumes:
  postgres_data:
```
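One caveat with the file above: `depends_on` only orders container startup, it does not wait for Postgres to accept connections, so the `alembic upgrade head` in the image's CMD can race the database. A sketch of a health-gated startup (service names match the compose file above; interval values are illustrative, and `condition: service_healthy` requires the Docker Compose v2 CLI):

```yaml
# Fragment to merge into docker-compose.yml (sketch)
services:
  db:
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U poteuser -d pote"]
      interval: 5s
      timeout: 3s
      retries: 5
  pote:
    depends_on:
      db:
        condition: service_healthy
```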

### Deploy with Docker
```bash
# Create .env file
cat > .env << EOF
POSTGRES_PASSWORD=your_secure_password
DATABASE_URL=postgresql://poteuser:your_secure_password@db:5432/pote
QUIVERQUANT_API_KEY=
FMP_API_KEY=
EOF

# Build and run
docker-compose up -d

# Run migrations
docker-compose exec pote alembic upgrade head

# Ingest data
docker-compose exec pote python scripts/fetch_congressional_trades.py --days 30

# View logs
docker-compose logs -f pote
```
**Pros**: Portable, isolated, easy to deploy anywhere
**Cons**: Requires Docker knowledge, slightly more complex

---

## Option 4: Cloud Deployment (AWS Example)

### AWS Architecture
```
┌─────────────────┐
│  EC2 Instance   │
│  - Python app   │
│  - Cron jobs    │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  RDS (Postgres) │
│  - Managed DB   │
└─────────────────┘
```

### Setup Steps
1. **Create RDS PostgreSQL Instance**
   - Go to the AWS RDS Console
   - Create a PostgreSQL 15 database
   - Note the endpoint: `pote-db.xxxxx.us-east-1.rds.amazonaws.com`
   - Security group: allow port 5432 from the EC2 instance

2. **Launch EC2 Instance**
```bash
# SSH into EC2
ssh -i your-key.pem ubuntu@your-ec2-ip

# Install dependencies
sudo apt update
sudo apt install python3.11 python3-pip git

# Clone repo
git clone <your-repo>
cd pote

# Setup
python3 -m venv venv
source venv/bin/activate
pip install -e .

# Configure
cat > .env << EOF
DATABASE_URL=postgresql://poteuser:password@pote-db.xxxxx.us-east-1.rds.amazonaws.com:5432/pote
EOF

# Run migrations
alembic upgrade head

# Setup cron jobs
crontab -e
# (Add the cron jobs from Option 2)
```
3. **Optional: Use AWS Lambda for scheduled jobs**
   - Package the app as a Lambda function
   - Use EventBridge to trigger it daily
   - Cheaper for infrequent jobs
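A minimal sketch of what such a Lambda entry point could look like; the imported module and function names are assumptions, not existing POTE code:

```python
# Hypothetical Lambda handler (sketch). EventBridge invokes handler() on a
# schedule; the ingestion import below is an assumed name, adapt to your layout.
def handler(event, context):
    from scripts.fetch_congressional_trades import main  # assumption

    main(days=7)
    return {"statusCode": 200, "body": "ingestion complete"}
```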
**Pros**: Scalable, managed database, reliable
**Cons**: Costs money (~$20-50/mo for small RDS + EC2)

---

## Option 5: Fly.io / Railway / Render (Easiest Cloud)

### Fly.io Example
```bash
# Install flyctl
curl -L https://fly.io/install.sh | sh

# Login
flyctl auth login

# Create fly.toml
cat > fly.toml << EOF
app = "pote-research"

[build]
  builder = "paketobuildpacks/builder:base"

[env]
  PORT = "8080"

[[services]]
  internal_port = 8080
  protocol = "tcp"

[[services.ports]]
  port = 80
EOF

# Create Postgres and attach it to the app (sets DATABASE_URL as a secret)
flyctl postgres create --name pote-db
flyctl postgres attach pote-db

# Deploy
flyctl deploy

# Or set the connection string manually
flyctl secrets set DATABASE_URL="postgres://..."
```
**Pros**: Simple, cheap ($5-10/mo), automated deployments
**Cons**: Limited control, may need to adapt code

---

## Production Checklist

Before deploying to production:

### Security
- [ ] Change all default passwords
- [ ] Use environment variables for secrets (never commit `.env`)
- [ ] Enable SSL for database connections
- [ ] Set up firewall rules (only allow necessary ports)
- [ ] Use HTTPS if exposing API/dashboard
### Reliability
- [ ] Set up database backups (daily)
- [ ] Configure logging (centralized if possible)
- [ ] Monitor disk space (especially for SQLite)
- [ ] Set up error alerts (email/Slack on failures)
- [ ] Test recovery from backup
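For the backup item, one low-effort approach on the PostgreSQL setups above is a nightly `pg_dump` from cron. The paths and 14-day retention below are assumptions; note that `%` must be escaped in crontab entries, and cron's environment will not include `DATABASE_URL` unless you set it in the crontab itself:

```bash
# In crontab: nightly compressed dump at 2 AM, pruned after 14 days (sketch)
DATABASE_URL=postgresql://poteuser:your_secure_password@localhost:5432/pote
0 2 * * * pg_dump "$DATABASE_URL" | gzip > /var/backups/pote/pote-$(date +\%F).sql.gz
30 2 * * * find /var/backups/pote -name 'pote-*.sql.gz' -mtime +14 -delete
```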
### Performance
- [ ] Index frequently queried columns (already done in models)
- [ ] Use connection pooling for PostgreSQL
- [ ] Cache frequently accessed data
- [ ] Limit API rate if exposing publicly
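For the pooling item, a sketch of an explicitly pooled SQLAlchemy engine (POTE already uses SQLAlchemy; the pool numbers are illustrative, not tuned values):

```python
# Sketch: an explicitly pooled engine for PostgreSQL. Sizes are illustrative.
from sqlalchemy import create_engine
from sqlalchemy.pool import QueuePool


def make_pooled_engine(url: str):
    return create_engine(
        url,
        poolclass=QueuePool,  # explicit, though it is SQLAlchemy's default for PostgreSQL
        pool_size=5,          # persistent connections kept open
        max_overflow=10,      # extra connections allowed under burst load
        pool_pre_ping=True,   # discard dead connections before reuse
        pool_recycle=1800,    # recycle connections older than 30 minutes
    )
```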
### Compliance
- [ ] Review data retention policy
- [ ] Add disclaimers to any UI ("research only, not advice")
- [ ] Document data sources and update frequency
- [ ] Keep audit logs of data ingestion
---

## Monitoring & Logs

### Basic Logging Setup
```python
# Add to scripts/fetch_congressional_trades.py
import logging
import os
from logging.handlers import RotatingFileHandler

# Create logs directory
os.makedirs("logs", exist_ok=True)

# Configure logging
handler = RotatingFileHandler(
    "logs/ingestion.log",
    maxBytes=10_000_000,  # 10 MB
    backupCount=5,
)
handler.setFormatter(logging.Formatter(
    '%(asctime)s [%(levelname)s] %(name)s: %(message)s'
))
logger = logging.getLogger()
logger.setLevel(logging.INFO)  # root logger defaults to WARNING otherwise
logger.addHandler(handler)
```
### Health Check Endpoint (Optional)
```python
# Add to pote/api/main.py (when building API)
from fastapi import FastAPI

app = FastAPI()


@app.get("/health")
def health_check():
    from pote.db import SessionLocal
    from sqlalchemy import text

    try:
        with SessionLocal() as session:
            session.execute(text("SELECT 1"))
        return {"status": "ok", "database": "connected"}
    except Exception as e:
        return {"status": "error", "message": str(e)}
```
---

## Cost Estimates (Monthly)

| Option | Cost | Notes |
|--------|------|-------|
| **Local Dev** | $0 | SQLite, your machine |
| **VPS (DigitalOcean, Linode)** | $5-12 | Small droplet + managed Postgres |
| **AWS (small)** | $20-50 | t3.micro EC2 + db.t3.micro RDS |
| **Fly.io / Railway** | $5-15 | Hobby tier, managed |
| **Docker on VPS** | $10-20 | One droplet, Docker Compose |

**Free tier options**:
- Railway: free tier available (limited hours)
- Fly.io: free tier available (limited resources)
- Oracle Cloud: always-free tier (ARM instances)
---

## Next Steps After Deployment

1. **Verify ingestion**: Check logs after the first cron run
2. **Test queries**: Ensure data is accessible
3. **Monitor growth**: Database size, query performance
4. **Plan backups**: Set up automated DB dumps
5. **Document access**: How to query, who has access

For Phase 2 (Analytics), you'll add:
- Scheduled jobs for computing returns
- Clustering jobs (weekly/monthly)
- Optional dashboard deployment

---

## Quick Deploy (Railway Example)

Railway is probably the easiest for personal projects:
```bash
# Install Railway CLI
npm install -g @railway/cli

# Login
railway login

# Initialize
railway init

# Add PostgreSQL
railway add --database postgres

# Deploy
railway up

# Add environment variables via dashboard
# DATABASE_URL is auto-configured
```
**Cost**: ~$5/mo, scales automatically

---

See `docs/05_dev_setup.md` for local development details.