
Deployment Guide

Deployment Options

POTE can be deployed in several ways depending on your needs:

  1. Local Development (SQLite) - What you have now
  2. Single Server (PostgreSQL + cron jobs)
  3. Docker (Containerized, easy to move)
  4. Cloud (AWS/GCP/Azure with managed DB)
  5. Managed platforms (Fly.io / Railway / Render)

Option 1: Local Development (Current Setup)

You're already running this!

# Setup (done)
make install
source venv/bin/activate
make migrate

# Ingest data
python scripts/ingest_from_fixtures.py  # Offline
python scripts/fetch_congressional_trades.py --days 30  # With internet

# Query
python
>>> from pote.db import SessionLocal
>>> from pote.db.models import Official
>>> with SessionLocal() as session:
...     officials = session.query(Official).all()
...     print(f"Total officials: {len(officials)}")

Pros: Simple, fast, no costs
Cons: Local only, SQLite limitations for heavy queries
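
One of SQLite's limitations for this workload is that a writer blocks readers by default. WAL journal mode relaxes that, and you only need to enable it once per database file. A minimal sketch (the database path is whatever your `.env` points at; the helper name is illustrative, not part of POTE):

```python
import sqlite3

def enable_wal(db_path: str) -> str:
    """Switch a SQLite database file to WAL journal mode so queries
    don't block while an ingestion script is writing.

    WAL mode is persistent: it is recorded in the database file, so
    running this once is enough.
    """
    conn = sqlite3.connect(db_path)
    try:
        # The pragma returns the journal mode actually in effect.
        return conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    finally:
        conn.close()
```

Run it once against your database file and subsequent connections inherit the mode.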


Option 2: Single Server with PostgreSQL

Setup PostgreSQL

# Install PostgreSQL (Ubuntu/Debian)
sudo apt update
sudo apt install postgresql postgresql-contrib

# Create database
sudo -u postgres psql
postgres=# CREATE DATABASE pote;
postgres=# CREATE USER poteuser WITH PASSWORD 'your_secure_password';
postgres=# GRANT ALL PRIVILEGES ON DATABASE pote TO poteuser;
postgres=# \q

Update Configuration

# Edit .env
DATABASE_URL=postgresql://poteuser:your_secure_password@localhost:5432/pote

# Run migrations
source venv/bin/activate
make migrate
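
A malformed `DATABASE_URL` is the most common reason the first migration fails, and the SQLAlchemy error can be cryptic. A small standard-library sanity check you could run first (an illustrative helper, not part of POTE):

```python
from urllib.parse import urlsplit

def check_database_url(url: str) -> dict:
    """Parse a DATABASE_URL and surface the pieces that most often
    go wrong: scheme, host, port, and database name."""
    parts = urlsplit(url)
    if parts.scheme not in ("postgresql", "postgres"):
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    return {
        "host": parts.hostname,
        "port": parts.port or 5432,   # Postgres default port
        "database": parts.path.lstrip("/"),
        "user": parts.username,
    }
```

If the parsed host, port, or database name isn't what you expect, fix `.env` before touching the schema.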

Schedule Regular Ingestion

# Add to crontab: crontab -e

# Fetch trades daily at 6 AM
0 6 * * * cd /path/to/pote && /path/to/pote/venv/bin/python scripts/fetch_congressional_trades.py --days 7 >> /var/log/pote/trades.log 2>&1

# Enrich securities weekly on Sunday at 3 AM
0 3 * * 0 cd /path/to/pote && /path/to/pote/venv/bin/python scripts/enrich_securities.py >> /var/log/pote/enrich.log 2>&1

# Fetch prices for all tickers daily at 7 AM
0 7 * * * cd /path/to/pote && /path/to/pote/venv/bin/python scripts/update_all_prices.py >> /var/log/pote/prices.log 2>&1
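
If a fetch ever runs longer than its cron interval, two copies can race each other. A common guard is a non-blocking `flock` taken at script startup, sketched below (Linux/Unix only; the lock path is illustrative):

```python
import fcntl
import os

def acquire_lock(path: str):
    """Take an exclusive, non-blocking lock on `path`.

    Returns the file descriptor on success, or None if another run
    already holds the lock. The lock is released automatically when
    the process exits.
    """
    fd = os.open(path, os.O_CREAT | os.O_RDWR)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        # Another process (an earlier cron run) holds the lock.
        os.close(fd)
        return None
    return fd
```

Call it at the top of each ingestion script and exit early when it returns None, so an overlapping cron cycle is skipped instead of doubled.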

Pros: Production-ready, full SQL features, scheduled jobs
Cons: Requires server management, PostgreSQL setup


Option 3: Docker Deployment

Create Dockerfile

# Dockerfile
FROM python:3.11-slim

WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y \
    gcc \
    postgresql-client \
    && rm -rf /var/lib/apt/lists/*

# Copy project files
COPY pyproject.toml .
COPY src/ src/
COPY alembic/ alembic/
COPY alembic.ini .
COPY scripts/ scripts/

# Install Python dependencies
RUN pip install --no-cache-dir -e .

# Run migrations, then a one-shot ingestion; the container exits when done
# (use cron or a scheduler for recurring runs)
CMD ["sh", "-c", "alembic upgrade head && python scripts/fetch_congressional_trades.py --days 30"]

Docker Compose Setup

# docker-compose.yml
version: '3.8'  # optional: modern Docker Compose ignores this key

services:
  db:
    image: postgres:15
    environment:
      POSTGRES_DB: pote
      POSTGRES_USER: poteuser
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    volumes:
      - postgres_data:/var/lib/postgresql/data
    ports:
      - "5432:5432"

  pote:
    build: .
    environment:
      DATABASE_URL: postgresql://poteuser:${POSTGRES_PASSWORD}@db:5432/pote
      QUIVERQUANT_API_KEY: ${QUIVERQUANT_API_KEY}
      FMP_API_KEY: ${FMP_API_KEY}
    depends_on:
      - db
    volumes:
      - ./logs:/app/logs

  # Optional: FastAPI backend (Phase 3)
  api:
    build: .
    command: uvicorn pote.api.main:app --host 0.0.0.0 --port 8000
    environment:
      DATABASE_URL: postgresql://poteuser:${POSTGRES_PASSWORD}@db:5432/pote
    depends_on:
      - db
    ports:
      - "8000:8000"

volumes:
  postgres_data:

Deploy with Docker

# Create .env file
cat > .env << EOF
POSTGRES_PASSWORD=your_secure_password
DATABASE_URL=postgresql://poteuser:your_secure_password@db:5432/pote
QUIVERQUANT_API_KEY=
FMP_API_KEY=
EOF

# Build and run
docker-compose up -d

# Run migrations
docker-compose exec pote alembic upgrade head

# Ingest data
docker-compose exec pote python scripts/fetch_congressional_trades.py --days 30

# View logs
docker-compose logs -f pote
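
Note that `depends_on` only waits for the `db` container to start, not for Postgres to accept connections, so the first `alembic upgrade head` can fail on a cold boot. One fix is a stdlib-only wait-for-port helper the entrypoint calls before migrating (a sketch; the function name is made up):

```python
import socket
import time

def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Poll until a TCP port accepts connections (e.g. Postgres in the
    `db` container), or give up after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Refused or unreachable: the server isn't ready yet.
            time.sleep(0.5)
    return False
```

An alternative is a `healthcheck` on the `db` service plus `depends_on: condition: service_healthy`, which achieves the same thing in Compose itself.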

Pros: Portable, isolated, easy to deploy anywhere
Cons: Requires Docker knowledge, slightly more complex


Option 4: Cloud Deployment (AWS Example)

AWS Architecture

┌─────────────────┐
│   EC2 Instance  │
│   - Python app  │
│   - Cron jobs   │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   RDS (Postgres)│
│   - Managed DB  │
└─────────────────┘

Setup Steps

  1. Create RDS PostgreSQL Instance

    • Go to AWS RDS Console
    • Create PostgreSQL 15 database
    • Note endpoint: pote-db.xxxxx.us-east-1.rds.amazonaws.com
    • Security group: Allow port 5432 from EC2
  2. Launch EC2 Instance

    # SSH into EC2
    ssh -i your-key.pem ubuntu@your-ec2-ip
    
    # Install dependencies
    sudo apt update
    sudo apt install python3.11 python3-pip git
    
    # Clone repo
    git clone <your-repo>
    cd pote
    
    # Setup
    python3 -m venv venv
    source venv/bin/activate
    pip install -e .
    
    # Configure
    cat > .env << EOF
    DATABASE_URL=postgresql://poteuser:password@pote-db.xxxxx.us-east-1.rds.amazonaws.com:5432/pote
    EOF
    
    # Run migrations
    alembic upgrade head
    
    # Setup cron jobs
    crontab -e
    # (Add the cron jobs from Option 2)
    
  3. Optional: Use AWS Lambda for scheduled jobs

    • Package app as Lambda function
    • Use EventBridge to trigger daily
    • Cheaper for infrequent jobs

Pros: Scalable, managed database, reliable
Cons: Costs money (~$20-50/mo for small RDS + EC2)


Option 5: Fly.io / Railway / Render (Easiest Cloud)

Fly.io Example

# Install flyctl
curl -L https://fly.io/install.sh | sh

# Login
flyctl auth login

# Create fly.toml
cat > fly.toml << EOF
app = "pote-research"

[build]
  builder = "paketobuildpacks/builder:base"

[env]
  PORT = "8080"

[[services]]
  internal_port = 8080
  protocol = "tcp"

  [[services.ports]]
    port = 80

EOF

# Create Postgres and attach it to the app
# (attach stores DATABASE_URL as an app secret)
flyctl postgres create --name pote-db
flyctl postgres attach pote-db --app pote-research

# Deploy
flyctl deploy

# Set any remaining secrets (API keys)
flyctl secrets set QUIVERQUANT_API_KEY=...

Pros: Simple, cheap ($5-10/mo), automated deployments
Cons: Limited control, may need to adapt code


Production Checklist

Before deploying to production:

Security

  • Change all default passwords
  • Use environment variables for secrets (never commit .env)
  • Enable SSL for database connections
  • Set up firewall rules (only allow necessary ports)
  • Use HTTPS if exposing API/dashboard

Reliability

  • Set up database backups (daily)
  • Configure logging (centralized if possible)
  • Monitor disk space (especially for SQLite)
  • Set up error alerts (email/Slack on failures)
  • Test recovery from backup
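
For the backup item, deriving the `pg_dump` invocation from the same `DATABASE_URL` the app uses keeps the backup job from drifting out of sync with the app configuration. A sketch (an illustrative helper, not part of POTE; supply the password via `PGPASSWORD` or `~/.pgpass`, never on the command line):

```python
from urllib.parse import urlsplit

def pg_dump_command(database_url: str, out_path: str) -> list[str]:
    """Build a pg_dump argument list from a DATABASE_URL, so a backup
    cron job reads the same configuration as the application."""
    u = urlsplit(database_url)
    return [
        "pg_dump",
        "--host", u.hostname or "localhost",
        "--port", str(u.port or 5432),
        "--username", u.username or "postgres",
        "--format", "custom",   # compressed; restore with pg_restore
        "--file", out_path,
        u.path.lstrip("/"),     # database name
    ]
```

Pass the result to `subprocess.run` from a nightly cron job, and remember the checklist item above: a backup only counts once you've tested restoring it.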

Performance

  • Index frequently queried columns (already done in models)
  • Use connection pooling for PostgreSQL
  • Cache frequently accessed data
  • Limit API rate if exposing publicly
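
For the caching item, a tiny time-to-live memoizer is often enough before reaching for Redis. A sketch (illustrative, not part of POTE; only suitable for read-mostly lookups with hashable arguments):

```python
import time
from functools import wraps

def ttl_cache(seconds: float):
    """Memoize a function's results for `seconds`, so hot read paths
    (e.g. the list of officials) don't hit the database on every call."""
    def decorator(fn):
        store = {}  # args -> (value, timestamp)

        @wraps(fn)
        def wrapper(*args):
            now = time.monotonic()
            hit = store.get(args)
            if hit is not None and now - hit[1] < seconds:
                return hit[0]          # still fresh: serve cached value
            value = fn(*args)
            store[args] = (value, now)  # refresh the entry
            return value
        return wrapper
    return decorator
```

Entries linger until overwritten, so keep this for small, bounded argument spaces; anything larger deserves a real cache with eviction.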

Compliance

  • Review data retention policy
  • Add disclaimers to any UI ("research only, not advice")
  • Document data sources and update frequency
  • Keep audit logs of data ingestion

Monitoring & Logs

Basic Logging Setup

# Add to scripts/fetch_congressional_trades.py
import logging
import os
from logging.handlers import RotatingFileHandler

# Create logs directory
os.makedirs("logs", exist_ok=True)

# Configure logging
handler = RotatingFileHandler(
    "logs/ingestion.log",
    maxBytes=10_000_000,  # 10 MB
    backupCount=5
)
handler.setFormatter(logging.Formatter(
    '%(asctime)s [%(levelname)s] %(name)s: %(message)s'
))
logger = logging.getLogger()
logger.setLevel(logging.INFO)  # root logger defaults to WARNING
logger.addHandler(handler)

Health Check Endpoint (Optional)

# Add to pote/api/main.py (when building API)
from fastapi import FastAPI, HTTPException

app = FastAPI()

@app.get("/health")
def health_check():
    from pote.db import SessionLocal
    from sqlalchemy import text

    try:
        with SessionLocal() as session:
            session.execute(text("SELECT 1"))
        return {"status": "ok", "database": "connected"}
    except Exception as e:
        # 503 lets load balancers and uptime monitors see the failure;
        # a 200 with an error body would look healthy to them
        raise HTTPException(status_code=503, detail=str(e))

Cost Estimates (Monthly)

Option                        Cost     Notes
Local Dev                     $0       SQLite, your machine
VPS (DigitalOcean, Linode)    $5-12    Small droplet + managed Postgres
AWS (small)                   $20-50   t3.micro EC2 + db.t3.micro RDS
Fly.io / Railway              $5-15    Hobby tier, managed
Docker on VPS                 $10-20   One droplet, Docker Compose

Free tier options:

  • Railway: Free tier available (limited hours)
  • Fly.io: Free tier available (limited resources)
  • Oracle Cloud: Always-free tier (ARM instances)

Next Steps After Deployment

  1. Verify ingestion: Check logs after first cron run
  2. Test queries: Ensure data is accessible
  3. Monitor growth: Database size, query performance
  4. Plan backups: Set up automated DB dumps
  5. Document access: How to query, who has access

For Phase 2 (Analytics), you'll add:

  • Scheduled jobs for computing returns
  • Clustering jobs (weekly/monthly)
  • Optional dashboard deployment

Quick Deploy (Railway Example)

Railway is probably the easiest for personal projects:

# Install Railway CLI
npm install -g @railway/cli

# Login
railway login

# Initialize
railway init

# Add PostgreSQL
railway add --database postgres

# Deploy
railway up

# Add environment variables via dashboard
# DATABASE_URL is auto-configured

Cost: ~$5/mo, scales automatically


See docs/05_dev_setup.md for local development details.