- PR1: Project scaffold, DB models, price loader
- PR2: Congressional trade ingestion (House Stock Watcher)
- PR3: Security enrichment + deployment infrastructure
- 37 passing tests, 87%+ coverage
- Docker + Proxmox deployment ready
- Complete documentation
- Works 100% offline with fixtures
# Deployment Guide

## Deployment Options

POTE can be deployed in several ways depending on your needs:

1. **Local Development** (SQLite) - What you have now ✅
2. **Single Server** (PostgreSQL + cron jobs)
3. **Docker** (Containerized, easy to move)
4. **Cloud** (AWS/GCP/Azure with managed DB)
## Option 1: Local Development (Current Setup) ✅

You're already running this!

```bash
# Setup (done)
make install
source venv/bin/activate
make migrate

# Ingest data
python scripts/ingest_from_fixtures.py                   # Offline
python scripts/fetch_congressional_trades.py --days 30   # With internet

# Query
python
```

```python
>>> from pote.db import SessionLocal
>>> from pote.db.models import Official
>>> with SessionLocal() as session:
...     officials = session.query(Official).all()
...     print(f"Total officials: {len(officials)}")
```
**Pros**: Simple, fast, no costs

**Cons**: Local only, SQLite limitations for heavy queries
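If you stay on SQLite for a while, one common mitigation for its single-writer limitation is enabling WAL (write-ahead logging) mode, which lets readers proceed while a write is in progress. A minimal sketch — the `pote.db` filename is an assumption; use whatever file your `DATABASE_URL` points at:

```python
import sqlite3

# Enable write-ahead logging so readers don't block behind the single writer.
# WAL mode is persistent: setting it once sticks for the database file.
conn = sqlite3.connect("pote.db")  # filename is illustrative
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
print(mode)  # "wal" for a file-backed database
conn.close()
```

This is a one-time, per-file setting; it does not remove SQLite's limits for heavy analytical queries, it only reduces reader/writer contention.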
## Option 2: Single Server with PostgreSQL

### Setup PostgreSQL

```bash
# Install PostgreSQL (Ubuntu/Debian)
sudo apt update
sudo apt install postgresql postgresql-contrib

# Create database
sudo -u postgres psql
postgres=# CREATE DATABASE pote;
postgres=# CREATE USER poteuser WITH PASSWORD 'your_secure_password';
postgres=# GRANT ALL PRIVILEGES ON DATABASE pote TO poteuser;
postgres=# \q
```
### Update Configuration

```bash
# Edit .env
DATABASE_URL=postgresql://poteuser:your_secure_password@localhost:5432/pote

# Run migrations
source venv/bin/activate
make migrate
```
### Schedule Regular Ingestion

```bash
# Add to crontab: crontab -e

# Fetch trades daily at 6 AM
0 6 * * * cd /path/to/pote && /path/to/pote/venv/bin/python scripts/fetch_congressional_trades.py --days 7 >> /var/log/pote/trades.log 2>&1

# Enrich securities weekly on Sunday at 3 AM
0 3 * * 0 cd /path/to/pote && /path/to/pote/venv/bin/python scripts/enrich_securities.py >> /var/log/pote/enrich.log 2>&1

# Fetch prices for all tickers daily at 7 AM
0 7 * * * cd /path/to/pote && /path/to/pote/venv/bin/python scripts/update_all_prices.py >> /var/log/pote/prices.log 2>&1
```
**Pros**: Production-ready, full SQL features, scheduled jobs

**Cons**: Requires server management, PostgreSQL setup
## Option 3: Docker Deployment

### Create Dockerfile

```dockerfile
# Dockerfile
FROM python:3.11-slim

WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y \
    gcc \
    postgresql-client \
    && rm -rf /var/lib/apt/lists/*

# Copy project files
COPY pyproject.toml .
COPY src/ src/
COPY alembic/ alembic/
COPY alembic.ini .
COPY scripts/ scripts/

# Install Python dependencies
RUN pip install --no-cache-dir -e .

# Run migrations on startup
CMD ["sh", "-c", "alembic upgrade head && python scripts/fetch_congressional_trades.py --days 30"]
```
### Docker Compose Setup

```yaml
# docker-compose.yml
version: '3.8'

services:
  db:
    image: postgres:15
    environment:
      POSTGRES_DB: pote
      POSTGRES_USER: poteuser
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    volumes:
      - postgres_data:/var/lib/postgresql/data
    ports:
      - "5432:5432"

  pote:
    build: .
    environment:
      DATABASE_URL: postgresql://poteuser:${POSTGRES_PASSWORD}@db:5432/pote
      QUIVERQUANT_API_KEY: ${QUIVERQUANT_API_KEY}
      FMP_API_KEY: ${FMP_API_KEY}
    depends_on:
      - db
    volumes:
      - ./logs:/app/logs

  # Optional: FastAPI backend (Phase 3)
  api:
    build: .
    command: uvicorn pote.api.main:app --host 0.0.0.0 --port 8000
    environment:
      DATABASE_URL: postgresql://poteuser:${POSTGRES_PASSWORD}@db:5432/pote
    depends_on:
      - db
    ports:
      - "8000:8000"

volumes:
  postgres_data:
```
### Deploy with Docker

```bash
# Create .env file
cat > .env << EOF
POSTGRES_PASSWORD=your_secure_password
DATABASE_URL=postgresql://poteuser:your_secure_password@db:5432/pote
QUIVERQUANT_API_KEY=
FMP_API_KEY=
EOF

# Build and run
docker-compose up -d

# Run migrations
docker-compose exec pote alembic upgrade head

# Ingest data
docker-compose exec pote python scripts/fetch_congressional_trades.py --days 30

# View logs
docker-compose logs -f pote
```
**Pros**: Portable, isolated, easy to deploy anywhere

**Cons**: Requires Docker knowledge, slightly more complex
## Option 4: Cloud Deployment (AWS Example)

### AWS Architecture

```
┌─────────────────┐
│  EC2 Instance   │
│  - Python app   │
│  - Cron jobs    │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  RDS (Postgres) │
│  - Managed DB   │
└─────────────────┘
```
### Setup Steps

1. **Create RDS PostgreSQL Instance**
   - Go to AWS RDS Console
   - Create PostgreSQL 15 database
   - Note the endpoint: `pote-db.xxxxx.us-east-1.rds.amazonaws.com`
   - Security group: Allow port 5432 from EC2

2. **Launch EC2 Instance**

   ```bash
   # SSH into EC2
   ssh -i your-key.pem ubuntu@your-ec2-ip

   # Install dependencies
   sudo apt update
   sudo apt install python3.11 python3-pip git

   # Clone repo
   git clone <your-repo>
   cd pote

   # Setup
   python3 -m venv venv
   source venv/bin/activate
   pip install -e .

   # Configure
   cat > .env << EOF
   DATABASE_URL=postgresql://poteuser:password@pote-db.xxxxx.us-east-1.rds.amazonaws.com:5432/pote
   EOF

   # Run migrations
   alembic upgrade head

   # Setup cron jobs
   crontab -e
   # (Add the cron jobs from Option 2)
   ```

3. **Optional: Use AWS Lambda for scheduled jobs**
   - Package app as Lambda function
   - Use EventBridge to trigger daily
   - Cheaper for infrequent jobs
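For the Lambda route, the handler can be a thin wrapper around the ingestion script, triggered by an EventBridge schedule rule. A hypothetical sketch — the `handler` name, the event shape, and the commented-out ingestion call are assumptions, not the project's actual code:

```python
import json
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def handler(event, context):
    """Entry point invoked by an EventBridge schedule (e.g. daily)."""
    days = int(event.get("days", 7))  # look-back window passed in the event
    try:
        # Here you would call the project's ingestion logic, roughly:
        #   fetch_congressional_trades(days=days)
        logger.info("Fetching trades for the last %d days", days)
        return {"statusCode": 200, "body": json.dumps({"days": days})}
    except Exception as exc:  # surface failures to CloudWatch Logs
        logger.exception("Ingestion failed")
        return {"statusCode": 500, "body": json.dumps({"error": str(exc)})}
```

Package dependencies (SQLAlchemy, drivers) with the function, and give its execution role network access to the RDS instance's security group.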
**Pros**: Scalable, managed database, reliable

**Cons**: Costs money (~$20-50/mo for small RDS + EC2)
## Option 5: Fly.io / Railway / Render (Easiest Cloud)

### Fly.io Example

```bash
# Install flyctl
curl -L https://fly.io/install.sh | sh

# Login
flyctl auth login

# Create fly.toml
cat > fly.toml << EOF
app = "pote-research"

[build]
builder = "paketobuildpacks/builder:base"

[env]
PORT = "8080"

[[services]]
internal_port = 8080
protocol = "tcp"

[[services.ports]]
port = 80

[postgres]
app = "pote-db"
EOF

# Create Postgres
flyctl postgres create --name pote-db

# Deploy
flyctl deploy

# Set secrets
flyctl secrets set DATABASE_URL="postgres://..."
```
**Pros**: Simple, cheap ($5-10/mo), automated deployments

**Cons**: Limited control, may need to adapt code
## Production Checklist

Before deploying to production:

### Security

- Change all default passwords
- Use environment variables for secrets (never commit `.env`)
- Enable SSL for database connections
- Set up firewall rules (only allow necessary ports)
- Use HTTPS if exposing API/dashboard
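For the SSL item, PostgreSQL client libraries accept an `sslmode` parameter in the connection URL, so appending it to `DATABASE_URL` is usually enough. A sketch with placeholder host and credentials:

```bash
# .env — require TLS for the database connection (libpq sslmode parameter)
DATABASE_URL=postgresql://poteuser:your_secure_password@db-host:5432/pote?sslmode=require
```

`sslmode=verify-full` additionally checks the server certificate and hostname, if you distribute the CA certificate to clients.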
### Reliability
- Set up database backups (daily)
- Configure logging (centralized if possible)
- Monitor disk space (especially for SQLite)
- Set up error alerts (email/Slack on failures)
- Test recovery from backup
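For the backup items, a common approach with PostgreSQL is a nightly `pg_dump` from cron. A sketch with illustrative paths and a 7-day retention — note that `%` must be escaped in crontab entries, and cron does not read your `.env`, so either define `DATABASE_URL` in the crontab or inline the full URL:

```bash
# crontab -e — nightly compressed dump at 2 AM, prune dumps older than 7 days
0 2 * * * pg_dump "$DATABASE_URL" | gzip > /var/backups/pote/pote-$(date +\%F).sql.gz
30 2 * * * find /var/backups/pote -name 'pote-*.sql.gz' -mtime +7 -delete
```

Since `pg_dump`'s default output is plain SQL, a restore can be tested with `gunzip -c pote-YYYY-MM-DD.sql.gz | psql "$DATABASE_URL"` against a scratch database.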
### Performance
- Index frequently queried columns (already done in models)
- Use connection pooling for PostgreSQL
- Cache frequently accessed data
- Limit API rate if exposing publicly
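For connection pooling, SQLAlchemy's `create_engine` exposes the pool settings directly. A sketch with illustrative numbers, shown against a SQLite file so it runs anywhere — swap in your `postgresql://` URL, where `QueuePool` is already the default:

```python
from sqlalchemy import create_engine, text
from sqlalchemy.pool import QueuePool

# Pool numbers below are starting points, not tuned values.
engine = create_engine(
    "sqlite:///pooling_demo.db",  # stand-in for postgresql://...
    poolclass=QueuePool,   # default for PostgreSQL; explicit here for SQLite
    pool_size=5,           # persistent connections kept open
    max_overflow=10,       # extra connections allowed under burst load
    pool_pre_ping=True,    # check connections are alive before handing out
    pool_recycle=1800,     # recycle connections older than 30 minutes
)

with engine.connect() as conn:
    result = conn.execute(text("SELECT 1")).scalar()
```

`pool_pre_ping` matters most for long-running services, where the database may drop idle connections between cron runs.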
### Compliance
- Review data retention policy
- Add disclaimers to any UI ("research only, not advice")
- Document data sources and update frequency
- Keep audit logs of data ingestion
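For the audit-log item, even an append-only JSON-lines file answers "what was ingested, and when". A hypothetical helper — the function name, fields, and `logs/audit.jsonl` path are all assumptions, not existing project code:

```python
import json
from datetime import datetime, timezone
from pathlib import Path

AUDIT_LOG = Path("logs/audit.jsonl")  # illustrative location

def record_ingestion(source: str, rows: int) -> dict:
    """Append one JSON line per ingestion run and return the entry."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "source": source,
        "rows_ingested": rows,
    }
    AUDIT_LOG.parent.mkdir(parents=True, exist_ok=True)
    with AUDIT_LOG.open("a") as fh:
        fh.write(json.dumps(entry) + "\n")
    return entry
```

Calling `record_ingestion("house_stock_watcher", 120)` at the end of each ingestion script gives a grep-able, tamper-evident trail without any schema changes.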
## Monitoring & Logs

### Basic Logging Setup

```python
# Add to scripts/fetch_congressional_trades.py
import logging
import os
from logging.handlers import RotatingFileHandler

# Create logs directory
os.makedirs("logs", exist_ok=True)

# Configure logging
handler = RotatingFileHandler(
    "logs/ingestion.log",
    maxBytes=10_000_000,  # 10 MB
    backupCount=5,
)
handler.setFormatter(logging.Formatter(
    '%(asctime)s [%(levelname)s] %(name)s: %(message)s'
))
logger = logging.getLogger()
logger.setLevel(logging.INFO)  # root logger defaults to WARNING
logger.addHandler(handler)
```
### Health Check Endpoint (Optional)

```python
# Add to pote/api/main.py (when building API)
from fastapi import FastAPI

app = FastAPI()

@app.get("/health")
def health_check():
    from pote.db import SessionLocal
    from sqlalchemy import text

    try:
        with SessionLocal() as session:
            session.execute(text("SELECT 1"))
        return {"status": "ok", "database": "connected"}
    except Exception as e:
        return {"status": "error", "message": str(e)}
```
## Cost Estimates (Monthly)
| Option | Cost | Notes |
|---|---|---|
| Local Dev | $0 | SQLite, your machine |
| VPS (DigitalOcean, Linode) | $5-12 | Small droplet + managed Postgres |
| AWS (small) | $20-50 | t3.micro EC2 + db.t3.micro RDS |
| Fly.io / Railway | $5-15 | Hobby tier, managed |
| Docker on VPS | $10-20 | One droplet, Docker Compose |
**Free tier options:**
- Railway: Free tier available (limited hours)
- Fly.io: Free tier available (limited resources)
- Oracle Cloud: Always-free tier (ARM instances)
## Next Steps After Deployment

- **Verify ingestion**: Check logs after first cron run
- **Test queries**: Ensure data is accessible
- **Monitor growth**: Database size, query performance
- **Plan backups**: Set up automated DB dumps
- **Document access**: How to query, who has access
For Phase 2 (Analytics), you'll add:
- Scheduled jobs for computing returns
- Clustering jobs (weekly/monthly)
- Optional dashboard deployment
## Quick Deploy (Railway Example)
Railway is probably the easiest for personal projects:
```bash
# Install Railway CLI
npm install -g @railway/cli

# Login
railway login

# Initialize
railway init

# Add PostgreSQL
railway add --database postgres

# Deploy
railway up

# Add environment variables via dashboard
# DATABASE_URL is auto-configured
```
**Cost**: ~$5/mo, scales automatically
See `docs/05_dev_setup.md` for local development details.