- PR1: Project scaffold, DB models, price loader
- PR2: Congressional trade ingestion (House Stock Watcher)
- PR3: Security enrichment + deployment infrastructure
- 37 passing tests, 87%+ coverage
- Docker + Proxmox deployment ready
- Complete documentation
- Works 100% offline with fixtures
# Deployment Guide

## Deployment Options

POTE can be deployed in several ways depending on your needs:

1. **Local Development** (SQLite) - What you have now ✅
2. **Single Server** (PostgreSQL + cron jobs)
3. **Docker** (Containerized, easy to move)
4. **Cloud** (AWS/GCP/Azure with managed DB)
## Option 1: Local Development (Current Setup) ✅

You're already running this!

```bash
# Setup (done)
make install
source venv/bin/activate
make migrate

# Ingest data
python scripts/ingest_from_fixtures.py                   # Offline
python scripts/fetch_congressional_trades.py --days 30   # With internet

# Query
python
```

```python
>>> from pote.db import SessionLocal
>>> from pote.db.models import Official
>>> with SessionLocal() as session:
...     officials = session.query(Official).all()
...     print(f"Total officials: {len(officials)}")
```
**Pros**: Simple, fast, no costs

**Cons**: Local only, SQLite limitations for heavy queries
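If you stay on SQLite for a while, one common mitigation for its single-writer limitation is enabling WAL (write-ahead logging) mode, which lets readers proceed while a write is in progress. A minimal sketch — the `pote.db` filename is an assumption; use whatever file your `DATABASE_URL` points at:

```python
import sqlite3

# Enable write-ahead logging so readers don't block behind the single writer.
# WAL mode is persistent: setting it once sticks for the database file.
conn = sqlite3.connect("pote.db")  # filename is illustrative
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
print(mode)  # "wal" for a file-backed database
conn.close()
```

This is a one-time, per-file setting; it does not remove SQLite's limits for heavy analytical queries, it only reduces reader/writer contention.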
## Option 2: Single Server with PostgreSQL

### Setup PostgreSQL

```bash
# Install PostgreSQL (Ubuntu/Debian)
sudo apt update
sudo apt install postgresql postgresql-contrib

# Create database
sudo -u postgres psql
postgres=# CREATE DATABASE pote;
postgres=# CREATE USER poteuser WITH PASSWORD 'your_secure_password';
postgres=# GRANT ALL PRIVILEGES ON DATABASE pote TO poteuser;
postgres=# \q
```
### Update Configuration

```bash
# Edit .env
DATABASE_URL=postgresql://poteuser:your_secure_password@localhost:5432/pote

# Run migrations
source venv/bin/activate
make migrate
```
### Schedule Regular Ingestion

```bash
# Add to crontab: crontab -e

# Fetch trades daily at 6 AM
0 6 * * * cd /path/to/pote && /path/to/pote/venv/bin/python scripts/fetch_congressional_trades.py --days 7 >> /var/log/pote/trades.log 2>&1

# Enrich securities weekly on Sunday at 3 AM
0 3 * * 0 cd /path/to/pote && /path/to/pote/venv/bin/python scripts/enrich_securities.py >> /var/log/pote/enrich.log 2>&1

# Fetch prices for all tickers daily at 7 AM
0 7 * * * cd /path/to/pote && /path/to/pote/venv/bin/python scripts/update_all_prices.py >> /var/log/pote/prices.log 2>&1
```
**Pros**: Production-ready, full SQL features, scheduled jobs

**Cons**: Requires server management, PostgreSQL setup
## Option 3: Docker Deployment

### Create Dockerfile

```dockerfile
# Dockerfile
FROM python:3.11-slim

WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y \
    gcc \
    postgresql-client \
    && rm -rf /var/lib/apt/lists/*

# Copy project files
COPY pyproject.toml .
COPY src/ src/
COPY alembic/ alembic/
COPY alembic.ini .
COPY scripts/ scripts/

# Install Python dependencies
RUN pip install --no-cache-dir -e .

# Run migrations on startup
CMD ["sh", "-c", "alembic upgrade head && python scripts/fetch_congressional_trades.py --days 30"]
```
### Docker Compose Setup

```yaml
# docker-compose.yml
version: '3.8'

services:
  db:
    image: postgres:15
    environment:
      POSTGRES_DB: pote
      POSTGRES_USER: poteuser
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    volumes:
      - postgres_data:/var/lib/postgresql/data
    ports:
      - "5432:5432"

  pote:
    build: .
    environment:
      DATABASE_URL: postgresql://poteuser:${POSTGRES_PASSWORD}@db:5432/pote
      QUIVERQUANT_API_KEY: ${QUIVERQUANT_API_KEY}
      FMP_API_KEY: ${FMP_API_KEY}
    depends_on:
      - db
    volumes:
      - ./logs:/app/logs

  # Optional: FastAPI backend (Phase 3)
  api:
    build: .
    command: uvicorn pote.api.main:app --host 0.0.0.0 --port 8000
    environment:
      DATABASE_URL: postgresql://poteuser:${POSTGRES_PASSWORD}@db:5432/pote
    depends_on:
      - db
    ports:
      - "8000:8000"

volumes:
  postgres_data:
```
### Deploy with Docker

```bash
# Create .env file
cat > .env << EOF
POSTGRES_PASSWORD=your_secure_password
DATABASE_URL=postgresql://poteuser:your_secure_password@db:5432/pote
QUIVERQUANT_API_KEY=
FMP_API_KEY=
EOF

# Build and run
docker-compose up -d

# Run migrations
docker-compose exec pote alembic upgrade head

# Ingest data
docker-compose exec pote python scripts/fetch_congressional_trades.py --days 30

# View logs
docker-compose logs -f pote
```
**Pros**: Portable, isolated, easy to deploy anywhere

**Cons**: Requires Docker knowledge, slightly more complex
## Option 4: Cloud Deployment (AWS Example)

### AWS Architecture

```
┌─────────────────┐
│  EC2 Instance   │
│  - Python app   │
│  - Cron jobs    │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  RDS (Postgres) │
│  - Managed DB   │
└─────────────────┘
```
### Setup Steps

1. **Create RDS PostgreSQL Instance**
   - Go to AWS RDS Console
   - Create PostgreSQL 15 database
   - Note the endpoint: `pote-db.xxxxx.us-east-1.rds.amazonaws.com`
   - Security group: Allow port 5432 from EC2

2. **Launch EC2 Instance**

   ```bash
   # SSH into EC2
   ssh -i your-key.pem ubuntu@your-ec2-ip

   # Install dependencies
   sudo apt update
   sudo apt install python3.11 python3-pip git

   # Clone repo
   git clone <your-repo>
   cd pote

   # Setup
   python3 -m venv venv
   source venv/bin/activate
   pip install -e .

   # Configure
   cat > .env << EOF
   DATABASE_URL=postgresql://poteuser:password@pote-db.xxxxx.us-east-1.rds.amazonaws.com:5432/pote
   EOF

   # Run migrations
   alembic upgrade head

   # Setup cron jobs
   crontab -e
   # (Add the cron jobs from Option 2)
   ```

3. **Optional: Use AWS Lambda for scheduled jobs**
   - Package app as Lambda function
   - Use EventBridge to trigger daily
   - Cheaper for infrequent jobs
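For the Lambda route, the handler can be a thin wrapper around the ingestion script, triggered by an EventBridge schedule rule. A hypothetical sketch — the `handler` name, the event shape, and the commented-out ingestion call are assumptions, not the project's actual code:

```python
import json
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def handler(event, context):
    """Entry point invoked by an EventBridge schedule (e.g. daily)."""
    days = int(event.get("days", 7))  # look-back window passed in the event
    try:
        # Here you would call the project's ingestion logic, roughly:
        #   fetch_congressional_trades(days=days)
        logger.info("Fetching trades for the last %d days", days)
        return {"statusCode": 200, "body": json.dumps({"days": days})}
    except Exception as exc:  # surface failures to CloudWatch Logs
        logger.exception("Ingestion failed")
        return {"statusCode": 500, "body": json.dumps({"error": str(exc)})}
```

Package dependencies (SQLAlchemy, drivers) with the function, and give its execution role network access to the RDS instance's security group.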
**Pros**: Scalable, managed database, reliable

**Cons**: Costs money (~$20-50/mo for small RDS + EC2)
## Option 5: Fly.io / Railway / Render (Easiest Cloud)

### Fly.io Example

```bash
# Install flyctl
curl -L https://fly.io/install.sh | sh

# Login
flyctl auth login

# Create fly.toml
cat > fly.toml << EOF
app = "pote-research"

[build]
builder = "paketobuildpacks/builder:base"

[env]
PORT = "8080"

[[services]]
internal_port = 8080
protocol = "tcp"

[[services.ports]]
port = 80

[postgres]
app = "pote-db"
EOF

# Create Postgres
flyctl postgres create --name pote-db

# Deploy
flyctl deploy

# Set secrets
flyctl secrets set DATABASE_URL="postgres://..."
```
**Pros**: Simple, cheap ($5-10/mo), automated deployments

**Cons**: Limited control, may need to adapt code
## Production Checklist

Before deploying to production:

### Security

- Change all default passwords
- Use environment variables for secrets (never commit `.env`)
- Enable SSL for database connections
- Set up firewall rules (only allow necessary ports)
- Use HTTPS if exposing API/dashboard
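For the SSL item, PostgreSQL client libraries accept an `sslmode` parameter in the connection URL, so appending it to `DATABASE_URL` is usually enough. A sketch with placeholder host and credentials:

```bash
# .env — require TLS for the database connection (libpq sslmode parameter)
DATABASE_URL=postgresql://poteuser:your_secure_password@db-host:5432/pote?sslmode=require
```

`sslmode=verify-full` additionally checks the server certificate and hostname, if you distribute the CA certificate to clients.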
### Reliability
- Set up database backups (daily)
- Configure logging (centralized if possible)
- Monitor disk space (especially for SQLite)
- Set up error alerts (email/Slack on failures)
- Test recovery from backup
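For the backup items, a common approach with PostgreSQL is a nightly `pg_dump` from cron. A sketch with illustrative paths and a 7-day retention — note that `%` must be escaped in crontab entries, and cron does not read your `.env`, so either define `DATABASE_URL` in the crontab or inline the full URL:

```bash
# crontab -e — nightly compressed dump at 2 AM, prune dumps older than 7 days
0 2 * * * pg_dump "$DATABASE_URL" | gzip > /var/backups/pote/pote-$(date +\%F).sql.gz
30 2 * * * find /var/backups/pote -name 'pote-*.sql.gz' -mtime +7 -delete
```

Since `pg_dump`'s default output is plain SQL, a restore can be tested with `gunzip -c pote-YYYY-MM-DD.sql.gz | psql "$DATABASE_URL"` against a scratch database.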
### Performance
- Index frequently queried columns (already done in models)
- Use connection pooling for PostgreSQL
- Cache frequently accessed data
- Limit API rate if exposing publicly
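For connection pooling, SQLAlchemy's `create_engine` exposes the pool settings directly. A sketch with illustrative numbers, shown against a SQLite file so it runs anywhere — swap in your `postgresql://` URL, where `QueuePool` is already the default:

```python
from sqlalchemy import create_engine, text
from sqlalchemy.pool import QueuePool

# Pool numbers below are starting points, not tuned values.
engine = create_engine(
    "sqlite:///pooling_demo.db",  # stand-in for postgresql://...
    poolclass=QueuePool,   # default for PostgreSQL; explicit here for SQLite
    pool_size=5,           # persistent connections kept open
    max_overflow=10,       # extra connections allowed under burst load
    pool_pre_ping=True,    # check connections are alive before handing out
    pool_recycle=1800,     # recycle connections older than 30 minutes
)

with engine.connect() as conn:
    result = conn.execute(text("SELECT 1")).scalar()
```

`pool_pre_ping` matters most for long-running services, where the database may drop idle connections between cron runs.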
### Compliance
- Review data retention policy
- Add disclaimers to any UI ("research only, not advice")
- Document data sources and update frequency
- Keep audit logs of data ingestion
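For the audit-log item, even an append-only JSON-lines file answers "what was ingested, and when". A hypothetical helper — the function name, fields, and `logs/audit.jsonl` path are all assumptions, not existing project code:

```python
import json
from datetime import datetime, timezone
from pathlib import Path

AUDIT_LOG = Path("logs/audit.jsonl")  # illustrative location

def record_ingestion(source: str, rows: int) -> dict:
    """Append one JSON line per ingestion run and return the entry."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "source": source,
        "rows_ingested": rows,
    }
    AUDIT_LOG.parent.mkdir(parents=True, exist_ok=True)
    with AUDIT_LOG.open("a") as fh:
        fh.write(json.dumps(entry) + "\n")
    return entry
```

Calling `record_ingestion("house_stock_watcher", 120)` at the end of each ingestion script gives a grep-able, tamper-evident trail without any schema changes.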
## Monitoring & Logs

### Basic Logging Setup

```python
# Add to scripts/fetch_congressional_trades.py
import logging
import os
from logging.handlers import RotatingFileHandler

# Create logs directory
os.makedirs("logs", exist_ok=True)

# Configure logging
handler = RotatingFileHandler(
    "logs/ingestion.log",
    maxBytes=10_000_000,  # 10 MB
    backupCount=5,
)
handler.setFormatter(logging.Formatter(
    '%(asctime)s [%(levelname)s] %(name)s: %(message)s'
))
logger = logging.getLogger()
logger.setLevel(logging.INFO)  # root logger defaults to WARNING
logger.addHandler(handler)
```
### Health Check Endpoint (Optional)

```python
# Add to pote/api/main.py (when building API)
from fastapi import FastAPI

app = FastAPI()

@app.get("/health")
def health_check():
    from pote.db import SessionLocal
    from sqlalchemy import text

    try:
        with SessionLocal() as session:
            session.execute(text("SELECT 1"))
        return {"status": "ok", "database": "connected"}
    except Exception as e:
        return {"status": "error", "message": str(e)}
```
## Cost Estimates (Monthly)
| Option | Cost | Notes |
|---|---|---|
| Local Dev | $0 | SQLite, your machine |
| VPS (DigitalOcean, Linode) | $5-12 | Small droplet + managed Postgres |
| AWS (small) | $20-50 | t3.micro EC2 + db.t3.micro RDS |
| Fly.io / Railway | $5-15 | Hobby tier, managed |
| Docker on VPS | $10-20 | One droplet, Docker Compose |
**Free tier options:**
- Railway: Free tier available (limited hours)
- Fly.io: Free tier available (limited resources)
- Oracle Cloud: Always-free tier (ARM instances)
## Next Steps After Deployment

- **Verify ingestion**: Check logs after first cron run
- **Test queries**: Ensure data is accessible
- **Monitor growth**: Database size, query performance
- **Plan backups**: Set up automated DB dumps
- **Document access**: How to query, who has access
For Phase 2 (Analytics), you'll add:
- Scheduled jobs for computing returns
- Clustering jobs (weekly/monthly)
- Optional dashboard deployment
## Quick Deploy (Railway Example)
Railway is probably the easiest for personal projects:
```bash
# Install Railway CLI
npm install -g @railway/cli

# Login
railway login

# Initialize
railway init

# Add PostgreSQL
railway add --database postgres

# Deploy
railway up

# Add environment variables via dashboard
# DATABASE_URL is auto-configured
```
**Cost**: ~$5/mo, scales automatically
See `docs/05_dev_setup.md` for local development details.