From 3a89c1e6d2a74275d157e9cad26015e80db098d2 Mon Sep 17 00:00:00 2001
From: ilia
Date: Mon, 15 Dec 2025 14:55:05 -0500
Subject: [PATCH] Add comprehensive automation system
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

New Scripts:
- scripts/daily_fetch.sh: Automated daily data updates
  * Fetches congressional trades (last 7 days)
  * Enriches securities (name, sector, industry)
  * Updates price data for all securities
  * Calculates returns and metrics
  * Logs everything to logs/ directory

- scripts/setup_automation.sh: Interactive automation setup
  * Makes scripts executable
  * Creates log directories
  * Configures cron jobs (multiple schedule options)
  * Guides user through setup

Documentation:
- docs/10_automation.md: Complete automation guide
  * Explains disclosure timing (30-45 day legal lag)
  * Why daily updates are optimal (not hourly/real-time)
  * Cron job setup instructions
  * Systemd timer alternative
  * Email notifications (optional)
  * Monitoring and logging
  * Failure handling
  * Performance optimization

Key Insights:
❌ No real-time data possible (STOCK Act = 30-45 day lag)
✅ Daily updates are optimal
✅ Automated via cron jobs
✅ Handles API failures gracefully
✅ Logs everything for debugging
---
 docs/10_automation.md       | 509 ++++++++++++++++++++++++++++++++++++
 scripts/daily_fetch.sh      | 118 +++++++++
 scripts/setup_automation.sh | 150 +++++++++++
 3 files changed, 777 insertions(+)
 create mode 100644 docs/10_automation.md
 create mode 100755 scripts/daily_fetch.sh
 create mode 100755 scripts/setup_automation.sh

diff --git a/docs/10_automation.md b/docs/10_automation.md
new file mode 100644
index 0000000..180a7fc
--- /dev/null
+++ b/docs/10_automation.md
@@ -0,0 +1,509 @@
+# POTE Automation Guide
+**Automated Data Collection & Updates**
+
+---
+
+## ⏰ Understanding Disclosure Timing
+
+### **Reality Check: No Real-Time Data Exists**
+
+**Federal Law (STOCK Act):**
+- 📅 Congress members have **30-45 days** to disclose trades
+- 📅 Disclosures are filed as **Periodic Transaction Reports (PTRs)**
+- 📅 Public databases update **after** filing (usually next day)
+- 📅 **No real-time feed exists by design**
+
+**Example Timeline:**
+```
+Jan 15, 2024 → Representative buys NVDA
+Feb 14, 2024 → Disclosure filed (30 days later)
+Feb 15, 2024 → Appears on House Stock Watcher
+Feb 16, 2024 → Your system fetches it
+```
+
+### **Best Practice: Daily Updates**
+
+Since trades appear in batches (not continuously), **running once per day is optimal**:
+
+✅ **Daily (7 AM)** - Catches overnight filings
+✅ **After market close** - Prices are final
+✅ **Low server load** - Off-peak hours
+❌ **Hourly** - Wasteful, no new data
+❌ **Real-time** - Impossible, not how disclosures work
+
+---
+
+## 🤖 Automated Setup Options
+
+### **Option 1: Cron Job (Linux/Proxmox) - Recommended**
+
+#### **Setup on Proxmox Container**
+
+```bash
+# SSH to your container
+ssh poteapp@10.0.10.95
+
+# Edit crontab
+crontab -e
+
+# Add this line (runs daily at 7 AM):
+0 7 * * * /home/poteapp/pote/scripts/daily_fetch.sh
+
+# Or run twice daily (7 AM and 7 PM):
+0 7,19 * * * /home/poteapp/pote/scripts/daily_fetch.sh
+
+# Save and exit
+```
+
+**What it does:**
+- Fetches new congressional trades (last 7 days)
+- Enriches any new securities (name, sector, industry)
+- Updates price data for all securities
+- Logs everything to `logs/daily_fetch_YYYYMMDD.log`
+
+**Check logs:**
+```bash
+tail -f ~/pote/logs/daily_fetch_$(date +%Y%m%d).log
+```
+
+---
+
+### **Option 2: Systemd Timer (More Advanced)**
+
+For better logging and service management:
+
+#### **Create Service File**
+
+```bash
+sudo nano /etc/systemd/system/pote-fetch.service
+```
+
+```ini
+[Unit]
+Description=POTE Daily Data Fetch
+After=network.target postgresql.service
+
+[Service]
+Type=oneshot
+User=poteapp
+WorkingDirectory=/home/poteapp/pote
+ExecStart=/home/poteapp/pote/scripts/daily_fetch.sh
+StandardOutput=journal
+StandardError=journal
+
+[Install]
+WantedBy=multi-user.target
+```
+
+#### **Create Timer File**
+
+```bash
+sudo nano /etc/systemd/system/pote-fetch.timer
+```
+
+```ini
+[Unit]
+Description=POTE Daily Data Fetch Timer
+# No Requires= here: it would also start the service whenever the timer unit starts
+
+[Timer]
+# Fire once per day at 07:00 (each extra OnCalendar= line adds another trigger)
+OnCalendar=*-*-* 07:00:00
+Persistent=true
+
+[Install]
+WantedBy=timers.target
+```
+
+#### **Enable and Start**
+
+```bash
+sudo systemctl daemon-reload
+sudo systemctl enable pote-fetch.timer
+sudo systemctl start pote-fetch.timer
+
+# Check status
+sudo systemctl status pote-fetch.timer
+sudo systemctl list-timers
+
+# View logs
+sudo journalctl -u pote-fetch.service -f
+```
+
+---
+
+### **Option 3: Manual Script (For Testing)**
+
+Run manually whenever you want:
+
+```bash
+cd /home/user/Documents/code/pote
+./scripts/daily_fetch.sh
+```
+
+Or from anywhere:
+
+```bash
+/home/user/Documents/code/pote/scripts/daily_fetch.sh
+```
+
+---
+
+## 📊 What Gets Updated?
+
+### **1. Congressional Trades**
+**Script:** `fetch_congressional_trades.py`
+**Frequency:** Daily
+**Fetches:** Last 7 days (catches late filings)
+**API:** House Stock Watcher (when available)
+
+**Alternative sources:**
+- Manual CSV import
+- QuiverQuant API (paid)
+- Capitol Trades (paid)
+
+### **2. Security Enrichment**
+**Script:** `enrich_securities.py`
+**Frequency:** Daily (only updates new tickers)
+**Fetches:** Company name, sector, industry
+**API:** yfinance (free)
+
+### **3. Price Data**
+**Script:** `fetch_sample_prices.py`
+**Frequency:** Daily
+**Fetches:** Historical prices for all securities
+**API:** yfinance (free)
+**Smart:** Only fetches missing date ranges (efficient)
+
+### **4. Analytics (Optional)**
+**Script:** `calculate_all_returns.py`
+**Frequency:** Daily (or on-demand)
+**Calculates:** Returns, alpha, performance metrics
+
+---
+
+## ⚙️ Customizing the Schedule
+
+### **Different Frequencies**
+
+```bash
+# Every 6 hours
+0 */6 * * * /home/poteapp/pote/scripts/daily_fetch.sh
+
+# Twice daily (morning and evening)
+0 7,19 * * * /home/poteapp/pote/scripts/daily_fetch.sh
+
+# Weekdays only (business days)
+0 7 * * 1-5 /home/poteapp/pote/scripts/daily_fetch.sh
+
+# Once per week (Sunday at midnight)
+0 0 * * 0 /home/poteapp/pote/scripts/daily_fetch.sh
+```
+
+### **Best Practice Recommendations**
+
+**For Active Research:**
+- **Daily at 7 AM** (catches overnight filings)
+- **Weekdays only** (Congress rarely files on weekends)
+
+**For Casual Tracking:**
+- **Weekly** (Sunday night)
+- **Twice a month** (1st and 15th)
+
+**For Development:**
+- **Manual runs** (on-demand testing)
+
+---
+
+## 📧 Email Notifications (Optional)
+
+### **Setup Email Alerts**
+
+Add to your cron job:
+
+```bash
+# Install mail utility
+sudo apt install mailutils
+
+# Add to crontab with email
+MAILTO=your-email@example.com
+0 7 * * * /home/poteapp/pote/scripts/daily_fetch.sh
+```
+
+### **Custom Email Script**
+
+Create `scripts/email_summary.py`:
+
+```python
+#!/usr/bin/env python
+"""Email daily summary of new trades."""
+
+import smtplib
+from email.mime.text import MIMEText
+from email.mime.multipart import MIMEMultipart
+from datetime import date, timedelta
+from sqlalchemy import text
+from pote.db import engine
+
+def get_new_trades(days=1):
+    """Get trades from last N days."""
+    since = date.today() - timedelta(days=days)
+
+    with engine.connect() as conn:
+        result = conn.execute(text("""
+            SELECT o.name, s.ticker, t.side, t.transaction_date, t.value_min, t.value_max
+            FROM trades t
+            JOIN officials o ON t.official_id = o.id
+            JOIN securities s ON t.security_id = s.id
+            WHERE t.created_at >= :since
+            ORDER BY t.transaction_date DESC
+        """), {"since": since})
+
+        return result.fetchall()
+
+def send_email(to_email, trades):
+    """Send email summary."""
+    if not trades:
+        print("No new trades to report")
+        return
+
+    # Compose email
+    subject = f"POTE: {len(trades)} New Congressional Trades"
+
+    body = f"<h2>New Trades ({len(trades)})</h2>\n"
+    body += "<table><tr><th>Official</th><th>Ticker</th><th>Side</th><th>Date</th><th>Value</th></tr>"
+
+    for trade in trades:
+        name, ticker, side, trade_date, vmin, vmax = trade
+        value = f"${vmin:,.0f}-${vmax:,.0f}" if vmax else f"${vmin:,.0f}+"
+        body += f"<tr><td>{name}</td><td>{ticker}</td><td>{side}</td><td>{trade_date}</td><td>{value}</td></tr>"
+
+    body += "</table>"
+
+    # Send email (configure SMTP settings)
+    msg = MIMEMultipart()
+    msg['From'] = "pote@yourserver.com"
+    msg['To'] = to_email
+    msg['Subject'] = subject
+    msg.attach(MIMEText(body, 'html'))
+
+    # Configure your SMTP server
+    # server = smtplib.SMTP('smtp.gmail.com', 587)
+    # server.starttls()
+    # server.login("your-email@gmail.com", "your-password")
+    # server.send_message(msg)
+    # server.quit()
+
+    print(f"Would send email to {to_email}")
+
+if __name__ == "__main__":
+    trades = get_new_trades(days=1)
+    send_email("your-email@example.com", trades)
+```
+
+Then add to `daily_fetch.sh`:
+
+```bash
+# Near the end of daily_fetch.sh, before the final `exit 0`
+python scripts/email_summary.py
+```
+
+---
+
+## 🔍 Monitoring & Logging
+
+### **Check Cron Job Status**
+
+```bash
+# View cron jobs
+crontab -l
+
+# Check if cron is running
+sudo systemctl status cron
+
+# View cron logs
+grep CRON /var/log/syslog | tail -20
+```
+
+### **Check POTE Logs**
+
+```bash
+# Today's log
+tail -f ~/pote/logs/daily_fetch_$(date +%Y%m%d).log
+
+# All logs
+ls -lh ~/pote/logs/
+
+# Last 100 lines of latest log
+tail -n 100 "$(ls -t ~/pote/logs/daily_fetch_*.log | head -n 1)"
+```
+
+### **Log Rotation (Keep Disk Space Clean)**
+
+Add to `/etc/logrotate.d/pote`:
+
+```
+/home/poteapp/pote/logs/*.log {
+    daily
+    rotate 30
+    compress
+    delaycompress
+    missingok
+    notifempty
+}
+```
+
+---
+
+## 🚨 Handling Failures
+
+### **What If House Stock Watcher Is Down?**
+
+The script is designed to continue even if one step fails:
+
+```bash
+# Script continues and logs warnings
+⚠️ WARNING: Failed to fetch congressional trades
+ This is likely because House Stock Watcher API is down
+ Continuing with other steps...
+```
+
+**Fallback options:**
+1. **Manual import:** Use CSV import when API is down
+2. **Alternative APIs:** QuiverQuant, Capitol Trades
+3. **Check logs:** Review what failed and why
+
+### **Automatic Retry Logic**
+
+Edit `scripts/fetch_congressional_trades.py` to add retries:
+
+```python
+import time
+from requests.exceptions import RequestException
+
+MAX_RETRIES = 3
+RETRY_DELAY = 300 # 5 minutes
+
+for attempt in range(MAX_RETRIES):
+    try:
+        trades = client.fetch_recent_transactions(days=7)
+        break
+    except RequestException as e:
+        if attempt < MAX_RETRIES - 1:
+            logger.warning(f"Attempt {attempt+1} failed ({e}), retrying in {RETRY_DELAY}s...")
+            time.sleep(RETRY_DELAY)
+        else:
+            logger.error("All retry attempts failed")
+            raise
+```
+
+---
+
+## 📈 Performance Optimization
+
+### **Batch Processing**
+
+For large datasets, fetch in batches:
+
+```bash
+# Fetch trades in smaller date ranges
+python scripts/fetch_congressional_trades.py --start-date 2024-01-01 --end-date 2024-01-31
+python scripts/fetch_congressional_trades.py --start-date 2024-02-01 --end-date 2024-02-29
+```
+
+### **Parallel Processing**
+
+Use GNU Parallel for faster price fetching:
+
+```bash
+# Install parallel
+sudo apt install parallel
+
+# Fetch prices in parallel (4 at a time)
+python -c "from pote.db import get_session; from pote.db.models import Security;
+session = next(get_session());
+tickers = [s.ticker for s in session.query(Security).all()];
+print('\n'.join(tickers))" | \
+parallel -j 4 python scripts/fetch_prices_single.py {}
+```
+
+### **Database Indexing**
+
+Ensure indexes are created (already in migrations):
+
+```sql
+CREATE INDEX IF NOT EXISTS ix_trades_transaction_date ON trades(transaction_date);
+CREATE INDEX IF NOT EXISTS ix_prices_date ON prices(date);
+CREATE INDEX IF NOT EXISTS ix_prices_security_id ON prices(security_id);
+```
+
+---
+
+## 🎯 Recommended Setup
+
+### **For Proxmox Production:**
+
+```bash
+# 1. Setup daily cron job
+crontab -e
+# Add: 0 7 * * * /home/poteapp/pote/scripts/daily_fetch.sh
+
+# 2. Enable log rotation
+sudo nano /etc/logrotate.d/pote
+# Add log rotation config
+
+# 3. Setup monitoring (optional)
+python scripts/email_summary.py
+
+# 4. Test manually first
+./scripts/daily_fetch.sh
+```
+
+### **For Local Development:**
+
+```bash
+# Run manually when needed
+./scripts/daily_fetch.sh
+
+# Or setup quick alias
+echo "alias pote-update='~/Documents/code/pote/scripts/daily_fetch.sh'" >> ~/.bashrc
+source ~/.bashrc
+
+# Then just run:
+pote-update
+```
+
+---
+
+## 📝 Summary
+
+### **Key Points:**
+
+1. **No real-time data exists** - Congressional trades have 30-45 day lag by law
+2. **Daily updates are optimal** - Running hourly is wasteful
+3. **Automated via cron** - Set it and forget it
+4. **Handles failures gracefully** - Continues even if one API is down
+5. **Logs everything** - Easy to monitor and debug
+
+### **Quick Setup:**
+
+```bash
+# On Proxmox
+crontab -e
+# Add: 0 7 * * * /home/poteapp/pote/scripts/daily_fetch.sh
+
+# Test it
+./scripts/daily_fetch.sh
+
+# Check logs
+tail -f logs/daily_fetch_*.log
+```
+
+### **Data Freshness Expectations:**
+
+- **Best case:** Trades from yesterday (if official filed overnight)
+- **Typical:** Trades from 30-45 days ago
+- **Worst case:** Official filed late or hasn't filed yet
+
+**This is normal and expected** - you're working with disclosure data, not market data.
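The freshness expectations above come down to simple date arithmetic. The following is a minimal sketch, assuming the 30-45 day window quoted in this guide; `disclosure_window` and `days_stale` are hypothetical helpers for illustration and are not part of POTE.

```python
from datetime import date, timedelta

# Assumption: the 30-45 day disclosure window quoted in this guide.
MIN_LAG_DAYS = 30
MAX_LAG_DAYS = 45


def disclosure_window(transaction_date: date) -> tuple[date, date]:
    """Return the (earliest, latest) dates a filing is typically expected."""
    earliest = transaction_date + timedelta(days=MIN_LAG_DAYS)
    latest = transaction_date + timedelta(days=MAX_LAG_DAYS)
    return earliest, latest


def days_stale(transaction_date: date, fetched_on: date) -> int:
    """Age of a trade, in days, on the day the daily job first stores it."""
    return (fetched_on - transaction_date).days


if __name__ == "__main__":
    trade = date(2024, 3, 1)
    earliest, latest = disclosure_window(trade)
    print(f"Trade on {trade}: filing typically expected {earliest} to {latest}")
    print(f"Fetched on 2024-04-05: {days_stale(trade, date(2024, 4, 5))} days old")
```

Late filings fall outside this window, so the result is best read as a typical range rather than a guarantee.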
+
diff --git a/scripts/daily_fetch.sh b/scripts/daily_fetch.sh
new file mode 100755
index 0000000..2aa7606
--- /dev/null
+++ b/scripts/daily_fetch.sh
@@ -0,0 +1,118 @@
+#!/bin/bash
+# Daily POTE Data Update Script
+# Run this once per day to fetch new trades and prices
+# Recommended: 7 AM daily (after markets close and disclosures are filed)
+
+set -e # Exit on error (fetch steps below capture their own exit codes)
+
+# --- Configuration ---
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+PROJECT_DIR="$(dirname "$SCRIPT_DIR")"
+LOG_DIR="${PROJECT_DIR}/logs"
+LOG_FILE="${LOG_DIR}/daily_fetch_$(date +%Y%m%d).log"
+
+# Ensure log directory exists
+mkdir -p "$LOG_DIR"
+
+# Mirror all output (stdout and stderr) to the log file
+exec > >(tee -a "$LOG_FILE") 2>&1
+
+echo "=========================================="
+echo " POTE Daily Data Fetch"
+echo " $(date)"
+echo "=========================================="
+
+# Activate virtual environment
+cd "$PROJECT_DIR"
+source venv/bin/activate
+
+# --- Step 1: Fetch Congressional Trades ---
+echo ""
+echo "--- Step 1: Fetching Congressional Trades ---"
+# Fetch last 7 days (to catch any late filings); "||" keeps set -e from aborting
+TRADES_EXIT=0
+python scripts/fetch_congressional_trades.py --days 7 || TRADES_EXIT=$?
+
+if [ $TRADES_EXIT -ne 0 ]; then
+    echo "⚠️ WARNING: Failed to fetch congressional trades"
+    echo " This is likely because House Stock Watcher API is down"
+    echo " Continuing with other steps..."
+fi
+
+# --- Step 2: Enrich Securities ---
+echo ""
+echo "--- Step 2: Enriching Securities ---"
+# Add company names, sectors, industries for any new tickers
+ENRICH_EXIT=0
+python scripts/enrich_securities.py || ENRICH_EXIT=$?
+
+if [ $ENRICH_EXIT -ne 0 ]; then
+    echo "⚠️ WARNING: Failed to enrich securities"
+fi
+
+# --- Step 3: Fetch Price Data ---
+echo ""
+echo "--- Step 3: Fetching Price Data ---"
+# Fetch prices for all securities
+PRICES_EXIT=0
+python scripts/fetch_sample_prices.py || PRICES_EXIT=$?
+
+if [ $PRICES_EXIT -ne 0 ]; then
+    echo "⚠️ WARNING: Failed to fetch price data"
+fi
+
+# --- Step 4: Calculate Returns (Optional) ---
+echo ""
+echo "--- Step 4: Calculating Returns ---"
+CALC_EXIT=0
+python scripts/calculate_all_returns.py --window 90 --limit 100 || CALC_EXIT=$?
+
+if [ $CALC_EXIT -ne 0 ]; then
+    echo "⚠️ WARNING: Failed to calculate returns"
+fi
+
+# --- Summary ---
+echo ""
+echo "=========================================="
+echo " Daily Fetch Complete"
+echo " $(date)"
+echo "=========================================="
+
+# Show quick stats
+python << 'PYEOF'
+from sqlalchemy import text
+from pote.db import engine
+from datetime import datetime
+
+print("\n📊 Current Database Stats:")
+with engine.connect() as conn:
+    officials = conn.execute(text("SELECT COUNT(*) FROM officials")).scalar()
+    trades = conn.execute(text("SELECT COUNT(*) FROM trades")).scalar()
+    securities = conn.execute(text("SELECT COUNT(*) FROM securities")).scalar()
+    prices = conn.execute(text("SELECT COUNT(*) FROM prices")).scalar()
+
+    print(f" Officials: {officials:,}")
+    print(f" Securities: {securities:,}")
+    print(f" Trades: {trades:,}")
+    print(f" Prices: {prices:,}")
+
+    # Show most recent trade
+    result = conn.execute(text("""
+        SELECT o.name, s.ticker, t.side, t.transaction_date
+        FROM trades t
+        JOIN officials o ON t.official_id = o.id
+        JOIN securities s ON t.security_id = s.id
+        ORDER BY t.transaction_date DESC
+        LIMIT 1
+    """)).fetchone()
+
+    if result:
+        print(f"\n📈 Most Recent Trade:")
+        print(f" {result[0]} - {result[2].upper()} {result[1]} on {result[3]}")
+
+print()
+PYEOF
+
+# Exit with success (even if some steps warned)
+exit 0
+
diff --git a/scripts/setup_automation.sh b/scripts/setup_automation.sh
new file mode 100755
index 0000000..dc0a4c9
--- /dev/null
+++ b/scripts/setup_automation.sh
@@ -0,0 +1,150 @@
+#!/bin/bash
+# Setup Automation for POTE
+# Run this once on your Proxmox container to enable daily updates
+
+set -e
+
+echo "=========================================="
+echo " POTE Automation Setup"
+echo "=========================================="
+
+# Detect if we're root or regular user
+if [ "$EUID" -eq 0 ]; then
+    echo "⚠️ Running as root. Will setup for poteapp user."
+    TARGET_USER="poteapp"
+    TARGET_HOME="/home/poteapp"
+else
+    TARGET_USER="$USER"
+    TARGET_HOME="$HOME"
+fi
+
+POTE_DIR="${TARGET_HOME}/pote"
+
+# Check if POTE directory exists
+if [ ! -d "$POTE_DIR" ]; then
+    echo "❌ Error: POTE directory not found at $POTE_DIR"
+    echo " Please clone the repository first."
+    exit 1
+fi
+
+echo "✅ Found POTE at: $POTE_DIR"
+
+# Make scripts executable
+echo ""
+echo "Making scripts executable..."
+chmod +x "${POTE_DIR}/scripts/daily_fetch.sh"
+chmod +x "${POTE_DIR}/scripts/fetch_congressional_trades.py"
+chmod +x "${POTE_DIR}/scripts/enrich_securities.py"
+chmod +x "${POTE_DIR}/scripts/fetch_sample_prices.py"
+
+# Create logs directory
+echo "Creating logs directory..."
+mkdir -p "${POTE_DIR}/logs"
+
+# Check that the virtual environment activates (daily_fetch.sh itself is not run here)
+echo ""
+echo "Checking virtual environment..."
+echo "Python version in venv:"
+cd "$POTE_DIR"
+
+if [ "$EUID" -eq 0 ]; then
+    su - "$TARGET_USER" -c "cd ${POTE_DIR} && source venv/bin/activate && python --version"
+else
+    source venv/bin/activate
+    python --version
+fi
+
+# Setup cron job
+echo ""
+echo "=========================================="
+echo " Cron Job Setup"
+echo "=========================================="
+echo ""
+echo "Choose schedule:"
+echo " 1) Daily at 7 AM (recommended)"
+echo " 2) Twice daily (7 AM and 7 PM)"
+echo " 3) Weekdays only at 7 AM"
+echo " 4) Custom (I'll help you configure)"
+echo " 5) Skip (manual setup)"
+echo ""
+read -r -p "Enter choice [1-5]: " choice
+
+CRON_LINE=""
+
+case $choice in
+    1)
+        CRON_LINE="0 7 * * * ${POTE_DIR}/scripts/daily_fetch.sh"
+        ;;
+    2)
+        CRON_LINE="0 7,19 * * * ${POTE_DIR}/scripts/daily_fetch.sh"
+        ;;
+    3)
+        CRON_LINE="0 7 * * 1-5 ${POTE_DIR}/scripts/daily_fetch.sh"
+        ;;
+    4)
+        echo ""
+        echo "Cron format: MIN HOUR DAY MONTH WEEKDAY"
+        echo "Examples:"
+        echo " 0 7 * * * = Daily at 7 AM"
+        echo " 0 */6 * * * = Every 6 hours"
+        echo " 0 0 * * 0 = Weekly on Sunday"
+        read -r -p "Enter cron schedule: " custom_schedule
+        CRON_LINE="${custom_schedule} ${POTE_DIR}/scripts/daily_fetch.sh"
+        ;;
+    5)
+        echo "Skipping cron setup. You can add manually with:"
+        echo " crontab -e"
+        echo " Add: 0 7 * * * ${POTE_DIR}/scripts/daily_fetch.sh"
+        CRON_LINE=""
+        ;;
+    *)
+        echo "Invalid choice. Skipping cron setup."
+        CRON_LINE=""
+        ;;
+esac
+
+if [ -n "$CRON_LINE" ]; then
+    echo ""
+    echo "Adding to crontab: $CRON_LINE"
+
+    if [ "$EUID" -eq 0 ]; then
+        # Add as target user, replacing any existing daily_fetch.sh entry so reruns do not duplicate it
+        (su - "$TARGET_USER" -c "crontab -l" 2>/dev/null | grep -vF "${POTE_DIR}/scripts/daily_fetch.sh" || true; echo "$CRON_LINE") | \
+            su - "$TARGET_USER" -c "crontab -"
+    else
+        # Add as current user, replacing any existing daily_fetch.sh entry so reruns do not duplicate it
+        (crontab -l 2>/dev/null | grep -vF "${POTE_DIR}/scripts/daily_fetch.sh" || true; echo "$CRON_LINE") | crontab -
+    fi
+
+    echo "✅ Cron job added!"
+    echo ""
+    echo "View with: crontab -l"
+fi
+
+# Summary
+echo ""
+echo "=========================================="
+echo " Setup Complete!"
+echo "=========================================="
+echo ""
+echo "📝 What was configured:"
+echo " ✅ Scripts made executable"
+echo " ✅ Logs directory created: ${POTE_DIR}/logs"
+if [ -n "$CRON_LINE" ]; then
+    echo " ✅ Cron job scheduled"
+fi
+echo ""
+echo "🧪 Test manually:"
+echo " ${POTE_DIR}/scripts/daily_fetch.sh"
+echo ""
+echo "📊 View logs:"
+echo " tail -f ${POTE_DIR}/logs/daily_fetch_\$(date +%Y%m%d).log"
+echo ""
+echo "⚙️ Manage cron:"
+echo " crontab -l # View cron jobs"
+echo " crontab -e # Edit cron jobs"
+echo ""
+echo "📚 Documentation:"
+echo " ${POTE_DIR}/docs/10_automation.md"
+echo ""
+
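Option 4 in `setup_automation.sh` writes whatever schedule is typed straight into the crontab. Below is a minimal sketch of a pre-check, assuming standard five-field cron syntax limited to numbers, `*`, lists, ranges, and steps (no names such as `mon` and no `@daily` shortcuts). `is_valid_cron` is a hypothetical helper and is not part of POTE.

```python
import re

# Allowed (min, max) per field: minute, hour, day of month, month, day of week.
FIELD_RANGES = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 7)]

# One list element: "*" or "a-b" with an optional "/step", or a single number.
_PART = re.compile(r"^(?:(\*|\d+-\d+)(?:/(\d+))?|(\d+))$")


def _valid_field(field: str, lo: int, hi: int) -> bool:
    """Check one cron field such as '7,19', '*/6' or '1-5'."""
    for part in field.split(","):
        match = _PART.match(part)
        if not match:
            return False
        span, step, single = match.groups()
        if step is not None and int(step) == 0:
            return False
        if single is not None:
            numbers = [int(single)]
        elif span == "*":
            numbers = []
        else:
            numbers = [int(n) for n in span.split("-")]
            if numbers[0] > numbers[1]:
                return False
        if any(n < lo or n > hi for n in numbers):
            return False
    return True


def is_valid_cron(schedule: str) -> bool:
    """Return True if `schedule` looks like a five-field cron expression."""
    fields = schedule.split()
    if len(fields) != 5:
        return False
    return all(
        _valid_field(field, lo, hi)
        for field, (lo, hi) in zip(fields, FIELD_RANGES)
    )


if __name__ == "__main__":
    for candidate in ["0 7 * * *", "0 */6 * * *", "0 7 * * 1-5", "60 7 * * *", "0 7 * *"]:
        print(f"{candidate!r}: {is_valid_cron(candidate)}")
```

The check is deliberately stricter than some cron implementations, so a rejected schedule is a prompt to double-check the input, not proof that cron would refuse it.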