Features Added:
==============

📧 EMAIL REPORTING SYSTEM:
- EmailReporter: Send reports via SMTP (Gmail, SendGrid, custom)
- ReportGenerator: Generate daily/weekly summaries with HTML/text formatting
- Configurable via .env (SMTP_HOST, SMTP_PORT, etc.)
- Scripts: send_daily_report.py, send_weekly_report.py

🤖 AUTOMATED RUNS:
- automated_daily_run.sh: Full daily ETL pipeline + reporting
- automated_weekly_run.sh: Weekly pattern analysis + reports
- setup_cron.sh: Interactive cron job setup (5-minute setup)
- Logs saved to ~/logs/ with automatic cleanup

🔍 HEALTH CHECKS:
- health_check.py: System health monitoring
- Checks: DB connection, data freshness, counts, recent alerts
- JSON output for programmatic use
- Exit codes for monitoring integration

🚀 CI/CD PIPELINE:
- .github/workflows/ci.yml: Full CI/CD pipeline
- GitHub Actions / Gitea Actions compatible
- Jobs: lint & test, security scan, dependency scan, Docker build
- PostgreSQL service for integration tests
- 93 tests passing in CI

📚 COMPREHENSIVE DOCUMENTATION:
- AUTOMATION_QUICKSTART.md: 5-minute email setup guide
- docs/12_automation_and_reporting.md: Full automation guide
- Updated README.md with automation links
- Deployment → Production workflow guide

🛠️ IMPROVEMENTS:
- All shell scripts made executable
- Environment variable examples in .env.example
- Report logs saved with timestamps
- 30-day log retention with auto-cleanup
- Health checks can be scheduled via cron

WHAT THIS ENABLES:
==================

After deployment, users can:
1. Set up automated daily/weekly email reports (5 min)
2. Receive HTML+text emails with:
   - New trades, market alerts, suspicious timing
   - Weekly patterns, rankings, repeat offenders
3. Monitor system health automatically
4. Run full CI/CD pipeline on every commit
5. Deploy with confidence (tests + security scans)

USAGE:
======

# One-time setup (on deployed server)
./scripts/setup_cron.sh

# Or manually send reports
python scripts/send_daily_report.py --to user@example.com
python scripts/send_weekly_report.py --to user@example.com

# Check system health
python scripts/health_check.py

See AUTOMATION_QUICKSTART.md for full instructions.

93 tests passing | Full CI/CD | Email reports ready
# POTE Automation Guide

**Automated Data Collection & Updates**

---
## ⏰ Understanding Disclosure Timing

### **Reality Check: No Real-Time Data Exists**

**Federal Law (STOCK Act):**

- 📅 Congress members have **30-45 days** to disclose trades
- 📅 Disclosures are filed as **Periodic Transaction Reports (PTRs)**
- 📅 Public databases update **after** filing (usually the next day)
- 📅 **No real-time feed exists by design**
**Example Timeline:**

```
Jan 15, 2024 → Senator buys NVDA
Feb 15, 2024 → Disclosure filed (30 days later)
Feb 16, 2024 → Appears on House Stock Watcher
Feb 17, 2024 → Your system fetches it
```
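The window above is simple date arithmetic, so you can compute the outer bound for when any trade should surface. A minimal sketch: the 45-day deadline comes from the STOCK Act, while `publish_lag_days` is an assumed buffer for downstream databases, not part of the law:

```python
"""Sketch: latest date a trade should surface downstream."""
from datetime import date, timedelta


def expected_visibility(trade_date: date,
                        filing_deadline_days: int = 45,
                        publish_lag_days: int = 2) -> date:
    """Outer bound: up to 45 days to file, plus a small publication buffer."""
    return trade_date + timedelta(days=filing_deadline_days + publish_lag_days)


if __name__ == "__main__":
    # For the Jan 15, 2024 example above:
    print(expected_visibility(date(2024, 1, 15)))  # 2024-03-02
```

If a trade hasn't appeared by this date, the official filed late or the pipeline missed it.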
### **Best Practice: Daily Updates**

Since trades appear in batches (not continuously), **running once per day is optimal**:

✅ **Daily (7 AM)** - Catches overnight filings
✅ **After market close** - Prices are final
✅ **Low server load** - Off-peak hours
❌ **Hourly** - Wasteful, no new data
❌ **Real-time** - Impossible, not how disclosures work

---
## 🤖 Automated Setup Options

### **Option 1: Cron Job (Linux/Proxmox) - Recommended**

#### **Setup on Proxmox Container**

```bash
# SSH to your container
ssh poteapp@10.0.10.95

# Edit crontab
crontab -e

# Add this line (runs daily at 7 AM):
0 7 * * * /home/poteapp/pote/scripts/daily_fetch.sh

# Or run twice daily (7 AM and 7 PM):
0 7,19 * * * /home/poteapp/pote/scripts/daily_fetch.sh

# Save and exit
```
**What it does:**

- Fetches new congressional trades (last 7 days)
- Enriches any new securities (name, sector, industry)
- Updates price data for all securities
- Logs everything to `logs/daily_fetch_YYYYMMDD.log`

**Check logs:**

```bash
tail -f ~/pote/logs/daily_fetch_$(date +%Y%m%d).log
```

---
### **Option 2: Systemd Timer (More Advanced)**

For better logging and service management:

#### **Create Service File**

```bash
sudo nano /etc/systemd/system/pote-fetch.service
```

```ini
[Unit]
Description=POTE Daily Data Fetch
After=network.target postgresql.service

[Service]
Type=oneshot
User=poteapp
WorkingDirectory=/home/poteapp/pote
ExecStart=/home/poteapp/pote/scripts/daily_fetch.sh
StandardOutput=journal
StandardError=journal

[Install]
WantedBy=multi-user.target
```
#### **Create Timer File**

```bash
sudo nano /etc/systemd/system/pote-fetch.timer
```

```ini
[Unit]
Description=POTE Daily Data Fetch Timer
Requires=pote-fetch.service

[Timer]
OnCalendar=*-*-* 07:00:00
Persistent=true

[Install]
WantedBy=timers.target
```

Note: a single `OnCalendar=*-*-* 07:00:00` fires once a day at 7 AM. Listing both `OnCalendar=daily` and `OnCalendar=07:00` would add a second trigger at midnight, since `daily` means 00:00.
#### **Enable and Start**

```bash
sudo systemctl daemon-reload
sudo systemctl enable pote-fetch.timer
sudo systemctl start pote-fetch.timer

# Check status
sudo systemctl status pote-fetch.timer
sudo systemctl list-timers

# View logs
sudo journalctl -u pote-fetch.service -f
```

---
### **Option 3: Manual Script (For Testing)**

Run manually whenever you want:

```bash
cd /home/user/Documents/code/pote
./scripts/daily_fetch.sh
```

Or from anywhere:

```bash
/home/user/Documents/code/pote/scripts/daily_fetch.sh
```

---
## 📊 What Gets Updated?

### **1. Congressional Trades**

**Script:** `fetch_congressional_trades.py`
**Frequency:** Daily
**Fetches:** Last 7 days (catches late filings)
**API:** House Stock Watcher (when available)

**Alternative sources:**

- Manual CSV import
- QuiverQuant API (paid)
- Capitol Trades (paid)
### **2. Security Enrichment**

**Script:** `enrich_securities.py`
**Frequency:** Daily (only updates new tickers)
**Fetches:** Company name, sector, industry
**API:** yfinance (free)

### **3. Price Data**

**Script:** `fetch_sample_prices.py`
**Frequency:** Daily
**Fetches:** Historical prices for all securities
**API:** yfinance (free)
**Smart:** Only fetches missing date ranges (efficient)

### **4. Analytics (Optional)**

**Script:** `calculate_all_returns.py`
**Frequency:** Daily (or on-demand)
**Calculates:** Returns, alpha, performance metrics
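The "only fetches missing date ranges" behavior boils down to comparing the newest stored price date against today. A minimal sketch of that idea (`missing_range` is a hypothetical helper for illustration; the real logic lives in `fetch_sample_prices.py`, and the one-year backfill default is an assumption):

```python
"""Sketch of the 'only fetch what's missing' idea behind price updates."""
from datetime import date, timedelta
from typing import Optional, Tuple


def missing_range(last_stored: Optional[date], today: date) -> Optional[Tuple[date, date]]:
    """Return the (start, end) span still to fetch, or None when up to date."""
    if last_stored is None:
        # No prices stored yet: backfill a year (arbitrary default for this sketch)
        return today - timedelta(days=365), today
    if last_stored >= today:
        return None
    # Resume the day after the newest stored price
    return last_stored + timedelta(days=1), today


if __name__ == "__main__":
    print(missing_range(date(2024, 2, 10), date(2024, 2, 17)))
```

Fetching only this span keeps daily runs fast even with hundreds of tickers.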
---
## ⚙️ Customizing the Schedule

### **Different Frequencies**

```bash
# Every 6 hours
0 */6 * * * /home/poteapp/pote/scripts/daily_fetch.sh

# Twice daily (morning and evening)
0 7,19 * * * /home/poteapp/pote/scripts/daily_fetch.sh

# Weekdays only (business days)
0 7 * * 1-5 /home/poteapp/pote/scripts/daily_fetch.sh

# Once per week (Sunday at midnight)
0 0 * * 0 /home/poteapp/pote/scripts/daily_fetch.sh
```
### **Best Practice Recommendations**

**For Active Research:**

- **Daily at 7 AM** (catches overnight filings)
- **Weekdays only** (Congress rarely files on weekends)

**For Casual Tracking:**

- **Weekly** (Sunday night)
- **Bi-weekly** (1st and 15th)

**For Development:**

- **Manual runs** (on-demand testing)

---
## 📧 Email Notifications (Optional)

### **Setup Email Alerts**

Cron can mail you each run's output. Install a mail utility, then set `MAILTO` at the top of your crontab:

```bash
# Install mail utility
sudo apt install mailutils

# Then add to your crontab (crontab -e):
MAILTO=your-email@example.com
0 7 * * * /home/poteapp/pote/scripts/daily_fetch.sh
```
### **Custom Email Script**

Create `scripts/email_summary.py`:

```python
#!/usr/bin/env python
"""Email daily summary of new trades."""

import smtplib
from datetime import date, timedelta
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

from sqlalchemy import text

from pote.db import engine


def get_new_trades(days=1):
    """Get trades ingested in the last N days."""
    since = date.today() - timedelta(days=days)

    with engine.connect() as conn:
        result = conn.execute(text("""
            SELECT o.name, s.ticker, t.side, t.transaction_date, t.value_min, t.value_max
            FROM trades t
            JOIN officials o ON t.official_id = o.id
            JOIN securities s ON t.security_id = s.id
            WHERE t.created_at >= :since
            ORDER BY t.transaction_date DESC
        """), {"since": since})
        # Fetch inside the `with` block, before the connection closes
        return result.fetchall()


def send_email(to_email, trades):
    """Send an HTML email summary."""
    if not trades:
        print("No new trades to report")
        return

    subject = f"POTE: {len(trades)} New Congressional Trades"

    body = f"<h2>New Trades ({len(trades)})</h2>\n<table>"
    body += "<tr><th>Official</th><th>Ticker</th><th>Side</th><th>Date</th><th>Value</th></tr>"

    # Note: `trade_date`, not `date`, to avoid shadowing the imported class
    for name, ticker, side, trade_date, vmin, vmax in trades:
        value = f"${vmin:,.0f}-${vmax:,.0f}" if vmax else f"${vmin:,.0f}+"
        body += (f"<tr><td>{name}</td><td>{ticker}</td><td>{side}</td>"
                 f"<td>{trade_date}</td><td>{value}</td></tr>")

    body += "</table>"

    msg = MIMEMultipart()
    msg['From'] = "pote@yourserver.com"
    msg['To'] = to_email
    msg['Subject'] = subject
    msg.attach(MIMEText(body, 'html'))

    # Configure your SMTP server:
    # server = smtplib.SMTP('smtp.gmail.com', 587)
    # server.starttls()
    # server.login("your-email@gmail.com", "your-password")
    # server.send_message(msg)
    # server.quit()

    print(f"Would send email to {to_email}")


if __name__ == "__main__":
    trades = get_new_trades(days=1)
    send_email("your-email@example.com", trades)
```

Then add to `daily_fetch.sh`:

```bash
# At the end of daily_fetch.sh
python scripts/email_summary.py
```

---
## 🔍 Monitoring & Logging

### **Check Cron Job Status**

```bash
# View cron jobs
crontab -l

# Check if cron is running
sudo systemctl status cron

# View cron logs
grep CRON /var/log/syslog | tail -20
```

### **Check POTE Logs**

```bash
# Today's log
tail -f ~/pote/logs/daily_fetch_$(date +%Y%m%d).log

# All logs
ls -lh ~/pote/logs/

# Last 100 lines of the most recent log
tail -100 "$(ls -t ~/pote/logs/daily_fetch_*.log | head -1)"
```
### **Log Rotation (Keep Disk Space Clean)**

Add to `/etc/logrotate.d/pote`:

```
/home/poteapp/pote/logs/*.log {
    daily
    rotate 30
    compress
    delaycompress
    missingok
    notifempty
}
```
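If logrotate isn't available, or you'd rather keep the 30-day retention inside the app, a small Python cleanup achieves the same pruning. This is a sketch, not an existing POTE script; `prune_logs` is a hypothetical helper and the log directory is assumed to match the paths used above:

```python
"""Prune POTE logs older than 30 days (an alternative to logrotate)."""
import time
from pathlib import Path

MAX_AGE_DAYS = 30


def prune_logs(log_dir: Path, max_age_days: int = MAX_AGE_DAYS) -> int:
    """Delete daily_fetch logs older than max_age_days; return how many were removed."""
    cutoff = time.time() - max_age_days * 86400
    removed = 0
    for path in log_dir.glob("daily_fetch_*.log"):
        if path.stat().st_mtime < cutoff:
            path.unlink()
            removed += 1
    return removed


if __name__ == "__main__":
    log_dir = Path.home() / "pote" / "logs"  # assumed install path
    if log_dir.exists():
        print(f"Removed {prune_logs(log_dir)} old log file(s)")
```

You could call this at the end of `daily_fetch.sh`, or schedule it as its own cron entry.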
---
## 🚨 Handling Failures

### **What If House Stock Watcher Is Down?**

The script is designed to continue even if one step fails. You'll see warnings like this in the log:

```
⚠️ WARNING: Failed to fetch congressional trades
This is likely because House Stock Watcher API is down
Continuing with other steps...
```

**Fallback options:**

1. **Manual import:** Use CSV import while the API is down
2. **Alternative APIs:** QuiverQuant, Capitol Trades
3. **Check logs:** Review what failed and why
### **Automatic Retry Logic**

Edit `scripts/fetch_congressional_trades.py` to add retries:

```python
import time

from requests.exceptions import RequestException

MAX_RETRIES = 3
RETRY_DELAY = 300  # 5 minutes

for attempt in range(MAX_RETRIES):
    try:
        trades = client.fetch_recent_transactions(days=7)
        break
    except RequestException as exc:
        if attempt < MAX_RETRIES - 1:
            logger.warning(f"Attempt {attempt + 1} failed ({exc}), retrying in {RETRY_DELAY}s...")
            time.sleep(RETRY_DELAY)
        else:
            logger.error("All retry attempts failed")
            raise
```

---
## 📈 Performance Optimization

### **Batch Processing**

For large datasets, fetch in batches:

```bash
# Fetch trades in smaller date ranges
python scripts/fetch_congressional_trades.py --start-date 2024-01-01 --end-date 2024-01-31
python scripts/fetch_congressional_trades.py --start-date 2024-02-01 --end-date 2024-02-29
```
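Writing those month-by-month commands by hand gets tedious for a long backfill. A small generator can emit them; `month_ranges` is an illustration, not an existing POTE script:

```python
"""Generate month-by-month date ranges for backfilling trades."""
from datetime import date, timedelta


def month_ranges(start: date, end: date):
    """Yield (first_day, last_day) pairs covering start..end, one per calendar month."""
    current = start.replace(day=1)
    while current <= end:
        # Day 28 plus 4 days always lands in the next month
        next_month = (current.replace(day=28) + timedelta(days=4)).replace(day=1)
        yield max(current, start), min(next_month - timedelta(days=1), end)
        current = next_month


if __name__ == "__main__":
    for first, last in month_ranges(date(2024, 1, 1), date(2024, 3, 15)):
        print(f"python scripts/fetch_congressional_trades.py "
              f"--start-date {first} --end-date {last}")
```

Pipe the output to `bash` (or inspect it first) to run the whole backfill.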
### **Parallel Processing**

Use GNU Parallel for faster price fetching:

```bash
# Install parallel
sudo apt install parallel

# List all tickers, then fetch prices in parallel (4 at a time)
python - <<'EOF' | parallel -j 4 python scripts/fetch_prices_single.py {}
from pote.db import get_session
from pote.db.models import Security

session = next(get_session())
print("\n".join(s.ticker for s in session.query(Security).all()))
EOF
```
### **Database Indexing**

Ensure indexes are created (already in migrations):

```sql
CREATE INDEX IF NOT EXISTS ix_trades_transaction_date ON trades(transaction_date);
CREATE INDEX IF NOT EXISTS ix_prices_date ON prices(date);
CREATE INDEX IF NOT EXISTS ix_prices_security_id ON prices(security_id);
```

---
## 🎯 Recommended Setup

### **For Proxmox Production:**

```bash
# 1. Test manually first
./scripts/daily_fetch.sh

# 2. Set up the daily cron job
crontab -e
# Add: 0 7 * * * /home/poteapp/pote/scripts/daily_fetch.sh

# 3. Enable log rotation
sudo nano /etc/logrotate.d/pote
# Add the log rotation config from above

# 4. Set up email monitoring (optional)
python scripts/email_summary.py
```
### **For Local Development:**

```bash
# Run manually when needed
./scripts/daily_fetch.sh

# Or set up a quick alias
echo "alias pote-update='~/Documents/code/pote/scripts/daily_fetch.sh'" >> ~/.bashrc
source ~/.bashrc

# Then just run:
pote-update
```

---
## 📝 Summary

### **Key Points:**

1. **No real-time data exists** - Congressional trades have a 30-45 day lag by law
2. **Daily updates are optimal** - Running hourly is wasteful
3. **Automate via cron** - Set it and forget it
4. **Handles failures gracefully** - Continues even if one API is down
5. **Logs everything** - Easy to monitor and debug

### **Quick Setup:**

```bash
# On Proxmox
crontab -e
# Add: 0 7 * * * /home/poteapp/pote/scripts/daily_fetch.sh

# Test it
./scripts/daily_fetch.sh

# Check logs
tail -f logs/daily_fetch_*.log
```

### **Data Freshness Expectations:**

- **Best case:** Trades from yesterday (if the official filed overnight)
- **Typical:** Trades from 30-45 days ago
- **Worst case:** The official filed late or hasn't filed yet

**This is normal and expected** - you're working with disclosure data, not market data.
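Those expectations can double as a freshness check inside a monitoring script. A minimal sketch: `freshness_status` is a hypothetical helper, and the 7-day "best case" threshold is an assumption, while the 45-day cutoff mirrors the statutory deadline above:

```python
"""Sketch: classify data freshness against STOCK Act expectations."""
from datetime import date


def freshness_status(newest_trade: date, today: date) -> str:
    """Bucket the age of the newest trade per the expectations above."""
    age = (today - newest_trade).days
    if age <= 7:
        return "best case"        # an official filed very recently
    if age <= 45:
        return "typical"          # within the normal 30-45 day disclosure lag
    return "check pipeline"       # older than the legal deadline; investigate


if __name__ == "__main__":
    print(freshness_status(date(2024, 1, 15), date(2024, 2, 20)))
```

Feed it the max `transaction_date` from your `trades` table; only the "check pipeline" bucket warrants an alert.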