# Data Updates & Maintenance

## Adding More Representatives

### Method 1: Manual Entry (Python Script)

```bash
# Edit the script to add your representatives
nano scripts/add_custom_trades.py

# Run it
python scripts/add_custom_trades.py
```

Example:
```python
add_trade(
    session,
    official_name="Your Representative",
    party="Democrat",  # or "Republican", "Independent"
    chamber="House",  # or "Senate"
    state="CA",
    ticker="NVDA",
    company_name="NVIDIA Corporation",
    side="buy",  # or "sell"
    value_min=15001,
    value_max=50000,
    transaction_date="2024-12-01",
    disclosure_date="2024-12-15",
)
```

### Method 2: CSV Import

```bash
# Create a template
python scripts/scrape_alternative_sources.py template

# Edit trades_template.csv with your data
nano trades_template.csv

# Import it
python scripts/scrape_alternative_sources.py import trades_template.csv
```

CSV format:
```csv
name,party,chamber,state,district,ticker,side,value_min,value_max,transaction_date,disclosure_date
Bernie Sanders,Independent,Senate,VT,,COIN,sell,15001,50000,2024-12-01,2024-12-15
```

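Before importing, it can help to sanity-check the file against the column layout above. The following is a minimal sketch, not part of the project's scripts: `validate_trades_csv` and its rules (exact header match, `side` being `buy` or `sell`, `value_min` not exceeding `value_max`, ISO dates, disclosure not earlier than the transaction) are assumptions made for illustration, and the importer's own validation may differ.

```python
import csv
import io
from datetime import date

# Column layout taken from the CSV format shown above.
EXPECTED_COLUMNS = [
    "name", "party", "chamber", "state", "district", "ticker", "side",
    "value_min", "value_max", "transaction_date", "disclosure_date",
]


def validate_trades_csv(text):
    """Return a list of problems found; an empty list means the file looks sane.

    Hypothetical helper: the rules are illustrative assumptions,
    not the importer's documented behavior.
    """
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != EXPECTED_COLUMNS:
        return [f"unexpected header: {reader.fieldnames}"]

    problems = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        if row["side"] not in ("buy", "sell"):
            problems.append(f"line {line_no}: side must be 'buy' or 'sell'")
        try:
            if int(row["value_min"]) > int(row["value_max"]):
                problems.append(f"line {line_no}: value_min exceeds value_max")
        except (TypeError, ValueError):
            problems.append(f"line {line_no}: value_min/value_max must be integers")
        try:
            traded = date.fromisoformat(row["transaction_date"])
            disclosed = date.fromisoformat(row["disclosure_date"])
            if disclosed < traded:
                problems.append(f"line {line_no}: disclosure precedes transaction")
        except (TypeError, ValueError):
            problems.append(f"line {line_no}: dates must be YYYY-MM-DD")
    return problems
```

Calling `validate_trades_csv(open("trades_template.csv").read())` and importing only when the result is empty keeps malformed rows out of the database.
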
### Method 3: Automatic Updates (When API is available)

```bash
# Fetch latest trades
python scripts/fetch_congressional_trades.py --days 30
```

## Setting Up Automatic Updates

### Option A: Cron Job (Recommended)

```bash
# Make script executable
chmod +x ~/pote/scripts/daily_update.sh

# Add to cron (runs daily at 6 AM)
crontab -e

# Add this line:
0 6 * * * /home/poteapp/pote/scripts/daily_update.sh

# Or for testing (runs every hour):
0 * * * * /home/poteapp/pote/scripts/daily_update.sh
```

View logs:
```bash
ls -lh ~/logs/daily_update_*.log
tail -f ~/logs/daily_update_$(date +%Y%m%d).log
```

### Option B: Systemd Timer

Create `/etc/systemd/system/pote-update.service`:
```ini
[Unit]
Description=POTE Daily Data Update
After=network.target postgresql.service

[Service]
Type=oneshot
User=poteapp
WorkingDirectory=/home/poteapp/pote
ExecStart=/home/poteapp/pote/scripts/daily_update.sh
StandardOutput=append:/home/poteapp/logs/pote-update.log
StandardError=append:/home/poteapp/logs/pote-update.log
```

Create `/etc/systemd/system/pote-update.timer`:
```ini
[Unit]
Description=Run POTE update daily
Requires=pote-update.service

[Timer]
# Once a day at 06:00. Multiple OnCalendar= lines are additive, so listing
# both "daily" (midnight) and "06:00" would trigger two runs per day.
OnCalendar=*-*-* 06:00:00
Persistent=true

[Install]
WantedBy=timers.target
```

Enable it:
```bash
# Pick up the new unit files
sudo systemctl daemon-reload

sudo systemctl enable --now pote-update.timer
sudo systemctl status pote-update.timer
```

## Manual Update Workflow

```bash
# 1. Fetch new trades (when API works)
python scripts/fetch_congressional_trades.py

# 2. Enrich new securities
python scripts/enrich_securities.py

# 3. Update prices
python scripts/fetch_sample_prices.py

# 4. Check status
~/status.sh
```

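The first three steps above can also be chained so that a failure stops the run before later steps work on incomplete data. This is a minimal sketch, assuming it is run from the repository root; `run_steps` and its injectable `runner` parameter are hypothetical names introduced here, and only the script paths come from the workflow above.

```python
import subprocess
import sys

# Steps 1-3 from the workflow above, in order.
STEPS = [
    [sys.executable, "scripts/fetch_congressional_trades.py"],
    [sys.executable, "scripts/enrich_securities.py"],
    [sys.executable, "scripts/fetch_sample_prices.py"],
]


def run_steps(steps, runner=subprocess.run):
    """Run each command in order and stop at the first non-zero exit code.

    Returns the commands that completed successfully. `runner` is
    injectable so the function can be exercised without touching the
    database or the network.
    """
    completed = []
    for cmd in steps:
        result = runner(cmd)
        if result.returncode != 0:
            print(f"step failed ({result.returncode}): {' '.join(cmd)}", file=sys.stderr)
            break
        completed.append(cmd)
    return completed
```

Calling `run_steps(STEPS)` and comparing the length of the result with `len(STEPS)` tells a wrapper script whether the whole chain succeeded.
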
## Data Sources

### Currently Working:
- ✅ yfinance (prices, company info)
- ✅ Manual entry
- ✅ CSV import
- ✅ Fixture files (testing)

### Currently Down:
- ❌ House Stock Watcher API (domain issues)

### Future Options:
- QuiverQuant (requires $30/month subscription)
- Senate Stock Watcher (check if available)
- Capitol Trades (web scraping)
- Financial Modeling Prep (requires API key)

## Monitoring Updates

### Check Recent Activity

```python
from datetime import datetime, timedelta

from sqlalchemy import text

from pote.db import engine

# Trades added in the last 7 days
week_ago = datetime.now() - timedelta(days=7)

with engine.connect() as conn:
    # Bind the cutoff as a parameter instead of formatting it into the SQL string
    result = conn.execute(
        text("""
            SELECT o.name, s.ticker, t.side, t.transaction_date
            FROM trades t
            JOIN officials o ON t.official_id = o.id
            JOIN securities s ON t.security_id = s.id
            WHERE t.created_at >= :week_ago
            ORDER BY t.created_at DESC
        """),
        {"week_ago": week_ago},
    )

    print("Recent trades:")
    for row in result:
        print(f"  {row.name} {row.side} {row.ticker} on {row.transaction_date}")
```

### Database Growth

```bash
# Track database size over time
psql -h localhost -U poteuser -d pote -c "
SELECT
    pg_size_pretty(pg_database_size('pote')) as db_size,
    (SELECT COUNT(*) FROM officials) as officials,
    (SELECT COUNT(*) FROM trades) as trades,
    (SELECT COUNT(*) FROM prices) as prices;
"
```

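The query above reports a single point in time. To follow growth across runs, each result can be appended to a small log. This is a rough sketch under stated assumptions: `append_snapshot`, the CSV log format, and the `FIELDS` column names are hypothetical and simply mirror the columns of the query above; how the values are obtained from the database is left to the caller.

```python
import csv
from datetime import datetime
from pathlib import Path

# Mirrors the columns returned by the psql query above, plus a timestamp.
FIELDS = ["timestamp", "db_size", "officials", "trades", "prices"]


def append_snapshot(log_path, snapshot, now=None):
    """Append one row of counts to a CSV log, writing the header on first use.

    `snapshot` is a dict with the keys db_size, officials, trades, prices.
    """
    path = Path(log_path)
    is_new = not path.exists()
    stamp = (now or datetime.now()).isoformat(timespec="seconds")
    with path.open("a", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({"timestamp": stamp, **snapshot})
```

Appending one row per scheduled run produces a file that can be opened in a spreadsheet or plotted to see how quickly the tables grow.
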
## Backup Before Updates

```bash
# Backup before major updates (create the target directory if it does not exist)
mkdir -p ~/backups
pg_dump -h localhost -U poteuser pote > ~/backups/pote_$(date +%Y%m%d_%H%M%S).sql
```

## Troubleshooting

### API Not Working
- Use manual entry or CSV import
- Check if alternative sources are available
- Wait for House Stock Watcher to come back online

### Duplicate Trades
The system automatically deduplicates by:
- `source` + `external_id` (for API data)
- Official + Security + Transaction Date (for manual data)

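The two rules above amount to choosing a different identity key depending on where a trade came from. The sketch below illustrates that idea only; the `Trade` dataclass, its field names, and the `dedup_key` and `deduplicate` helpers are hypothetical and do not mirror the project's actual models or queries.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Trade:
    # Hypothetical record shape, for illustration only.
    official: str
    ticker: str
    transaction_date: str
    source: Optional[str] = None
    external_id: Optional[str] = None


def dedup_key(trade: Trade) -> Tuple[str, ...]:
    """API rows are keyed by source + external_id; manual rows by
    official + security + transaction date."""
    if trade.source and trade.external_id:
        return ("api", trade.source, trade.external_id)
    return ("manual", trade.official, trade.ticker, trade.transaction_date)


def deduplicate(trades):
    """Keep the first trade seen for each key, preserving input order."""
    seen = set()
    unique = []
    for trade in trades:
        key = dedup_key(trade)
        if key not in seen:
            seen.add(key)
            unique.append(trade)
    return unique
```

One consequence of keying manual entries this way is that two genuinely separate trades by the same official in the same security on the same day would collapse into one, which is worth keeping in mind when entering data by hand.
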
### Missing Company Info
```bash
# Re-enrich all securities
python scripts/enrich_securities.py --force
```

### Price Data Gaps
```bash
# Fetch specific date range
python << 'EOF'
from pote.ingestion.prices import PriceLoader
from pote.db import get_session

loader = PriceLoader(next(get_session()))
loader.fetch_and_store_prices("NVDA", "2024-01-01", "2024-12-31")
EOF
```