# Architecture (target shape for Phase 1)

This is an intentionally simple architecture optimized for **clarity, idempotency, and testability**.

## High-level flow
1. **Ingest disclosures** (public source API) → normalize → upsert to DB (`officials`, `securities`, `trades`)
2. **Load market data** (daily prices) → upsert to DB (`prices`)
3. **Compute metrics** (returns, benchmarks, aggregates) → write to DB (`metrics_trade`, `metrics_official`)
4. **Query/report** via CLI (later: read-only API/dashboard)
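
Step 3 can be illustrated with a minimal sketch. The function names, the holding-window convention, and the market-adjusted definition of abnormal return (security return minus benchmark return over the same window) are assumptions for illustration; this document does not fix any of them.

```python
from datetime import date


def simple_return(prices: dict[date, float], start: date, end: date) -> float:
    """Simple return between two dates that both have a closing price."""
    return prices[end] / prices[start] - 1.0


def abnormal_return(
    security: dict[date, float],
    benchmark: dict[date, float],
    start: date,
    end: date,
) -> float:
    """Market-adjusted return: security return minus benchmark return."""
    return simple_return(security, start, end) - simple_return(benchmark, start, end)


# Hypothetical closing prices, not real market data.
security_prices = {date(2024, 1, 2): 100.0, date(2024, 2, 1): 110.0}
benchmark_prices = {date(2024, 1, 2): 400.0, date(2024, 2, 1): 420.0}

# Security gained 10%, benchmark gained 5%, so the result is roughly 0.05.
ar = abnormal_return(security_prices, benchmark_prices, date(2024, 1, 2), date(2024, 2, 1))
```

Keeping the calculation as a pure function of price series makes it straightforward to test without a database.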

## Proposed module layout (to be created)

```
src/pote/
  __init__.py
  config.py                 # settings loader (.env), constants
  db/
    __init__.py
    session.py              # engine + sessionmaker
    models.py               # SQLAlchemy ORM models
    migrations/             # Alembic (added once models stabilize)
  clients/
    __init__.py
    quiver.py               # QuiverQuant client (optional)
    fmp.py                  # Financial Modeling Prep client (optional)
    market_data.py          # yfinance wrapper / other provider interface
  etl/
    __init__.py
    congress_trades.py      # disclosure ingestion + upsert
    prices.py               # price ingestion + upsert + caching
  analytics/
    __init__.py
    returns.py              # return & abnormal return calculations
    signals.py              # rule-based “flags” (transparent, caveated)
    aggregations.py         # per-official summaries
  cli/
    __init__.py
    main.py                 # entrypoint for research queries
tests/
  ...
```
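
As one possible reading of the "provider interface" noted for `market_data.py`, the sketch below uses a `typing.Protocol` so that ETL code depends on an interface rather than a concrete client. `PriceProvider`, `FixtureProvider`, `load_closes`, and the `daily_closes` method are hypothetical names, not part of the planned code; a real yfinance-backed class would implement the same method.

```python
from datetime import date
from typing import Protocol


class PriceProvider(Protocol):
    """Hypothetical interface: anything that returns daily closes fits."""

    def daily_closes(self, ticker: str, start: date, end: date) -> dict[date, float]:
        ...


class FixtureProvider:
    """In-memory provider for offline tests (the fixture data is made up)."""

    def __init__(self, data: dict[str, dict[date, float]]) -> None:
        self._data = data

    def daily_closes(self, ticker: str, start: date, end: date) -> dict[date, float]:
        series = self._data.get(ticker, {})
        # Keep only prices inside the inclusive [start, end] window.
        return {d: p for d, p in series.items() if start <= d <= end}


def load_closes(provider: PriceProvider, ticker: str, start: date, end: date) -> dict[date, float]:
    """Caller depends only on the interface, not on a concrete client."""
    return provider.daily_closes(ticker, start, end)


fixture = FixtureProvider(
    {"XYZ": {date(2024, 1, 2): 10.0, date(2024, 1, 3): 10.5, date(2024, 1, 10): 11.0}}
)
closes = load_closes(fixture, "XYZ", date(2024, 1, 1), date(2024, 1, 5))
```

Swapping providers then means passing a different object to `load_closes`, with no change to the calling code.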

## Design constraints (non-negotiable)
- **Public data only**: every record must store `source` and enough IDs to trace back.
- **No advice**: outputs and docs must avoid prescriptive language and include disclaimers.
- **Idempotency**: ETL and metrics jobs must be safe to rerun.
- **Separation of concerns**:
  - clients fetch raw data
  - etl normalizes + writes
  - analytics reads normalized data and writes derived tables
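
The traceability, idempotency, and separation constraints above can be sketched together. This example uses the standard-library `sqlite3` module instead of SQLAlchemy to stay self-contained; the table columns, the `(source, source_id)` key, and all function names are assumptions for illustration, and the records are made up.

```python
import sqlite3


def fetch_raw() -> list[dict]:
    """Stand-in for a client: returns raw records (made-up data)."""
    return [
        {"id": "a1", "official": " Jane Doe ", "ticker": "xyz", "amount": "1000"},
        {"id": "a2", "official": "John Roe", "ticker": "ABC", "amount": "2500"},
    ]


def normalize(raw: dict, source: str) -> tuple:
    """ETL step: clean fields and attach the source for traceability."""
    return (source, raw["id"], raw["official"].strip(), raw["ticker"].upper(), int(raw["amount"]))


def upsert_trades(conn: sqlite3.Connection, rows: list[tuple]) -> None:
    """Insert or update keyed on (source, source_id), so reruns add no duplicates."""
    conn.executemany(
        """
        INSERT INTO trades (source, source_id, official, ticker, amount)
        VALUES (?, ?, ?, ?, ?)
        ON CONFLICT (source, source_id) DO UPDATE SET
            official = excluded.official,
            ticker = excluded.ticker,
            amount = excluded.amount
        """,
        rows,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE trades (
        source TEXT NOT NULL,
        source_id TEXT NOT NULL,
        official TEXT NOT NULL,
        ticker TEXT NOT NULL,
        amount INTEGER NOT NULL,
        UNIQUE (source, source_id)
    )
    """
)

rows = [normalize(r, source="example_source") for r in fetch_raw()]
upsert_trades(conn, rows)
upsert_trades(conn, rows)  # rerun with the same keys: no new rows are created
trade_count = conn.execute("SELECT COUNT(*) FROM trades").fetchone()[0]
```

The unique key on the source identifiers is what makes a rerun safe: a repeated record updates the existing row rather than inserting a second one.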

## Operational conventions
- Logging: structured-ish logs with counts (fetched/inserted/updated/skipped).
- Rate limits: conservative defaults; provide `--sleep`/`--max-requests` config as needed.
- Config: one settings object with env var support; `.env.example` committed, `.env` ignored.
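
A minimal sketch of the "one settings object with env var support" convention follows. The `Settings` fields, the `POTE_*` variable names, and the default values are hypothetical; reading the `.env` file itself is left out and would happen before this function is called.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    database_url: str
    request_sleep_seconds: float
    max_requests: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build one settings object from environment variables, with conservative defaults."""
    return Settings(
        database_url=env.get("POTE_DATABASE_URL", "sqlite:///pote.db"),
        request_sleep_seconds=float(env.get("POTE_REQUEST_SLEEP", "1.0")),
        max_requests=int(env.get("POTE_MAX_REQUESTS", "100")),
    )


# Passing a plain mapping instead of os.environ keeps tests independent of the host.
settings = load_settings({"POTE_REQUEST_SLEEP": "2.5"})
```

Accepting the environment as a parameter means tests can supply their own mapping, and CLI flags such as `--sleep` could override the loaded values afterwards.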