Data sources (public) + limitations
POTE uses only lawfully available public data. This project is for private research and produces descriptive analytics (not investment advice).
Candidate sources (Phase 1)
U.S. Congress trading disclosures
- QuiverQuant (API): provides congressional trading data (availability depends on plan/keys).
- Financial Modeling Prep (FMP): provides endpoints related to congressional trading and other market metadata (availability depends on plan/keys).
- Official disclosure sources (future): House/Senate disclosure filings where accessible and lawful to process.
POTE will treat source data as “best effort” and store:
- source (where it came from)
- source_trade_id (if provided)
- raw payload snapshot (optional, for traceability)
- quality_flags describing parse/coverage issues
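The stored fields listed above could be carried on a record like the one below. This is a minimal sketch, not POTE's actual model: the `TradeRecord` class, the `from_payload` helper, the flag names, and the payload keys `id` and `ticker` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class TradeRecord:
    """Hypothetical normalized trade row (assumption, not the real schema)."""

    source: str                               # where the record came from
    source_trade_id: Optional[str] = None     # provider's own ID, if provided
    raw: Optional[dict[str, Any]] = None      # optional payload snapshot
    quality_flags: list[str] = field(default_factory=list)


def from_payload(source: str, payload: dict[str, Any], keep_raw: bool = True) -> TradeRecord:
    """Build a record and flag obvious parse/coverage issues."""
    flags = []
    if not payload.get("ticker"):   # hypothetical payload key
        flags.append("missing_ticker")
    if not payload.get("id"):       # hypothetical payload key
        flags.append("missing_source_id")
    return TradeRecord(
        source=source,
        source_trade_id=payload.get("id"),
        raw=dict(payload) if keep_raw else None,  # copy, so later edits don't leak in
        quality_flags=flags,
    )


record = from_payload("quiverquant", {"id": None, "ticker": "", "amount": "$1,001 - $15,000"})
print(record.quality_flags)
```

Keeping the raw snapshot optional lets a deployment trade storage for traceability without changing the rest of the record.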
Daily price data
- yfinance (Yahoo Finance wrapper) for daily OHLCV data (research use; subject to availability and terms).
- Alternative provider adapters can be added later (e.g., Stooq, AlphaVantage, or Polygon, as configured by the user).
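One way to keep price providers swappable is a small adapter interface. The sketch below is an assumption about how that might look, not the project's actual design: `Bar`, `PriceProvider`, `FixtureProvider`, and `load_closes` are hypothetical names, and the ticker and prices are made-up placeholders. A yfinance-backed adapter would implement the same `daily_bars` method.

```python
from dataclasses import dataclass
from datetime import date
from typing import Protocol


@dataclass(frozen=True)
class Bar:
    """One daily OHLCV row."""

    day: date
    open: float
    high: float
    low: float
    close: float
    volume: int


class PriceProvider(Protocol):
    """Hypothetical adapter interface: each provider implements daily_bars."""

    def daily_bars(self, ticker: str, start: date, end: date) -> list[Bar]: ...


class FixtureProvider:
    """Offline adapter backed by in-memory data, e.g. for tests."""

    def __init__(self, bars: dict[str, list[Bar]]) -> None:
        self._bars = bars

    def daily_bars(self, ticker: str, start: date, end: date) -> list[Bar]:
        # Inclusive date range; unknown tickers yield an empty list.
        return [b for b in self._bars.get(ticker, []) if start <= b.day <= end]


def load_closes(provider: PriceProvider, ticker: str, start: date, end: date) -> dict[date, float]:
    """Map each trading day in the range to its closing price."""
    return {b.day: b.close for b in provider.daily_bars(ticker, start, end)}


# Placeholder data for a made-up ticker.
fixture = FixtureProvider({
    "XYZ": [
        Bar(date(2024, 1, 2), 10.0, 11.0, 9.5, 10.5, 1000),
        Bar(date(2024, 1, 3), 10.5, 11.5, 10.0, 11.0, 1200),
    ]
})
closes = load_closes(fixture, "XYZ", date(2024, 1, 3), date(2024, 1, 31))
print(closes)
```

Because `load_closes` depends only on the protocol, analysis code does not need to change when a different provider is configured.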
Known limitations / pitfalls
Disclosure quality and ambiguity
- Tickers may be missing or wrong; some disclosures list company names only or broad funds.
- Transactions may be value ranges rather than exact amounts.
- Some entries may reflect family accounts or managed accounts depending on disclosure details.
- Duplicate records can occur across sources; deduplication is probabilistic when no unique ID exists.
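Two of the pitfalls above, value ranges and duplicates without a unique ID, can be illustrated together. This is a rough sketch under stated assumptions: the range format `$1,001 - $15,000`, the comparison fields, and the 0.8 threshold are hypothetical, and the field-agreement score is a simple heuristic stand-in for whatever probabilistic matching the project actually uses.

```python
import re
from datetime import date
from typing import Optional


def parse_amount_range(text: str) -> Optional[tuple[int, int]]:
    """Parse a disclosed value range such as '$1,001 - $15,000' into (low, high)."""
    numbers = [int(n.replace(",", "")) for n in re.findall(r"\d[\d,]*", text)]
    if len(numbers) != 2 or numbers[0] > numbers[1]:
        return None  # caller should record a quality flag rather than guess
    return numbers[0], numbers[1]


# Hypothetical comparison fields for records that lack a shared unique ID.
FIELDS = ("member", "ticker", "transaction_date", "side", "amount_range")


def match_score(a: dict, b: dict) -> float:
    """Fraction of comparison fields on which both records agree (0.0 to 1.0)."""
    hits = sum(1 for f in FIELDS if a.get(f) is not None and a.get(f) == b.get(f))
    return hits / len(FIELDS)


first = {
    "member": "Jane Doe",
    "ticker": "XYZ",
    "transaction_date": date(2024, 1, 3),
    "side": "buy",
    "amount_range": parse_amount_range("$1,001 - $15,000"),
}
# Same trade from a second source that listed only the company name.
second = {**first, "ticker": None}

score = match_score(first, second)
likely_duplicate = score >= 0.8  # threshold is an assumption; tune on real data
print(score, likely_duplicate)
```

A score below the threshold should leave both records in place; merging on weak evidence loses data that cannot be recovered later.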
Timing and “lag”
- Trades are often disclosed after the transaction date. Any analysis must account for:
  - transaction date
  - filing date
  - disclosure lag (filing - transaction)
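The lag itself is a plain date difference. A minimal sketch, assuming both dates are already parsed; the function name `disclosure_lag_days` is hypothetical, and raising on a negative lag is one possible design choice (flagging the record would be another).

```python
from datetime import date


def disclosure_lag_days(transaction_date: date, filing_date: date) -> int:
    """Days between the trade and its public filing (filing - transaction)."""
    lag = (filing_date - transaction_date).days
    if lag < 0:
        # A filing before the trade usually signals a parse or data error.
        raise ValueError("filing date precedes transaction date")
    return lag


lag = disclosure_lag_days(date(2024, 1, 3), date(2024, 2, 12))
print(lag)
```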
Survivorship / coverage
- Some data providers may have incomplete histories or change coverage over time.
- Price history may be missing for delisted tickers or disrupted by corporate actions.
Interpretation risks
- Correlation is not causation; return outcomes do not imply intent or information access.
- High abnormal returns can occur by chance; small samples are especially noisy.
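To see why small samples are noisy, it can help to put a standard error next to the mean. The sketch below uses a plain one-sample t-statistic as an illustration only; it is not presented as the project's method, the function name is hypothetical, and the return values are made-up numbers.

```python
import math
import statistics


def mean_and_t_stat(abnormal_returns: list[float]) -> tuple[float, float]:
    """Return (mean, t-statistic) for a sample of abnormal returns."""
    n = len(abnormal_returns)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = statistics.fmean(abnormal_returns)
    std_err = statistics.stdev(abnormal_returns) / math.sqrt(n)
    if std_err == 0:
        raise ValueError("zero variance; t-statistic is undefined")
    return mean, mean / std_err


# Made-up abnormal returns: the same pattern with 5 and with 100 observations.
small_sample = [0.10, -0.06, 0.08, -0.04, 0.07]
large_sample = small_sample * 20

mean_small, t_small = mean_and_t_stat(small_sample)
mean_large, t_large = mean_and_t_stat(large_sample)
print(round(mean_small, 4), round(t_small, 2))
print(round(mean_large, 4), round(t_large, 2))
```

Both samples have the same 3% mean, but with only five trades the t-statistic stays below 1, so the result is indistinguishable from chance. Even a large t-statistic says nothing about intent or information access.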
Source governance in this repo
- No scraping that violates terms or access controls.
- No bypassing paywalls, authentication, or restrictions.
- When adding a new source, document:
  - endpoint/coverage
  - required API keys / limits
  - normalization mapping to the internal schema
  - known quirks
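A normalization mapping can be documented as a simple table from provider field names to internal ones. The sketch below is illustrative only: every field name on both sides of `FIELD_MAP` is hypothetical, as is the `normalize` helper.

```python
from typing import Any

# Hypothetical provider field names (left) mapped to hypothetical internal names (right).
FIELD_MAP = {
    "Representative": "member",
    "Ticker": "ticker",
    "TransactionDate": "transaction_date",
    "ReportDate": "filing_date",
    "Range": "amount_range",
}


def normalize(payload: dict[str, Any], field_map: dict[str, str]) -> tuple[dict[str, Any], list[str]]:
    """Rename mapped fields and report provider fields the mapping does not cover."""
    row = {internal: payload.get(external) for external, internal in field_map.items()}
    unmapped = sorted(set(payload) - set(field_map))
    return row, unmapped


row, unmapped = normalize(
    {"Representative": "Jane Doe", "Ticker": "XYZ", "Chamber": "House"},
    FIELD_MAP,
)
print(row)
print(unmapped)
```

Reporting unmapped fields makes provider quirks visible when a source adds or renames a column.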