Hiring Cafe Extractor

Browser-backed extractor for Hiring Cafe search APIs.

Special thanks: initial implementation inspiration came from umur957/hiring-cafe-job-scraper.

Environment

  • HIRING_CAFE_SEARCH_TERMS (JSON array, or a list delimited by |, commas, or newlines)
  • HIRING_CAFE_COUNTRY (default: united kingdom)
  • HIRING_CAFE_MAX_JOBS_PER_TERM (default: 200)
  • HIRING_CAFE_DATE_FETCHED_PAST_N_DAYS (default: 7)
  • HIRING_CAFE_LOCATION_QUERY (optional city, e.g. Leeds)
  • HIRING_CAFE_LOCATION_RADIUS_MILES (default: 1 when city is set)
  • HIRING_CAFE_OUTPUT_JSON (default: storage/datasets/default/jobs.json)
  • JOBOPS_EMIT_PROGRESS=1 to emit JOBOPS_PROGRESS events
  • HIRING_CAFE_HEADLESS=false to run headed
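
As a rough illustration of the formats HIRING_CAFE_SEARCH_TERMS accepts, the sketch below parses either a JSON array or a delimited list. The function name parseSearchTerms and its fallback behaviour are assumptions made for this example; the extractor's actual parsing may differ.

```typescript
// Hypothetical helper, not taken from the extractor's source.
// Accepts a JSON array ('["a","b"]') or a list split on |, commas, or newlines.
function parseSearchTerms(raw: string | undefined): string[] {
  const value = (raw ?? "").trim();
  if (value === "") return [];

  // Try the JSON array form first.
  if (value.startsWith("[")) {
    try {
      const parsed: unknown = JSON.parse(value);
      if (Array.isArray(parsed)) {
        return parsed
          .map((term) => String(term).trim())
          .filter((term) => term.length > 0);
      }
    } catch {
      // Not valid JSON: fall through to delimiter splitting (an assumption).
    }
  }

  // Delimited form: split on pipe, comma, or newline, then drop empty entries.
  return value
    .split(/[|,\n]/)
    .map((term) => term.trim())
    .filter((term) => term.length > 0);
}
```

For example, parseSearchTerms("web developer | data analyst") and parseSearchTerms('["web developer","data analyst"]') would both yield the same two terms under these assumptions.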

Notes

  • The extractor encodes the search state as s = base64(URL-encoded JSON).
  • worldwide and usa/ca are treated as broad search modes without hard country location filters.
  • City geocoding uses Nominatim (OpenStreetMap data).
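
The s encoding mentioned in the notes can be sketched as follows. This is a minimal illustration, assuming the state is a plain JSON object; the function names and the field names in the usage note are hypothetical and do not reflect the real Hiring Cafe search-state schema.

```typescript
// Hypothetical helpers illustrating s = base64(URL-encoded JSON).
// The shape of the search state is an assumption for this example.
type SearchState = Record<string, unknown>;

function encodeSearchState(state: SearchState): string {
  // 1. Serialize to JSON, 2. URL-encode, 3. base64-encode.
  // encodeURIComponent yields ASCII only, so btoa is safe here.
  const urlEncoded = encodeURIComponent(JSON.stringify(state));
  return btoa(urlEncoded);
}

function decodeSearchState(s: string): unknown {
  // Reverse the steps: base64-decode, URL-decode, parse JSON.
  return JSON.parse(decodeURIComponent(atob(s)));
}
```

Calling encodeSearchState({ searchQuery: "data analyst" }) returns a base64 string, and decodeSearchState reverses it, which can be handy when inspecting a captured s value while debugging.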