Shaheer Sarfaraz d34a9f041b
Hiring cafe extractor (#192)
* feat(hiringcafe): register new source across shared/server/client enums

* feat(hiringcafe-extractor): add browser-backed Hiring Cafe dataset extractor

* feat(orchestrator): integrate Hiring Cafe discovery service into pipeline

* feat(orchestrator-ui): add Hiring Cafe to source availability and run estimates

* chore(hiringcafe): wire CI/docker and add extractor documentation

* chore(format): apply biome formatting for Hiring Cafe integration

* add original websites

* comments

* number or null
2026-02-19 12:51:55 +00:00


Hiring Cafe Extractor

Browser-backed extractor for Hiring Cafe search APIs.

Special thanks: the initial implementation was inspired by umur957/hiring-cafe-job-scraper.

Environment

  • HIRING_CAFE_SEARCH_TERMS (a JSON array, or a list delimited by |, commas, or newlines)
  • HIRING_CAFE_COUNTRY (default: united kingdom)
  • HIRING_CAFE_MAX_JOBS_PER_TERM (default: 200)
  • HIRING_CAFE_DATE_FETCHED_PAST_N_DAYS (default: 7)
  • HIRING_CAFE_OUTPUT_JSON (default: storage/datasets/default/jobs.json)
  • JOBOPS_EMIT_PROGRESS=1 to emit JOBOPS_PROGRESS events
  • HIRING_CAFE_HEADLESS=false to run the browser headed (with a visible window)
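
The accepted formats for HIRING_CAFE_SEARCH_TERMS can be illustrated with a small parser. This is a minimal sketch, not the extractor's actual code: the function name parseSearchTerms is hypothetical, and falling back to delimiter splitting when the JSON is invalid is an assumption.

```typescript
// Hypothetical helper (not taken from the extractor): turns the raw value of
// HIRING_CAFE_SEARCH_TERMS into a clean list of search terms.
function parseSearchTerms(raw: string | undefined): string[] {
  const value = (raw ?? "").trim();
  if (value === "") return [];

  // Form 1: a JSON array such as ["data engineer","backend developer"].
  if (value.startsWith("[")) {
    try {
      const parsed: unknown = JSON.parse(value);
      if (Array.isArray(parsed)) {
        return parsed
          .map((term) => String(term).trim())
          .filter((term) => term.length > 0);
      }
    } catch {
      // Assumption: invalid JSON falls through to delimiter splitting.
    }
  }

  // Form 2: a list delimited by |, commas, or newlines.
  return value
    .split(/[|,\n]/)
    .map((term) => term.trim())
    .filter((term) => term.length > 0);
}
```

Under these assumptions, parseSearchTerms applied to the value data engineer | backend developer yields two terms, and an unset variable yields an empty list.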

Notes

  • The extractor encodes the search state as s = base64(url-encoded JSON): the state is serialized to JSON, URL-encoded, and then base64-encoded.
  • worldwide and usa/ca are treated as broad search modes without hard country location filters.
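
The s encoding described in the notes can be sketched as follows. This is an illustration under stated assumptions, not the extractor's implementation: the function names are hypothetical, encodeURIComponent is assumed to be the URL-encoding step, and the fields of the example state are made up rather than taken from Hiring Cafe's real search state.

```typescript
// Hypothetical helpers illustrating s = base64(url-encoded JSON).
// Assumption: "url-encoded" means encodeURIComponent.
function encodeSearchState(state: unknown): string {
  const json = JSON.stringify(state);
  // encodeURIComponent produces ASCII only, so btoa cannot fail on it.
  return btoa(encodeURIComponent(json));
}

// Inverse operation, useful for inspecting an existing s value.
function decodeSearchState(s: string): unknown {
  return JSON.parse(decodeURIComponent(atob(s)));
}

// Example state with made-up field names.
const exampleState = { query: "data engineer", pastDays: 7 };
const s = encodeSearchState(exampleState);
const roundTripped = decodeSearchState(s);
```

URL-encoding before base64 keeps non-ASCII characters in the search state safe, since btoa only accepts characters in the Latin-1 range.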