Hiring cafe extractor (#192)

* feat(hiringcafe): register new source across shared/server/client enums * feat(hiringcafe-extractor): add browser-backed Hiring Cafe dataset extractor * feat(orchestrator): integrate Hiring Cafe discovery service into pipeline * feat(orchestrator-ui): add Hiring Cafe to source availability and run estimates * chore(hiringcafe): wire CI/docker and add extractor documentation * chore(format): apply biome formatting for Hiring Cafe integration * add original websites * coomints * number or null
2026-02-19 12:51:55 +00:00 · 2026-02-19 12:51:55 +00:00 · d34a9f041b
commit d34a9f041b
parent 16dd17ebea
31 changed files with 1363 additions and 5 deletions
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@ -52,6 +52,7 @@ jobs:
        project:
          - orchestrator
          - adzuna-extractor
+          - hiringcafe-extractor
          - gradcracker-extractor
          - ukvisajobs-extractor
    steps:
--- a/4
+++ b/4
@ -36,6 +36,7 @@ COPY docs-site/package*.json ./docs-site/
 COPY shared/package*.json ./shared/
 COPY orchestrator/package*.json ./orchestrator/
 COPY extractors/adzuna/package*.json ./extractors/adzuna/
+COPY extractors/hiringcafe/package*.json ./extractors/hiringcafe/
 COPY extractors/gradcracker/package*.json ./extractors/gradcracker/
 COPY extractors/ukvisajobs/package*.json ./extractors/ukvisajobs/

@ -54,6 +55,7 @@ COPY shared ./shared
 COPY docs-site ./docs-site
 COPY orchestrator ./orchestrator
 COPY extractors/adzuna ./extractors/adzuna
+COPY extractors/hiringcafe ./extractors/hiringcafe
 COPY extractors/gradcracker ./extractors/gradcracker
 COPY extractors/jobspy ./extractors/jobspy
 COPY extractors/ukvisajobs ./extractors/ukvisajobs
@ -100,6 +102,7 @@ COPY docs-site/package*.json ./docs-site/
 COPY shared/package*.json ./shared/
 COPY orchestrator/package*.json ./orchestrator/
 COPY extractors/adzuna/package*.json ./extractors/adzuna/
+COPY extractors/hiringcafe/package*.json ./extractors/hiringcafe/
 COPY extractors/gradcracker/package*.json ./extractors/gradcracker/
 COPY extractors/ukvisajobs/package*.json ./extractors/ukvisajobs/

@ -114,6 +117,7 @@ COPY --from=builder /app/docs-site/build ./orchestrator/dist/docs
 COPY shared ./shared
 COPY orchestrator ./orchestrator
 COPY extractors/adzuna ./extractors/adzuna
+COPY extractors/hiringcafe ./extractors/hiringcafe
 COPY extractors/gradcracker ./extractors/gradcracker
 COPY extractors/jobspy ./extractors/jobspy
 COPY extractors/ukvisajobs ./extractors/ukvisajobs
--- a/docker-compose.yml
+++ b/docker-compose.yml
@ -47,6 +47,9 @@ services:
        - path: ./extractors/gradcracker/src
          target: /app/extractors/gradcracker/src
          action: sync+restart
+        - path: ./extractors/hiringcafe/src
+          target: /app/extractors/hiringcafe/src
+          action: sync+restart
        - path: ./extractors/ukvisajobs/src
          target: /app/extractors/ukvisajobs/src
          action: sync+restart
--- a/docs-site/docs/extractors/adzuna.md
+++ b/docs-site/docs/extractors/adzuna.md
@ -7,6 +7,8 @@ sidebar_position: 6

 ## What it is

+Original website: [adzuna.com](https://www.adzuna.com)
+
 Adzuna is an API-backed extractor implemented in two lean pieces:

 1. `extractors/adzuna/src/main.ts` fetches paginated Adzuna search results and writes `jobs.json`.
--- a/docs-site/docs/extractors/gradcracker.md
+++ b/docs-site/docs/extractors/gradcracker.md
@ -7,6 +7,8 @@ sidebar_position: 2

 A plain-English walkthrough of the Gradcracker extractor in `extractors/gradcracker`.

+Original website: [gradcracker.com](https://www.gradcracker.com)
+
 ## Big picture

 The crawler builds search URLs, scrapes listing pages, then opens job details for descriptions and apply URLs.
--- a/docs-site/docs/extractors/hiring-cafe.md
+++ b/docs-site/docs/extractors/hiring-cafe.md
@ -0,0 +1,74 @@
+---
+id: hiring-cafe
+title: Hiring Cafe Extractor
+description: Browser-backed Hiring Cafe extraction integrated into the pipeline source selector.
+sidebar_position: 7
+---
+
+## What it is
+
+Original website: [hiring.cafe](https://hiring.cafe)
+
+Special thanks: Initial implementation inspiration came from [umur957/hiring-cafe-job-scraper](https://github.com/umur957/hiring-cafe-job-scraper).
+
+Hiring Cafe is a browser-backed extractor that queries Hiring Cafe search APIs and maps results into the orchestrator `CreateJobInput` shape.
+
+Implementation split:
+
+1. `extractors/hiringcafe/src/main.ts` builds search state, calls Hiring Cafe APIs, and writes dataset JSON.
+2. `orchestrator/src/server/services/hiring-cafe.ts` runs the extractor, streams progress events, and maps rows for pipeline import.
+
+## Why it exists
+
+Hiring Cafe adds another non-credentialed source that can be enabled from the existing source picker, without adding new settings UI.
+
+It also supports term-by-term search and country-aware search state using the same pipeline knobs you already set for automatic runs.
+
+## How to use it
+
+1. Open **Run jobs** and choose **Automatic**.
+2. **Hiring Cafe** is enabled by default in **Sources** (toggle it off if you do not want it for this run).
+3. Set your existing automatic run knobs:
+   - `searchTerms` drive per-term Hiring Cafe `searchQuery`.
+   - selected country maps into Hiring Cafe location search state.
+   - run budget path (`jobspyResultsWanted`) is reused as the max jobs-per-term cap.
+4. Start the run and watch progress in the pipeline progress card.
+
+Defaults and constraints:
+
+- No new Hiring Cafe settings fields were added.
+- `worldwide` and `usa/ca` run in broad mode without a strict country location filter.
+- Hiring Cafe is enabled by default in source selection.
+- `HIRING_CAFE_DATE_FETCHED_PAST_N_DAYS` controls recency window when running extractor directly (default `7`).
+
+Local run example:
+
+```bash
+HIRING_CAFE_SEARCH_TERMS='["backend engineer"]' \
+HIRING_CAFE_COUNTRY='united kingdom' \
+HIRING_CAFE_MAX_JOBS_PER_TERM='50' \
+npm --workspace hiringcafe-extractor run start
+```
+
+## Common problems
+
+### Hiring Cafe returns 429 / Vercel security checkpoint
+
+- The extractor first attempts Camoufox-backed Firefox and falls back to vanilla Firefox startup if Camoufox is unstable locally.
+- If upstream blocks continue, retry later or reduce run concurrency at the pipeline level by selecting fewer sources.
+
+### Hiring Cafe does not appear in sources
+
+- Check that client is running on latest build containing the new source list.
+- Hiring Cafe is source-only and does not require credentials, so it should appear once the new build is loaded.
+
+### Results are lower than expected
+
+- Cap is tied to automatic run budget path (`jobspyResultsWanted`) and search term count.
+- Country mapping can narrow results when a strict country location is applied.
+
+## Related pages
+
+- [Extractors Overview](/docs/next/extractors/overview)
+- [Pipeline Run](/docs/next/features/pipeline-run)
+- [Settings](/docs/next/features/settings)
--- a/docs-site/docs/extractors/jobspy.md
+++ b/docs-site/docs/extractors/jobspy.md
@ -7,6 +7,11 @@ sidebar_position: 3

 A walkthrough of the JobSpy extractor for Indeed, LinkedIn, and Glassdoor.

+Original websites:
+- [indeed.com](https://www.indeed.com)
+- [linkedin.com/jobs](https://www.linkedin.com/jobs)
+- [glassdoor.com](https://www.glassdoor.com)
+
 ## Big picture

 JobSpy runs as a Python script per search term, writes JSON, then orchestrator ingests and normalizes into internal job shape.
--- a/docs-site/docs/extractors/overview.md
+++ b/docs-site/docs/extractors/overview.md
@ -14,6 +14,7 @@ This page helps you choose the right extractor for your run, understand key cons
 | [Gradcracker](/docs/next/extractors/gradcracker) | UK graduate roles from Gradcracker | Crawling stability depends on page structure and anti-bot behavior; tuned for low concurrency | `GRADCRACKER_SEARCH_TERMS`, `GRADCRACKER_MAX_JOBS_PER_TERM`, `JOBOPS_SKIP_APPLY_FOR_EXISTING` | Scrapes listing metadata, then detail pages and apply URL resolution |
 | [JobSpy](/docs/next/extractors/jobspy) | Multi-source discovery (Indeed, LinkedIn, Glassdoor) | Requires Python wrapper execution per term; source availability and quality vary by site/location | `JOBSPY_SITES`, `JOBSPY_SEARCH_TERMS`, `JOBSPY_RESULTS_WANTED`, `JOBSPY_HOURS_OLD`, `JOBSPY_LINKEDIN_FETCH_DESCRIPTION` | Produces JSON per term, then orchestrator normalizes and de-duplicates by `jobUrl` |
 | [Adzuna](/docs/next/extractors/adzuna) | API-based multi-country discovery with low scraping overhead | Requires valid App ID/App Key; country must be in Adzuna-supported list | `ADZUNA_APP_ID`, `ADZUNA_APP_KEY`, `ADZUNA_MAX_JOBS_PER_TERM` | API pagination to dataset output; orchestrator maps progress and de-duplicates by `sourceJobId`/`jobUrl` |
+| [Hiring Cafe](/docs/next/extractors/hiring-cafe) | Browser-backed discovery using Hiring Cafe search APIs | Subject to upstream anti-bot checks; uses browser context and encoded search-state payloads | `HIRING_CAFE_SEARCH_TERMS`, `HIRING_CAFE_COUNTRY`, `HIRING_CAFE_MAX_JOBS_PER_TERM`, `HIRING_CAFE_DATE_FETCHED_PAST_N_DAYS` | Uses existing pipeline term/country/budget knobs and maps directly to normalized jobs |
 | [UKVisaJobs](/docs/next/extractors/ukvisajobs) | UK visa sponsorship-focused roles | Requires authenticated session and periodic token/cookie refresh | `UKVISAJOBS_EMAIL`, `UKVISAJOBS_PASSWORD`, `UKVISAJOBS_MAX_JOBS`, `UKVISAJOBS_SEARCH_KEYWORD` | API pagination + dataset output; orchestrator de-dupes and may fetch missing descriptions |
 | [Manual Import](/docs/next/extractors/manual) | One-off jobs not covered by scrapers | Inference quality depends on model/provider and input quality; some URLs cannot be fetched reliably | App/API endpoints (`/api/manual-jobs/infer`, `/api/manual-jobs/import`) | Accepts text/HTML/URL, runs inference, then saves and scores job after review |

@ -21,6 +22,7 @@ This page helps you choose the right extractor for your run, understand key cons

 - Use **JobSpy** for broad first-pass sourcing across common boards.
 - Use **Adzuna** when you want API-first discovery in supported non-UK markets.
+- Use **Hiring Cafe** when you want another term/country-driven source without adding credentials.
 - Use **Gradcracker** when targeting graduate pipelines in the UK.
 - Use **UKVisaJobs** for sponsorship-specific UK searches.
 - Use **Manual Import** when you already have a specific posting and need direct import.
@ -32,5 +34,6 @@ Many runs combine sources: broad discovery first, then manual import for high-pr
 - [Gradcracker](/docs/next/extractors/gradcracker)
 - [JobSpy](/docs/next/extractors/jobspy)
 - [Adzuna](/docs/next/extractors/adzuna)
+- [Hiring Cafe](/docs/next/extractors/hiring-cafe)
 - [UKVisaJobs](/docs/next/extractors/ukvisajobs)
 - [Manual Import](/docs/next/extractors/manual)
--- a/docs-site/docs/extractors/ukvisajobs.md
+++ b/docs-site/docs/extractors/ukvisajobs.md
@ -7,6 +7,8 @@ sidebar_position: 5

 UKVisaJobs is the most complex extractor because authenticated sessions are required.

+Original website: [my.ukvisajobs.com](https://my.ukvisajobs.com)
+
 ## Big picture

 Two layers:
--- a/docs-site/sidebars.ts
+++ b/docs-site/sidebars.ts
@ -45,6 +45,7 @@ const sidebars: SidebarsConfig = {
        "extractors/gradcracker",
        "extractors/jobspy",
        "extractors/adzuna",
+        "extractors/hiring-cafe",
        "extractors/manual",
        "extractors/ukvisajobs",
      ],
--- a/extractors/hiringcafe/README.md
+++ b/extractors/hiringcafe/README.md
@ -0,0 +1,20 @@
+# Hiring Cafe Extractor
+
+Browser-backed extractor for Hiring Cafe search APIs.
+
+Special thanks: initial implementation inspiration came from [umur957/hiring-cafe-job-scraper](https://github.com/umur957/hiring-cafe-job-scraper).
+
+## Environment
+
+- `HIRING_CAFE_SEARCH_TERMS` (JSON array or `|` / comma / newline-delimited)
+- `HIRING_CAFE_COUNTRY` (default: `united kingdom`)
+- `HIRING_CAFE_MAX_JOBS_PER_TERM` (default: `200`)
+- `HIRING_CAFE_DATE_FETCHED_PAST_N_DAYS` (default: `7`)
+- `HIRING_CAFE_OUTPUT_JSON` (default: `storage/datasets/default/jobs.json`)
+- `JOBOPS_EMIT_PROGRESS=1` to emit `JOBOPS_PROGRESS` events
+- `HIRING_CAFE_HEADLESS=false` to run headed
+
+## Notes
+
+- The extractor uses `s = base64(url-encoded JSON search state)`.
+- `worldwide` and `usa/ca` are treated as broad search modes without hard country location filters.
--- a/extractors/hiringcafe/package.json
+++ b/extractors/hiringcafe/package.json
@ -0,0 +1,26 @@
+{
+  "name": "hiringcafe-extractor",
+  "version": "0.0.1",
+  "type": "module",
+  "description": "Hiring Cafe extractor - fetches jobs via browser-backed API requests",
+  "main": "src/main.ts",
+  "dependencies": {
+    "camoufox-js": "^0.8.0",
+    "job-ops-shared": "^1.0.0",
+    "playwright": "^1.57.0",
+    "tsx": "^4.4.0"
+  },
+  "devDependencies": {
+    "@types/node": "^24.0.0",
+    "typescript": "~5.9.0"
+  },
+  "optionalDependencies": {
+    "impit-linux-x64-gnu": "^0.1.0"
+  },
+  "scripts": {
+    "start": "tsx src/main.ts",
+    "start:dev": "tsx src/main.ts",
+    "check:types": "tsc --noEmit",
+    "get-binaries": "camoufox-js fetch"
+  }
+}
--- a/extractors/hiringcafe/src/country-map.ts
+++ b/extractors/hiringcafe/src/country-map.ts
@ -0,0 +1,118 @@
+export function normalizeCountryKey(value: string | null | undefined): string {
+  const normalized = value?.trim().toLowerCase() ?? "";
+  if (normalized === "uk") return "united kingdom";
+  if (normalized === "us" || normalized === "usa") return "united states";
+  if (normalized === "türkiye") return "turkey";
+  if (normalized === "czech republic") return "czechia";
+  return normalized;
+}
+
+export interface HiringCafeCountryLocation {
+  formatted_address: string;
+  types: ["country"];
+  id: "user_country";
+  address_components: Array<{
+    long_name: string;
+    short_name: string;
+    types: ["country"];
+  }>;
+  options: {
+    flexible_regions: ["anywhere_in_continent", "anywhere_in_world"];
+  };
+}
+
+const GLOBAL_SEARCH_KEYS = new Set(["worldwide", "usa/ca"]);
+
+const COUNTRY_NAME_OVERRIDES: Record<string, string> = {
+  "united states": "United States",
+  "united kingdom": "United Kingdom",
+  "united arab emirates": "United Arab Emirates",
+  "new zealand": "New Zealand",
+  "south korea": "South Korea",
+  "south africa": "South Africa",
+  "costa rica": "Costa Rica",
+  "saudi arabia": "Saudi Arabia",
+  "hong kong": "Hong Kong",
+  czechia: "Czechia",
+  türkiye: "Turkey",
+  turkey: "Turkey",
+};
+
+const ISO2_ALIASES: Record<string, string> = {
+  "united states": "US",
+  "united kingdom": "GB",
+  "united arab emirates": "AE",
+  "new zealand": "NZ",
+  "south korea": "KR",
+  "south africa": "ZA",
+  "costa rica": "CR",
+  "saudi arabia": "SA",
+  "hong kong": "HK",
+  czechia: "CZ",
+  türkiye: "TR",
+  turkey: "TR",
+};
+
+const regionNameMap = buildRegionNameMap();
+
+function buildRegionNameMap(): Map<string, string> {
+  const names = new Intl.DisplayNames(["en"], { type: "region" });
+  const map = new Map<string, string>();
+
+  for (let i = 65; i <= 90; i += 1) {
+    for (let j = 65; j <= 90; j += 1) {
+      const iso2 = String.fromCharCode(i, j);
+      const displayName = names.of(iso2);
+      if (!displayName || displayName === iso2) continue;
+      map.set(normalizeCountryKey(displayName), iso2);
+    }
+  }
+
+  return map;
+}
+
+function toCountryLabel(countryKey: string): string {
+  const override = COUNTRY_NAME_OVERRIDES[countryKey];
+  if (override) return override;
+  return countryKey.replace(/\b\w/g, (char) => char.toUpperCase());
+}
+
+function toIso2(countryKey: string): string | null {
+  if (ISO2_ALIASES[countryKey]) {
+    return ISO2_ALIASES[countryKey];
+  }
+  return regionNameMap.get(countryKey) ?? null;
+}
+
+export function shouldUseGlobalLocation(countryInput?: string | null): boolean {
+  const countryKey = normalizeCountryKey(countryInput);
+  return !countryKey || GLOBAL_SEARCH_KEYS.has(countryKey);
+}
+
+export function resolveHiringCafeCountryLocation(
+  countryInput?: string | null,
+): HiringCafeCountryLocation | null {
+  const countryKey = normalizeCountryKey(countryInput);
+  if (!countryKey || GLOBAL_SEARCH_KEYS.has(countryKey)) return null;
+
+  const iso2 = toIso2(countryKey);
+  if (!iso2) return null;
+
+  const longName = toCountryLabel(countryKey);
+
+  return {
+    formatted_address: longName,
+    types: ["country"],
+    id: "user_country",
+    address_components: [
+      {
+        long_name: longName,
+        short_name: iso2,
+        types: ["country"],
+      },
+    ],
+    options: {
+      flexible_regions: ["anywhere_in_continent", "anywhere_in_world"],
+    },
+  };
+}
--- a/extractors/hiringcafe/src/default-search-state.ts
+++ b/extractors/hiringcafe/src/default-search-state.ts
@ -0,0 +1,91 @@
+import type { HiringCafeCountryLocation } from "./country-map.js";
+
+export interface HiringCafeSearchState {
+  locations: HiringCafeCountryLocation[];
+  workplaceTypes: Array<"Remote" | "Hybrid" | "Onsite">;
+  defaultToUserLocation: boolean;
+  userLocation: null;
+  commitmentTypes: string[];
+  seniorityLevel: string[];
+  roleTypes: string[];
+  roleYoeRange: [number, number];
+  excludeIfRoleYoeIsNotSpecified: boolean;
+  managementYoeRange: [number, number];
+  excludeIfManagementYoeIsNotSpecified: boolean;
+  securityClearances: string[];
+  searchQuery: string;
+  dateFetchedPastNDays: number;
+  hiddenCompanies: string[];
+  sortBy: "default";
+  companyPublicOrPrivate: "all";
+  latestInvestmentYearRange: [null, null];
+  latestInvestmentSeries: string[];
+  latestInvestmentAmount: null;
+  latestInvestmentCurrency: string[];
+  investors: string[];
+  excludedInvestors: string[];
+  isNonProfit: "all";
+  companySizeRanges: string[];
+  minYearFounded: null;
+  maxYearFounded: null;
+  excludedLatestInvestmentSeries: string[];
+}
+
+export function createDefaultSearchState(args: {
+  searchQuery: string;
+  location: HiringCafeCountryLocation | null;
+  dateFetchedPastNDays: number;
+}): HiringCafeSearchState {
+  return {
+    locations: args.location ? [args.location] : [],
+    workplaceTypes: ["Remote", "Hybrid", "Onsite"],
+    defaultToUserLocation: false,
+    userLocation: null,
+    commitmentTypes: [
+      "Full Time",
+      "Part Time",
+      "Contract",
+      "Internship",
+      "Temporary",
+      "Seasonal",
+      "Volunteer",
+    ],
+    seniorityLevel: [
+      "No Prior Experience Required",
+      "Entry Level",
+      "Mid Level",
+      "Senior Level",
+    ],
+    roleTypes: ["Individual Contributor", "People Manager"],
+    roleYoeRange: [0, 20],
+    excludeIfRoleYoeIsNotSpecified: false,
+    managementYoeRange: [0, 20],
+    excludeIfManagementYoeIsNotSpecified: false,
+    securityClearances: [
+      "None",
+      "Confidential",
+      "Secret",
+      "Top Secret",
+      "Top Secret/SCI",
+      "Public Trust",
+      "Interim Clearances",
+      "Other",
+    ],
+    searchQuery: args.searchQuery,
+    dateFetchedPastNDays: args.dateFetchedPastNDays,
+    hiddenCompanies: [],
+    sortBy: "default",
+    companyPublicOrPrivate: "all",
+    latestInvestmentYearRange: [null, null],
+    latestInvestmentSeries: [],
+    latestInvestmentAmount: null,
+    latestInvestmentCurrency: [],
+    investors: [],
+    excludedInvestors: [],
+    isNonProfit: "all",
+    companySizeRanges: [],
+    minYearFounded: null,
+    maxYearFounded: null,
+    excludedLatestInvestmentSeries: [],
+  };
+}
--- a/extractors/hiringcafe/src/main.ts
+++ b/extractors/hiringcafe/src/main.ts
@ -0,0 +1,439 @@
+import { mkdir, writeFile } from "node:fs/promises";
+import { dirname, join } from "node:path";
+import { fileURLToPath } from "node:url";
+import { launchOptions } from "camoufox-js";
+import {
+  toNumberOrNull,
+  toStringOrNull,
+} from "job-ops-shared/utils/type-conversion";
+import { firefox, type Page } from "playwright";
+import {
+  normalizeCountryKey,
+  resolveHiringCafeCountryLocation,
+} from "./country-map.js";
+import { createDefaultSearchState } from "./default-search-state.js";
+
+const __dirname = dirname(fileURLToPath(import.meta.url));
+const BASE_URL = "https://hiring.cafe";
+const JOBOPS_PROGRESS_PREFIX = "JOBOPS_PROGRESS ";
+const DEFAULT_MAX_JOBS_PER_TERM = 200;
+const DEFAULT_SEARCH_TERM = "web developer";
+const DEFAULT_DATE_FETCHED_PAST_N_DAYS = 30;
+const PAGE_LIMIT = 50;
+
+type RawHiringCafeJob = Record<string, unknown>;
+
+interface ExtractedJob {
+  source: "hiringcafe";
+  sourceJobId?: string;
+  title: string;
+  employer: string;
+  jobUrl: string;
+  applicationLink: string;
+  location?: string;
+  salary?: string;
+  datePosted?: string;
+  jobDescription?: string;
+  jobType?: string;
+}
+
+interface BrowserApiResponse {
+  ok: boolean;
+  status: number;
+  statusText: string;
+  data: unknown;
+  responseText: string;
+}
+
+function emitProgress(payload: Record<string, unknown>): void {
+  if (process.env.JOBOPS_EMIT_PROGRESS !== "1") return;
+  console.log(`${JOBOPS_PROGRESS_PREFIX}${JSON.stringify(payload)}`);
+}
+
+function parsePositiveInt(input: string | undefined, fallback: number): number {
+  const parsed = input ? Number.parseInt(input, 10) : Number.NaN;
+  if (!Number.isFinite(parsed) || parsed < 1) return fallback;
+  return parsed;
+}
+
+function parseSearchTerms(raw: string | undefined): string[] {
+  if (!raw || raw.trim().length === 0) return [DEFAULT_SEARCH_TERM];
+
+  const trimmed = raw.trim();
+  if (trimmed.startsWith("[")) {
+    try {
+      const parsed = JSON.parse(trimmed) as unknown;
+      if (Array.isArray(parsed)) {
+        const terms = parsed
+          .map((value) => toStringOrNull(value))
+          .filter((value): value is string => Boolean(value));
+        if (terms.length > 0) return terms;
+      }
+    } catch {
+      // Fall through to delimiter parsing.
+    }
+  }
+
+  const delimiter = trimmed.includes("|")
+    ? "|"
+    : trimmed.includes("\n")
+      ? "\n"
+      : ",";
+
+  const terms = trimmed
+    .split(delimiter)
+    .map((value) => value.trim())
+    .filter(Boolean);
+
+  return terms.length > 0 ? terms : [DEFAULT_SEARCH_TERM];
+}
+
+function encodeSearchState(searchState: unknown): string {
+  const json = JSON.stringify(searchState);
+  const urlEncodedJson = encodeURIComponent(json);
+  return Buffer.from(urlEncodedJson, "utf-8").toString("base64");
+}
+
+function asRecord(value: unknown): Record<string, unknown> | null {
+  if (!value || typeof value !== "object" || Array.isArray(value)) return null;
+  return value as Record<string, unknown>;
+}
+
+function asStringArray(value: unknown): string[] {
+  if (!Array.isArray(value)) return [];
+  return value
+    .map((item) => toStringOrNull(item))
+    .filter((item): item is string => Boolean(item));
+}
+
+function firstArrayValue(value: unknown): string | null {
+  const values = asStringArray(value);
+  return values.length > 0 ? values[0] : null;
+}
+
+function formatCompensation(
+  processedJobData: Record<string, unknown> | null,
+): string | undefined {
+  if (!processedJobData) return undefined;
+
+  const min = toNumberOrNull(processedJobData.yearly_min_compensation);
+  const max = toNumberOrNull(processedJobData.yearly_max_compensation);
+  if (min === null && max === null) return undefined;
+
+  const currency = toStringOrNull(
+    processedJobData.listed_compensation_currency,
+  );
+  const frequency =
+    toStringOrNull(processedJobData.listed_compensation_frequency) ?? "Yearly";
+
+  const amount =
+    min !== null && max !== null
+      ? `${Math.round(min)}-${Math.round(max)}`
+      : min !== null
+        ? `${Math.round(min)}+`
+        : `${Math.round(max ?? 0)}`;
+
+  const parts = [currency, amount, frequency ? `/ ${frequency}` : ""]
+    .filter(Boolean)
+    .join(" ")
+    .trim();
+
+  return parts || undefined;
+}
+
+function mapHiringCafeJob(raw: RawHiringCafeJob): ExtractedJob | null {
+  const jobInformation = asRecord(raw.job_information);
+  const processed = asRecord(raw.v5_processed_job_data);
+  const companyInfo = asRecord(jobInformation?.company_info);
+
+  const sourceJobId =
+    toStringOrNull(raw.id) ??
+    toStringOrNull(raw.objectID) ??
+    toStringOrNull(raw.original_source_id) ??
+    toStringOrNull(raw.requisition_id) ??
+    undefined;
+
+  const jobUrl = toStringOrNull(raw.apply_url);
+  if (!jobUrl) return null;
+
+  const title =
+    toStringOrNull(jobInformation?.title) ??
+    toStringOrNull(jobInformation?.job_title_raw) ??
+    toStringOrNull(processed?.core_job_title) ??
+    "Unknown Title";
+
+  const employer =
+    toStringOrNull(companyInfo?.name) ??
+    toStringOrNull(processed?.company_name) ??
+    "Unknown Employer";
+
+  const location =
+    toStringOrNull(processed?.formatted_workplace_location) ??
+    firstArrayValue(processed?.workplace_cities) ??
+    firstArrayValue(processed?.workplace_states) ??
+    firstArrayValue(processed?.workplace_countries) ??
+    undefined;
+
+  const commitments = asStringArray(processed?.commitment);
+  const jobType = commitments.length > 0 ? commitments.join(", ") : undefined;
+
+  return {
+    source: "hiringcafe",
+    sourceJobId,
+    title,
+    employer,
+    jobUrl,
+    applicationLink: jobUrl,
+    location,
+    salary: formatCompensation(processed),
+    datePosted: toStringOrNull(processed?.estimated_publish_date) ?? undefined,
+    jobDescription: toStringOrNull(jobInformation?.description) ?? undefined,
+    jobType,
+  };
+}
+
+function extractResultsBatch(payload: unknown): RawHiringCafeJob[] {
+  if (Array.isArray(payload)) {
+    return payload.filter(
+      (item): item is RawHiringCafeJob =>
+        Boolean(item) && typeof item === "object" && !Array.isArray(item),
+    );
+  }
+
+  const payloadRecord = asRecord(payload);
+  const results = payloadRecord?.results;
+  if (!Array.isArray(results)) return [];
+
+  return results.filter(
+    (item): item is RawHiringCafeJob =>
+      Boolean(item) && typeof item === "object" && !Array.isArray(item),
+  );
+}
+
+function parseTotalCount(payload: unknown): number | null {
+  const payloadRecord = asRecord(payload);
+  if (!payloadRecord) return null;
+  return toNumberOrNull(payloadRecord.total);
+}
+
+async function callHiringCafeApi(
+  page: Page,
+  endpoint: string,
+  params: Record<string, string>,
+): Promise<unknown> {
+  const response = await page.evaluate(
+    async ({ endpointArg, paramsArg }) => {
+      const url = new URL(endpointArg, window.location.origin);
+      for (const [key, value] of Object.entries(paramsArg)) {
+        url.searchParams.set(key, value);
+      }
+
+      const res = await fetch(url.toString(), {
+        method: "GET",
+        credentials: "include",
+        headers: {
+          Accept: "application/json, text/plain, */*",
+        },
+      });
+
+      const text = await res.text();
+      let data: unknown = null;
+      try {
+        data = JSON.parse(text);
+      } catch {
+        // Keep response text for diagnostics.
+      }
+
+      const output: BrowserApiResponse = {
+        ok: res.ok,
+        status: res.status,
+        statusText: res.statusText,
+        data,
+        responseText: text,
+      };
+
+      return output;
+    },
+    { endpointArg: endpoint, paramsArg: params },
+  );
+
+  const result = response as BrowserApiResponse;
+
+  if (!result.ok) {
+    const snippet = result.responseText.slice(0, 250);
+    throw new Error(
+      `Hiring Cafe API ${endpoint} failed (${result.status} ${result.statusText}): ${snippet}`,
+    );
+  }
+
+  if (result.data === null) {
+    const snippet = result.responseText.slice(0, 250);
+    throw new Error(
+      `Hiring Cafe API ${endpoint} returned non-JSON response: ${snippet}`,
+    );
+  }
+
+  return result.data;
+}
+
+async function run(): Promise<void> {
+  const searchTerms = parseSearchTerms(process.env.HIRING_CAFE_SEARCH_TERMS);
+  const country = normalizeCountryKey(
+    process.env.HIRING_CAFE_COUNTRY ?? "united kingdom",
+  );
+  const maxJobsPerTerm = parsePositiveInt(
+    process.env.HIRING_CAFE_MAX_JOBS_PER_TERM,
+    DEFAULT_MAX_JOBS_PER_TERM,
+  );
+  const dateFetchedPastNDays = parsePositiveInt(
+    process.env.HIRING_CAFE_DATE_FETCHED_PAST_N_DAYS,
+    DEFAULT_DATE_FETCHED_PAST_N_DAYS,
+  );
+  const outputPath =
+    process.env.HIRING_CAFE_OUTPUT_JSON ||
+    join(__dirname, "../storage/datasets/default/jobs.json");
+  const headless = process.env.HIRING_CAFE_HEADLESS !== "false";
+
+  let browser = await firefox.launch(
+    await launchOptions({
+      headless,
+      humanize: true,
+      geoip: true,
+    }),
+  );
+  let context = await browser.newContext();
+  let page = await context.newPage();
+
+  const allJobs: ExtractedJob[] = [];
+  const seen = new Set<string>();
+
+  try {
+    const initializePage = async () => {
+      await page.goto(BASE_URL, {
+        waitUntil: "domcontentloaded",
+        timeout: 60_000,
+      });
+      await page.waitForTimeout(2_000);
+    };
+
+    try {
+      await initializePage();
+    } catch (error) {
+      const message = error instanceof Error ? error.message : String(error);
+      console.warn(
+        `Camoufox browser startup was unstable, retrying with vanilla Firefox: ${message}`,
+      );
+      await browser.close();
+      browser = await firefox.launch({ headless });
+      context = await browser.newContext();
+      page = await context.newPage();
+      await initializePage();
+    }
+
+    for (let i = 0; i < searchTerms.length; i += 1) {
+      const searchTerm = searchTerms[i];
+      const termIndex = i + 1;
+
+      emitProgress({
+        event: "term_start",
+        termIndex,
+        termTotal: searchTerms.length,
+        searchTerm,
+      });
+
+      const location = resolveHiringCafeCountryLocation(country);
+      const searchState = createDefaultSearchState({
+        searchQuery: searchTerm,
+        location,
+        dateFetchedPastNDays,
+      });
+      const encodedSearchState = encodeSearchState(searchState);
+
+      let totalAvailable: number | null = null;
+      try {
+        const countPayload = await callHiringCafeApi(
+          page,
+          "/api/search-jobs/get-total-count",
+          {
+            s: encodedSearchState,
+          },
+        );
+        totalAvailable = parseTotalCount(countPayload);
+      } catch (error) {
+        const message = error instanceof Error ? error.message : String(error);
+        console.warn(
+          `Hiring Cafe count request failed for term '${searchTerm}': ${message}`,
+        );
+      }
+
+      const termTarget =
+        totalAvailable !== null
+          ? Math.min(maxJobsPerTerm, totalAvailable)
+          : maxJobsPerTerm;
+
+      let pageNo = 0;
+      let termCollected = 0;
+
+      while (termCollected < termTarget && pageNo < PAGE_LIMIT) {
+        const size = Math.min(1000, termTarget - termCollected);
+        const jobsPayload = await callHiringCafeApi(page, "/api/search-jobs", {
+          size: String(size),
+          page: String(pageNo),
+          s: encodedSearchState,
+        });
+
+        const batch = extractResultsBatch(jobsPayload);
+        if (batch.length === 0) break;
+
+        let mappedOnPage = 0;
+        for (const rawJob of batch) {
+          if (termCollected >= termTarget) break;
+          const mapped = mapHiringCafeJob(rawJob);
+          if (!mapped) continue;
+
+          const dedupeKey = mapped.sourceJobId || mapped.jobUrl;
+          if (seen.has(dedupeKey)) continue;
+          seen.add(dedupeKey);
+
+          allJobs.push(mapped);
+          termCollected += 1;
+          mappedOnPage += 1;
+        }
+
+        emitProgress({
+          event: "page_fetched",
+          termIndex,
+          termTotal: searchTerms.length,
+          searchTerm,
+          pageNo,
+          resultsOnPage: mappedOnPage,
+          totalCollected: termCollected,
+        });
+
+        if (batch.length < size) break;
+        pageNo += 1;
+      }
+
+      emitProgress({
+        event: "term_complete",
+        termIndex,
+        termTotal: searchTerms.length,
+        searchTerm,
+        jobsFoundTerm: termCollected,
+      });
+    }
+  } finally {
+    await browser.close();
+  }
+
+  await mkdir(dirname(outputPath), { recursive: true });
+  await writeFile(outputPath, `${JSON.stringify(allJobs, null, 2)}\n`, "utf-8");
+
+  console.log(`Hiring Cafe extractor wrote ${allJobs.length} jobs`);
+}
+
+run().catch((error: unknown) => {
+  const message = error instanceof Error ? error.message : "Unknown error";
+  console.error(`Hiring Cafe extractor failed: ${message}`);
+  process.exitCode = 1;
+});
--- a/extractors/hiringcafe/tsconfig.json
+++ b/extractors/hiringcafe/tsconfig.json
@ -0,0 +1,13 @@
+{
+  "compilerOptions": {
+    "module": "NodeNext",
+    "moduleResolution": "NodeNext",
+    "target": "ES2022",
+    "outDir": "dist",
+    "strict": true,
+    "noUnusedLocals": false,
+    "lib": ["ES2022", "DOM"],
+    "types": ["node"]
+  },
+  "include": ["./src/**/*"]
+}
--- a/orchestrator/src/client/components/PipelineProgress.tsx
+++ b/orchestrator/src/client/components/PipelineProgress.tsx
@ -24,7 +24,13 @@ interface PipelineProgress {
    | "failed";
  message: string;
  detail?: string;
-  crawlingSource: "gradcracker" | "jobspy" | "ukvisajobs" | "adzuna" | null;
+  crawlingSource:
+    | "gradcracker"
+    | "jobspy"
+    | "ukvisajobs"
+    | "adzuna"
+    | "hiringcafe"
+    | null;
  crawlingSourcesCompleted: number;
  crawlingSourcesTotal: number;
  crawlingTermsProcessed: number;
@ -85,6 +91,7 @@ const sourceLabel: Record<
  jobspy: "JobSpy",
  ukvisajobs: "UKVisaJobs",
  adzuna: "Adzuna",
+  hiringcafe: "Hiring Cafe",
 };

 const clamp = (value: number, min: number, max: number) =>
--- a/orchestrator/src/client/pages/orchestrator/automatic-run.test.ts
+++ b/orchestrator/src/client/pages/orchestrator/automatic-run.test.ts
@ -92,4 +92,20 @@ describe("automatic-run utilities", () => {
    expect(estimate.discovered.cap).toBeGreaterThan(0);
    expect(estimate.discovered.cap).toBeLessThanOrEqual(120);
  });
+
+  it("includes hiringcafe in estimate caps using the shared term budget", () => {
+    const estimate = calculateAutomaticEstimate({
+      values: {
+        topN: 10,
+        minSuitabilityScore: 50,
+        searchTerms: ["backend", "platform"],
+        runBudget: 120,
+        country: "united kingdom",
+      },
+      sources: ["hiringcafe"],
+    });
+
+    expect(estimate.discovered.cap).toBeGreaterThan(0);
+    expect(estimate.discovered.cap).toBeLessThanOrEqual(120);
+  });
 });
--- a/orchestrator/src/client/pages/orchestrator/automatic-run.ts
+++ b/orchestrator/src/client/pages/orchestrator/automatic-run.ts
@ -77,6 +77,7 @@ export function deriveExtractorLimits(args: {
  const includesGradcracker = args.sources.includes("gradcracker");
  const includesUkVisaJobs = args.sources.includes("ukvisajobs");
  const includesAdzuna = args.sources.includes("adzuna");
+  const includesHiringCafe = args.sources.includes("hiringcafe");

  const weightedContributors =
    (includesIndeed ? termCount : 0) +
@ -84,7 +85,8 @@ export function deriveExtractorLimits(args: {
    (includesGlassdoor ? termCount : 0) +
    (includesGradcracker ? termCount : 0) +
    (includesUkVisaJobs ? 1 : 0) +
-    (includesAdzuna ? termCount : 0);
+    (includesAdzuna ? termCount : 0) +
+    (includesHiringCafe ? termCount : 0);

  if (weightedContributors <= 0) {
    return {
@ -143,6 +145,7 @@ export function calculateAutomaticEstimate(args: {
  const hasLinkedIn = sources.includes("linkedin");
  const hasGlassdoor = sources.includes("glassdoor");
  const hasAdzuna = sources.includes("adzuna");
+  const hasHiringCafe = sources.includes("hiringcafe");
  const limits = deriveExtractorLimits({
    budget: values.runBudget,
    searchTerms: values.searchTerms,
@ -158,8 +161,12 @@ export function calculateAutomaticEstimate(args: {
    : 0;
  const ukvisaCap = hasUkVisaJobs ? limits.ukvisajobsMaxJobs : 0;
  const adzunaCap = hasAdzuna ? limits.adzunaMaxJobsPerTerm * termCount : 0;
+  const hiringCafeCap = hasHiringCafe
+    ? limits.jobspyResultsWanted * termCount
+    : 0;

-  const discoveredCap = jobspyCap + gradcrackerCap + ukvisaCap + adzunaCap;
+  const discoveredCap =
+    jobspyCap + gradcrackerCap + ukvisaCap + adzunaCap + hiringCafeCap;
  const discoveredMin = Math.round(discoveredCap * 0.35);
  const discoveredMax = Math.round(discoveredCap * 0.75);
  const processedMin = Math.min(values.topN, discoveredMin);
--- a/orchestrator/src/client/pages/orchestrator/constants.ts
+++ b/orchestrator/src/client/pages/orchestrator/constants.ts
@ -14,6 +14,7 @@ export const orderedSources: JobSource[] = [
  "linkedin",
  "glassdoor",
  "adzuna",
+  "hiringcafe",
  "ukvisajobs",
 ];
 export const orderedFilterSources: JobSource[] = [...orderedSources, "manual"];
--- a/orchestrator/src/client/pages/orchestrator/utils.ts
+++ b/orchestrator/src/client/pages/orchestrator/utils.ts
@ -168,7 +168,8 @@ export const getSourcesWithJobs = (jobs: JobListItem[]): JobSource[] => {
 export const getEnabledSources = (
  settings: AppSettings | null,
 ): JobSource[] => {
-  if (!settings) return [...DEFAULT_PIPELINE_SOURCES, "glassdoor"];
+  if (!settings)
+    return [...DEFAULT_PIPELINE_SOURCES, "glassdoor", "hiringcafe"];

  const enabled: JobSource[] = [];
  const hasUkVisaJobsAuth = Boolean(
@ -191,6 +192,10 @@ export const getEnabledSources = (
      if (hasAdzunaAuth) enabled.push(source);
      continue;
    }
+    if (source === "hiringcafe") {
+      enabled.push(source);
+      continue;
+    }
    if (
      source === "indeed" ||
      source === "linkedin" ||
--- a/orchestrator/src/lib/utils.ts
+++ b/orchestrator/src/lib/utils.ts
@ -144,5 +144,6 @@ export const sourceLabel: Record<Job["source"], string> = {
  glassdoor: "Glassdoor",
  ukvisajobs: "UK Visa Jobs",
  adzuna: "Adzuna",
+  hiringcafe: "Hiring Cafe",
  manual: "Manual",
 };
--- a/orchestrator/src/server/api/routes/pipeline.ts
+++ b/orchestrator/src/server/api/routes/pipeline.ts
@ -101,6 +101,7 @@ const runPipelineSchema = z.object({
        "glassdoor",
        "ukvisajobs",
        "adzuna",
+        "hiringcafe",
      ]),
    )
    .min(1)
--- a/orchestrator/src/server/config/demo-defaults.data.ts
+++ b/orchestrator/src/server/config/demo-defaults.data.ts
@ -253,6 +253,7 @@ export const DEMO_SOURCE_BASE_URLS: Record<JobSource, string> = {
  gradcracker: "https://www.gradcracker.com",
  ukvisajobs: "https://www.ukvisajobs.com",
  adzuna: "https://www.adzuna.com",
+  hiringcafe: "https://hiring.cafe",
  manual: "https://example.com",
 };

--- a/orchestrator/src/server/db/schema.ts
+++ b/orchestrator/src/server/db/schema.ts
@ -40,6 +40,7 @@ export const jobs = sqliteTable("jobs", {
      "glassdoor",
      "ukvisajobs",
      "adzuna",
+      "hiringcafe",
      "manual",
    ],
  })
--- a/orchestrator/src/server/pipeline/progress.ts
+++ b/orchestrator/src/server/pipeline/progress.ts
@ -14,7 +14,12 @@ export type PipelineStep =
  | "cancelled"
  | "failed";

-export type CrawlSource = "gradcracker" | "jobspy" | "ukvisajobs" | "adzuna";
+export type CrawlSource =
+  | "gradcracker"
+  | "jobspy"
+  | "ukvisajobs"
+  | "adzuna"
+  | "hiringcafe";

 export interface PipelineProgress {
  step: PipelineStep;
--- a/orchestrator/src/server/pipeline/steps/discover-jobs.test.ts
+++ b/orchestrator/src/server/pipeline/steps/discover-jobs.test.ts
@ -23,6 +23,10 @@ vi.mock("../../services/adzuna", () => ({
  runAdzuna: vi.fn(),
 }));

+vi.mock("../../services/hiring-cafe", () => ({
+  runHiringCafe: vi.fn(),
+}));
+
 vi.mock("../../services/ukvisajobs", () => ({
  runUkVisaJobs: vi.fn(),
 }));
@ -218,6 +222,126 @@ describe("discoverJobsStep", () => {
    expect(vi.mocked(adzuna.runAdzuna)).not.toHaveBeenCalled();
  });

+  it("runs hiringcafe when selected and passes country/terms/cap", async () => {
+    const settingsRepo = await import("../../repositories/settings");
+    const hiringCafe = await import("../../services/hiring-cafe");
+
+    vi.mocked(settingsRepo.getAllSettings).mockResolvedValue({
+      searchTerms: JSON.stringify(["engineer"]),
+      jobspyCountryIndeed: "united states",
+      jobspyResultsWanted: "25",
+    } as any);
+
+    vi.mocked(hiringCafe.runHiringCafe).mockResolvedValue({
+      success: true,
+      jobs: [
+        {
+          source: "hiringcafe",
+          sourceJobId: "hc-1",
+          title: "Engineer",
+          employer: "ACME",
+          jobUrl: "https://example.com/hc",
+          applicationLink: "https://example.com/hc",
+        },
+      ],
+    } as any);
+
+    const result = await discoverJobsStep({
+      mergedConfig: {
+        ...config,
+        sources: ["hiringcafe"],
+      },
+    });
+
+    expect(result.discoveredJobs).toHaveLength(1);
+    expect(vi.mocked(hiringCafe.runHiringCafe)).toHaveBeenCalledWith(
+      expect.objectContaining({
+        country: "united states",
+        searchTerms: ["engineer"],
+        maxJobsPerTerm: 25,
+      }),
+    );
+  });
+
+  it("updates Hiring Cafe terms and pages via progress callbacks", async () => {
+    const settingsRepo = await import("../../repositories/settings");
+    const hiringCafe = await import("../../services/hiring-cafe");
+
+    vi.mocked(settingsRepo.getAllSettings).mockResolvedValue({
+      searchTerms: JSON.stringify(["engineer", "frontend"]),
+      jobspyCountryIndeed: "united kingdom",
+      jobspyResultsWanted: "50",
+    } as any);
+
+    vi.mocked(hiringCafe.runHiringCafe).mockImplementation(
+      async (options: any) => {
+        options?.onProgress?.({
+          type: "term_start",
+          termIndex: 1,
+          termTotal: 2,
+          searchTerm: "engineer",
+        });
+        options?.onProgress?.({
+          type: "page_fetched",
+          termIndex: 1,
+          termTotal: 2,
+          searchTerm: "engineer",
+          pageNo: 0,
+          resultsOnPage: 10,
+          totalCollected: 10,
+        });
+        options?.onProgress?.({
+          type: "term_complete",
+          termIndex: 1,
+          termTotal: 2,
+          searchTerm: "engineer",
+          jobsFoundTerm: 10,
+        });
+        return { success: true, jobs: [] } as any;
+      },
+    );
+
+    await discoverJobsStep({
+      mergedConfig: {
+        ...config,
+        sources: ["hiringcafe"],
+      },
+    });
+
+    const progress = getProgress();
+    expect(progress.crawlingTermsProcessed).toBe(1);
+    expect(progress.crawlingTermsTotal).toBe(2);
+    expect(progress.crawlingListPagesProcessed).toBe(1);
+    expect(progress.crawlingJobPagesEnqueued).toBe(10);
+    expect(progress.crawlingJobPagesProcessed).toBe(10);
+  });
+
+  it("returns Hiring Cafe source error when extractor fails", async () => {
+    const settingsRepo = await import("../../repositories/settings");
+    const hiringCafe = await import("../../services/hiring-cafe");
+
+    vi.mocked(settingsRepo.getAllSettings).mockResolvedValue({
+      searchTerms: JSON.stringify(["engineer"]),
+      jobspyCountryIndeed: "united kingdom",
+      jobspyResultsWanted: "50",
+    } as any);
+
+    vi.mocked(hiringCafe.runHiringCafe).mockResolvedValue({
+      success: false,
+      jobs: [],
+      error: "blocked upstream",
+    } as any);
+
+    await expect(
+      discoverJobsStep({
+        mergedConfig: {
+          ...config,
+          sources: ["hiringcafe"],
+        },
+      }),
+    ).rejects.toThrow("All sources failed: hiringcafe: blocked upstream");
+  });
+
  it("maps Gradcracker progress callback into live crawling counters", async () => {
    const settingsRepo = await import("../../repositories/settings");
    const crawler = await import("../../services/crawler");
@ -402,6 +526,7 @@ describe("discoverJobsStep", () => {
  it("does not throw when no sources are requested", async () => {
    const settingsRepo = await import("../../repositories/settings");
    const adzuna = await import("../../services/adzuna");
+    const hiringCafe = await import("../../services/hiring-cafe");
    const jobSpy = await import("../../services/jobspy");
    const crawler = await import("../../services/crawler");
    const ukVisa = await import("../../services/ukvisajobs");
@ -422,6 +547,7 @@ describe("discoverJobsStep", () => {
    expect(result.sourceErrors).toEqual([]);
    expect(vi.mocked(jobSpy.runJobSpy)).not.toHaveBeenCalled();
    expect(vi.mocked(adzuna.runAdzuna)).not.toHaveBeenCalled();
+    expect(vi.mocked(hiringCafe.runHiringCafe)).not.toHaveBeenCalled();
    expect(vi.mocked(crawler.runCrawler)).not.toHaveBeenCalled();
    expect(vi.mocked(ukVisa.runUkVisaJobs)).not.toHaveBeenCalled();
  });
--- a/orchestrator/src/server/pipeline/steps/discover-jobs.ts
+++ b/orchestrator/src/server/pipeline/steps/discover-jobs.ts
@ -10,6 +10,7 @@ import * as jobsRepo from "../../repositories/jobs";
 import * as settingsRepo from "../../repositories/settings";
 import { runAdzuna } from "../../services/adzuna";
 import { runCrawler } from "../../services/crawler";
+import { runHiringCafe } from "../../services/hiring-cafe";
 import { runJobSpy } from "../../services/jobspy";
 import { runUkVisaJobs } from "../../services/ukvisajobs";
 import { progressHelpers, updateProgress } from "../progress";
@ -75,12 +76,14 @@ export async function discoverJobsStep(args: {

  const shouldRunJobSpy = jobSpySites.length > 0;
  const shouldRunAdzuna = compatibleSources.includes("adzuna");
+  const shouldRunHiringCafe = compatibleSources.includes("hiringcafe");
  const shouldRunGradcracker = compatibleSources.includes("gradcracker");
  const shouldRunUkVisaJobs = compatibleSources.includes("ukvisajobs");

  const totalSources =
    Number(shouldRunJobSpy) +
    Number(shouldRunAdzuna) +
+    Number(shouldRunHiringCafe) +
    Number(shouldRunGradcracker) +
    Number(shouldRunUkVisaJobs);
  let completedSources = 0;
@ -236,6 +239,84 @@ export async function discoverJobsStep(args: {
    return { discoveredJobs, sourceErrors };
  }

+  if (shouldRunHiringCafe) {
+    progressHelpers.startSource("hiringcafe", completedSources, totalSources, {
+      termsTotal: searchTerms.length,
+      detail: "Hiring Cafe: fetching jobs...",
+    });
+
+    const hiringCafeMaxJobsPerTerm = settings.jobspyResultsWanted
+      ? parseInt(settings.jobspyResultsWanted, 10)
+      : 200;
+
+    const hiringCafeResult = await runHiringCafe({
+      country: selectedCountry,
+      searchTerms,
+      maxJobsPerTerm: hiringCafeMaxJobsPerTerm,
+      onProgress: (event) => {
+        if (event.type === "term_start") {
+          progressHelpers.crawlingUpdate({
+            source: "hiringcafe",
+            termsProcessed: Math.max(event.termIndex - 1, 0),
+            termsTotal: event.termTotal,
+            phase: "list",
+            currentUrl: event.searchTerm,
+          });
+          updateProgress({
+            step: "crawling",
+            detail: `Hiring Cafe: term ${event.termIndex}/${event.termTotal} (${event.searchTerm})`,
+          });
+          return;
+        }
+
+        if (event.type === "page_fetched") {
+          const displayPageNo = event.pageNo + 1;
+          progressHelpers.crawlingUpdate({
+            source: "hiringcafe",
+            termsProcessed: Math.max(event.termIndex - 1, 0),
+            termsTotal: event.termTotal,
+            listPagesProcessed: displayPageNo,
+            jobPagesEnqueued: event.totalCollected,
+            jobPagesProcessed: event.totalCollected,
+            phase: "list",
+            currentUrl: `page ${displayPageNo}`,
+          });
+          updateProgress({
+            step: "crawling",
+            detail: `Hiring Cafe: term ${event.termIndex}/${event.termTotal}, page ${displayPageNo} (${event.totalCollected} collected)`,
+          });
+          return;
+        }
+
+        progressHelpers.crawlingUpdate({
+          source: "hiringcafe",
+          termsProcessed: event.termIndex,
+          termsTotal: event.termTotal,
+          phase: "list",
+          currentUrl: event.searchTerm,
+        });
+        updateProgress({
+          step: "crawling",
+          detail: `Hiring Cafe: completed term ${event.termIndex}/${event.termTotal} (${event.searchTerm})`,
+        });
+      },
+    });
+
+    if (!hiringCafeResult.success) {
+      sourceErrors.push(
+        `hiringcafe: ${hiringCafeResult.error ?? "unknown error"}`,
+      );
+    } else {
+      discoveredJobs.push(...hiringCafeResult.jobs);
+    }
+
+    markSourceComplete();
+  }
+
+  if (args.shouldCancel?.()) {
+    return { discoveredJobs, sourceErrors };
+  }
+
  if (shouldRunGradcracker) {
    progressHelpers.startSource("gradcracker", completedSources, totalSources, {
      detail: "Gradcracker: scraping...",
--- a/orchestrator/src/server/services/hiring-cafe.ts
+++ b/orchestrator/src/server/services/hiring-cafe.ts
@ -0,0 +1,270 @@
+import { spawn, spawnSync } from "node:child_process";
+import { mkdir, readFile, rm } from "node:fs/promises";
+import { createRequire } from "node:module";
+import { dirname, join } from "node:path";
+import { createInterface } from "node:readline";
+import { fileURLToPath } from "node:url";
+import { logger } from "@infra/logger";
+import { sanitizeUnknown } from "@infra/sanitize";
+import type { CreateJobInput } from "@shared/types";
+import { toNumberOrNull, toStringOrNull } from "@shared/utils/type-conversion";
+
+const __dirname = dirname(fileURLToPath(import.meta.url));
+const HIRING_CAFE_DIR = join(__dirname, "../../../../extractors/hiringcafe");
+const DATASET_PATH = join(
+  HIRING_CAFE_DIR,
+  "storage/datasets/default/jobs.json",
+);
+const STORAGE_DATASET_DIR = join(HIRING_CAFE_DIR, "storage/datasets/default");
+const JOBOPS_PROGRESS_PREFIX = "JOBOPS_PROGRESS ";
+
+const require = createRequire(import.meta.url);
+const TSX_CLI_PATH = resolveTsxCliPath();
+
+type HiringCafeRawJob = Record<string, unknown>;
+
+export type HiringCafeProgressEvent =
+  | {
+      type: "term_start";
+      termIndex: number;
+      termTotal: number;
+      searchTerm: string;
+    }
+  | {
+      type: "page_fetched";
+      termIndex: number;
+      termTotal: number;
+      searchTerm: string;
+      pageNo: number;
+      resultsOnPage: number;
+      totalCollected: number;
+    }
+  | {
+      type: "term_complete";
+      termIndex: number;
+      termTotal: number;
+      searchTerm: string;
+      jobsFoundTerm: number;
+    };
+
+export interface RunHiringCafeOptions {
+  searchTerms?: string[];
+  country?: string;
+  maxJobsPerTerm?: number;
+  onProgress?: (event: HiringCafeProgressEvent) => void;
+}
+
+export interface HiringCafeResult {
+  success: boolean;
+  jobs: CreateJobInput[];
+  error?: string;
+}
+
+function resolveTsxCliPath(): string | null {
+  try {
+    return require.resolve("tsx/dist/cli.mjs");
+  } catch {
+    return null;
+  }
+}
+
+function canRunNpmCommand(): boolean {
+  const result = spawnSync("npm", ["--version"], { stdio: "ignore" });
+  return !result.error && result.status === 0;
+}
+
+function parseProgressLine(line: string): HiringCafeProgressEvent | null {
+  if (!line.startsWith(JOBOPS_PROGRESS_PREFIX)) return null;
+
+  const raw = line.slice(JOBOPS_PROGRESS_PREFIX.length).trim();
+
+  let parsed: Record<string, unknown>;
+  try {
+    parsed = JSON.parse(raw) as Record<string, unknown>;
+  } catch {
+    return null;
+  }
+
+  const event = toStringOrNull(parsed.event);
+  const termIndex = toNumberOrNull(parsed.termIndex);
+  const termTotal = toNumberOrNull(parsed.termTotal);
+  const searchTerm = toStringOrNull(parsed.searchTerm) ?? "";
+
+  if (!event || termIndex === null || termTotal === null) {
+    return null;
+  }
+
+  if (event === "term_start") {
+    return { type: "term_start", termIndex, termTotal, searchTerm };
+  }
+
+  if (event === "page_fetched") {
+    const pageNo = toNumberOrNull(parsed.pageNo);
+    if (pageNo === null) return null;
+
+    return {
+      type: "page_fetched",
+      termIndex,
+      termTotal,
+      searchTerm,
+      pageNo,
+      resultsOnPage: toNumberOrNull(parsed.resultsOnPage) ?? 0,
+      totalCollected: toNumberOrNull(parsed.totalCollected) ?? 0,
+    };
+  }
+
+  if (event === "term_complete") {
+    return {
+      type: "term_complete",
+      termIndex,
+      termTotal,
+      searchTerm,
+      jobsFoundTerm: toNumberOrNull(parsed.jobsFoundTerm) ?? 0,
+    };
+  }
+
+  return null;
+}
+
+function mapHiringCafeRow(row: HiringCafeRawJob): CreateJobInput | null {
+  const jobUrl = toStringOrNull(row.jobUrl);
+  if (!jobUrl) return null;
+
+  return {
+    source: "hiringcafe",
+    sourceJobId: toStringOrNull(row.sourceJobId) ?? undefined,
+    title: toStringOrNull(row.title) ?? "Unknown Title",
+    employer: toStringOrNull(row.employer) ?? "Unknown Employer",
+    jobUrl,
+    applicationLink: toStringOrNull(row.applicationLink) ?? jobUrl,
+    location: toStringOrNull(row.location) ?? undefined,
+    salary: toStringOrNull(row.salary) ?? undefined,
+    datePosted: toStringOrNull(row.datePosted) ?? undefined,
+    jobDescription: toStringOrNull(row.jobDescription) ?? undefined,
+    jobType: toStringOrNull(row.jobType) ?? undefined,
+  };
+}
+
+async function readDataset(): Promise<CreateJobInput[]> {
+  const content = await readFile(DATASET_PATH, "utf-8");
+  const parsed = JSON.parse(content) as unknown;
+  if (!Array.isArray(parsed)) return [];
+
+  const jobs: CreateJobInput[] = [];
+  const seen = new Set<string>();
+
+  for (const value of parsed) {
+    if (!value || typeof value !== "object" || Array.isArray(value)) continue;
+
+    const mapped = mapHiringCafeRow(value as HiringCafeRawJob);
+    if (!mapped) continue;
+
+    const dedupeKey = mapped.sourceJobId || mapped.jobUrl;
+    if (seen.has(dedupeKey)) continue;
+
+    seen.add(dedupeKey);
+    jobs.push(mapped);
+  }
+
+  return jobs;
+}
+
+async function clearStorageDataset(): Promise<void> {
+  await rm(STORAGE_DATASET_DIR, { recursive: true, force: true });
+  await mkdir(STORAGE_DATASET_DIR, { recursive: true });
+}
+
+export async function runHiringCafe(
+  options: RunHiringCafeOptions = {},
+): Promise<HiringCafeResult> {
+  const searchTerms =
+    options.searchTerms && options.searchTerms.length > 0
+      ? options.searchTerms
+      : ["web developer"];
+  const country = (options.country || "united kingdom").trim().toLowerCase();
+  const maxJobsPerTerm = options.maxJobsPerTerm ?? 200;
+
+  const useNpmCommand = canRunNpmCommand();
+  if (!useNpmCommand && !TSX_CLI_PATH) {
+    return {
+      success: false,
+      jobs: [],
+      error: "Unable to execute Hiring Cafe extractor (npm/tsx unavailable)",
+    };
+  }
+
+  try {
+    await clearStorageDataset();
+
+    await new Promise<void>((resolve, reject) => {
+      const extractorEnv = {
+        ...process.env,
+        JOBOPS_EMIT_PROGRESS: "1",
+        HIRING_CAFE_SEARCH_TERMS: JSON.stringify(searchTerms),
+        HIRING_CAFE_COUNTRY: country,
+        HIRING_CAFE_MAX_JOBS_PER_TERM: String(maxJobsPerTerm),
+        HIRING_CAFE_OUTPUT_JSON: DATASET_PATH,
+      };
+
+      const child = useNpmCommand
+        ? spawn("npm", ["run", "start"], {
+            cwd: HIRING_CAFE_DIR,
+            stdio: ["ignore", "pipe", "pipe"],
+            env: extractorEnv,
+          })
+        : (() => {
+            const tsxCliPath = TSX_CLI_PATH;
+            if (!tsxCliPath) {
+              throw new Error(
+                "Unable to execute Hiring Cafe extractor (npm/tsx unavailable)",
+              );
+            }
+
+            return spawn(process.execPath, [tsxCliPath, "src/main.ts"], {
+              cwd: HIRING_CAFE_DIR,
+              stdio: ["ignore", "pipe", "pipe"],
+              env: extractorEnv,
+            });
+          })();
+
+      const handleLine = (line: string, stream: NodeJS.WriteStream) => {
+        const progressEvent = parseProgressLine(line);
+        if (progressEvent) {
+          options.onProgress?.(progressEvent);
+          return;
+        }
+
+        stream.write(`${line}\n`);
+      };
+
+      const stdoutRl = child.stdout
+        ? createInterface({ input: child.stdout })
+        : null;
+      const stderrRl = child.stderr
+        ? createInterface({ input: child.stderr })
+        : null;
+
+      stdoutRl?.on("line", (line) => handleLine(line, process.stdout));
+      stderrRl?.on("line", (line) => handleLine(line, process.stderr));
+
+      child.on("close", (code) => {
+        stdoutRl?.close();
+        stderrRl?.close();
+        if (code === 0) resolve();
+        else
+          reject(new Error(`Hiring Cafe extractor exited with code ${code}`));
+      });
+      child.on("error", reject);
+    });
+
+    const jobs = await readDataset();
+    return { success: true, jobs };
+  } catch (error) {
+    const message = error instanceof Error ? error.message : "Unknown error";
+    logger.warn("Hiring Cafe extractor run failed", {
+      error: message,
+      details: sanitizeUnknown(error),
+    });
+    return { success: false, jobs: [], error: message };
+  }
+}
--- a/package-lock.json
+++ b/package-lock.json
@ -153,6 +153,33 @@
        "undici-types": "~7.16.0"
      }
    },
+    "extractors/hiringcafe": {
+      "name": "hiringcafe-extractor",
+      "version": "0.0.1",
+      "dependencies": {
+        "camoufox-js": "^0.8.0",
+        "job-ops-shared": "^1.0.0",
+        "playwright": "^1.57.0",
+        "tsx": "^4.4.0"
+      },
+      "devDependencies": {
+        "@types/node": "^24.0.0",
+        "typescript": "~5.9.0"
+      },
+      "optionalDependencies": {
+        "impit-linux-x64-gnu": "^0.1.0"
+      }
+    },
+    "extractors/hiringcafe/node_modules/@types/node": {
+      "version": "24.10.13",
+      "resolved": "https://registry.npmjs.org/@types/node/-/node-24.10.13.tgz",
+      "integrity": "sha512-oH72nZRfDv9lADUBSo104Aq7gPHpQZc4BTx38r9xf9pg5LfP6EzSyH2n7qFmmxRQXh7YlUXODcYsg6PuTDSxGg==",
+      "dev": true,
+      "license": "MIT",
+      "dependencies": {
+        "undici-types": "~7.16.0"
+      }
+    },
    "extractors/ukvisajobs": {
      "name": "ukvisajobs-extractor",
      "version": "0.0.1",
@ -13175,6 +13202,10 @@
        "node": ">=16.0.0"
      }
    },
+    "node_modules/hiringcafe-extractor": {
+      "resolved": "extractors/hiringcafe",
+      "link": true
+    },
    "node_modules/history": {
      "version": "4.10.1",
      "resolved": "https://registry.npmjs.org/history/-/history-4.10.1.tgz",
--- a/shared/src/types.ts
+++ b/shared/src/types.ts
@ -126,6 +126,7 @@ export type JobSource =
  | "glassdoor"
  | "ukvisajobs"
  | "adzuna"
+  | "hiringcafe"
  | "manual";

 export interface Job {