UK Visa Jobs Extractor

Fetches job listings from my.ukvisajobs.com for roles that may offer work visa sponsorship.

Setup

npm install

If the Playwright browser download was skipped in your environment, install Firefox manually:

npx playwright install firefox

If Camoufox assets are missing, fetch them:

npx camoufox-js fetch

Configuration

Set the following environment variables:

Variable                    Description
UKVISAJOBS_EMAIL            Login email for automatic token refresh
UKVISAJOBS_PASSWORD         Login password for automatic token refresh
UKVISAJOBS_HEADLESS         Set to false to show the browser (default: true)
UKVISAJOBS_MAX_JOBS         Maximum jobs to fetch (default: 50, max: 200)
UKVISAJOBS_SEARCH_KEYWORD   Optional search filter
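
As a rough illustration of how the variables above might be interpreted, here is a minimal sketch. The `ExtractorConfig` type and `loadConfig` helper are hypothetical names, not taken from the extractor's source, and clamping an oversized UKVISAJOBS_MAX_JOBS to 200 (rather than rejecting it) is an assumption.

```typescript
// Hypothetical config shape; field names are illustrative only.
interface ExtractorConfig {
  email?: string;
  password?: string;
  headless: boolean;
  maxJobs: number;
  searchKeyword?: string;
}

const DEFAULT_MAX_JOBS = 50; // documented default
const MAX_JOBS_LIMIT = 200; // documented upper bound

// Reads the documented variables from an environment-like object.
function loadConfig(env: Record<string, string | undefined>): ExtractorConfig {
  const parsed = Number.parseInt(env.UKVISAJOBS_MAX_JOBS ?? "", 10);
  // Fall back to the default when the value is missing or not a positive integer.
  const requested =
    Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_MAX_JOBS;

  return {
    email: env.UKVISAJOBS_EMAIL,
    password: env.UKVISAJOBS_PASSWORD,
    // Headless unless the variable is explicitly "false".
    headless: env.UKVISAJOBS_HEADLESS?.toLowerCase() !== "false",
    // Assumption: values above the limit are clamped, not rejected.
    maxJobs: Math.min(requested, MAX_JOBS_LIMIT),
    searchKeyword: env.UKVISAJOBS_SEARCH_KEYWORD || undefined,
  };
}
```

In a Node.js process this would typically be called as `loadConfig(process.env)`.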

Automatic login & cache

The extractor will:

  1. Launch a Camoufox (Playwright Firefox) browser and sign in
  2. Navigate to the open jobs page and capture the token/cookies
  3. Cache the session to storage/ukvisajobs-auth.json
  4. Reuse the cached values until the API reports an expired token, then refresh
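
The cache-and-refresh flow in the steps above can be sketched roughly as follows. This is not the extractor's actual implementation: the `CachedSession` shape, the `TokenExpiredError` class, and the `withSession` helper are all hypothetical, and the browser login is represented by a `login` callback supplied by the caller.

```typescript
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { dirname } from "node:path";

// Assumed shape of the cached session; the real file layout may differ.
interface CachedSession {
  token: string;
  cookies: string[];
}

// Hypothetical error an API call throws when the token has expired.
class TokenExpiredError extends Error {}

function readCache(path: string): CachedSession | null {
  if (!existsSync(path)) return null;
  try {
    return JSON.parse(readFileSync(path, "utf8")) as CachedSession;
  } catch {
    return null; // treat an unreadable cache as missing
  }
}

function writeCache(path: string, session: CachedSession): void {
  mkdirSync(dirname(path), { recursive: true });
  writeFileSync(path, JSON.stringify(session, null, 2));
}

// Runs `call` with a cached session, logging in once if the cache is
// missing and once more if the API reports an expired token.
async function withSession<T>(
  cachePath: string,
  login: () => Promise<CachedSession>,
  call: (session: CachedSession) => Promise<T>,
): Promise<T> {
  let session = readCache(cachePath);
  if (!session) {
    session = await login();
    writeCache(cachePath, session);
  }
  try {
    return await call(session);
  } catch (err) {
    if (!(err instanceof TokenExpiredError)) throw err;
    // Token expired: refresh, update the cache, and retry a single time.
    session = await login();
    writeCache(cachePath, session);
    return call(session);
  }
}
```

Retrying only once keeps a persistently failing login from looping; a second expiry surfaces as an error to the caller.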

Running

npm start

Output is written to storage/datasets/default/ as JSON files.