Adds extractor packages: arbeitnow, ashby, careerjet, fourdayweek,
greenhouse, himalayas, jobicy, jooble, lever, reed, remoteok, remotive,
themuse, usajobs, weworkremotely, workday — each with manifest, package
metadata and README.
Pipeline / shared:
- shared/job-fingerprint: stable hash for cross-source dedup, with tests
- discover-jobs: dedup via fingerprint and richer per-source merging
- jobs repository: fingerprint-aware upsert / lookup
- settings-registry, settings types/routes, demo-defaults: knobs for the
new sources
- shared extractors index: register the new manifests
- location-support, profiles route: small fixes for the new sources
Tooling:
- scripts/smoke-extractors.ts to sanity-check each source locally
- scripts/jobber-cron-{cherepaha,dobkin}.env.example: per-host cron
templates (CHANGEME placeholders only)
- .env.example: documented env vars for the new extractors
- .gitignore: ignore extractors/*/storage/ runtime caches (was ukvisajobs only)
Co-authored-by: Cursor <cursoragent@cursor.com>
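The `shared/job-fingerprint` entry above describes a stable hash used for cross-source dedup. The following is only a minimal sketch of that idea, not the code in this commit: the `JobLike` shape, the choice of fields, the normalization rules, and the use of SHA-256 are all assumptions made for illustration.

```typescript
import { createHash } from "node:crypto";

// Hypothetical minimal job shape; the real repository type is not shown here.
interface JobLike {
  title: string;
  employer: string;
  location?: string;
}

// Lowercase and collapse punctuation/whitespace so cosmetic differences
// between sources do not change the hash (assumed normalization rules).
function normalize(value: string | undefined): string {
  return (value ?? "")
    .normalize("NFKD")
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, " ")
    .trim();
}

// Same normalized fields in, same hex digest out, regardless of source.
function jobFingerprint(job: JobLike): string {
  const key = [job.title, job.employer, job.location].map(normalize).join("|");
  return createHash("sha256").update(key).digest("hex");
}
```

With a fingerprint like this, a fingerprint-aware upsert can treat two postings from different sources as the same job when their digests match, at the cost of occasionally merging distinct roles that share a title, employer, and location.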
213 lines
11 KiB
Plaintext
# =============================================================================
# Job Ops - Environment Variables
# Copy this file to .env and fill in your values
# =============================================================================

MODEL=google/gemini-3-flash-preview

# Self-hosted Ollama (e.g. 16GB GPU): use a 22B-class model for scoring/tailoring; pull the tag on the server first.
# MODEL=mistral-small:22b
# LLM_PROVIDER=ollama
# LLM_BASE_URL=http://127.0.0.1:11434
# Heavier option (may offload layers to CPU on a 16GB GPU): qwen2.5:32b

# DEPRECATED (auto-copied to LLM_API_KEY for compatibility)
# OPENROUTER_API_KEY=your_openrouter_api_key_here

# Self-hosted RxResume base URL, e.g., http://rxresume.local.net
# Defaults to https://v4.rxresu.me
# RXRESUME_URL=

# Optional: load resume JSON from disk instead of the RxResume API (scoring, tailoring, cover letters).
# Path is absolute or relative to the orchestrator process cwd (often `orchestrator/` when using `npm run dev` there).
# Takes precedence over Settings → local path. PDF export still uses RxResume when enabled.
# Example (monorepo): hand-authored v5 JSON may live under `data/resumes/` (that folder is gitignored by default).
# If you use seeded search profiles with `resumeLocalPath` + login auto-activate, leave this unset so Settings → local path wins.
# JOBOPS_LOCAL_RESUME_PATH=../data/resumes/ilia-dobkin.json

# RxResume credentials for PDF generation
# Create an account at: https://v4.rxresu.me
RXRESUME_EMAIL=your_email@example.com
RXRESUME_PASSWORD=your_password_here

# Optional: Basic Auth for write access
# If unset (the default), the app is fully unauthenticated.
# When set, all write actions (POST/PATCH/DELETE) require Basic Auth.
# Optional second user (e.g. paired with a second search profile / `basicAuthUser` in profile JSON):
# BASIC_AUTH_USER_2=
# BASIC_AUTH_PASSWORD_2=
# Example local pairing with DB-seeded profiles (change passwords before exposing the UI):
# BASIC_AUTH_USER=ilia
# BASIC_AUTH_PASSWORD=changeme-ilia
# BASIC_AUTH_USER_2=cherepaha
# BASIC_AUTH_PASSWORD_2=changeme-cherepaha
BASIC_AUTH_USER=
BASIC_AUTH_PASSWORD=

# Optional (client build only): skip RxResume steps in the onboarding wizard (search without PDF export).
# Prefer setting `JOBOPS_LOCAL_RESUME_PATH` above: the API tells the UI to skip RxResume onboarding automatically.
# Otherwise: copy `orchestrator/.env.example` → `orchestrator/.env` and set VITE_SKIP_RXRESUME_ONBOARDING=true
# (Vite only reads `orchestrator/.env`, not this root file.)
# Docker: Vite vars must be set at image build time (Dockerfile ARG / docker-compose build args), not in the runtime .env.
# VITE_SKIP_RXRESUME_ONBOARDING=true

# Public base URL used to generate tracer links when PDFs are created by
# background/pipeline runs (where request host cannot be inferred).
# Example: JOBOPS_PUBLIC_BASE_URL=https://jobops.example.com
JOBOPS_PUBLIC_BASE_URL=

# =============================================================================
# Gmail OAuth (Tracking Inbox) - optional
# =============================================================================
# Required to connect Gmail from the UI.
GMAIL_OAUTH_CLIENT_ID=
GMAIL_OAUTH_CLIENT_SECRET=

# Optional override for OAuth callback URL.
# If unset, defaults to <request-origin>/oauth/gmail/callback
# GMAIL_OAUTH_REDIRECT_URI=http://localhost:3005/oauth/gmail/callback

# =============================================================================
# UKVisaJobs (UK visa sponsorship jobs) - optional
# =============================================================================
# Provide email/password for automatic login and token refresh.
# See extractors/ukvisajobs/README.md for detailed instructions.
UKVISAJOBS_EMAIL=
UKVISAJOBS_PASSWORD=
UKVISAJOBS_HEADLESS=true

# =============================================================================
# Adzuna (multi-country API source) - optional
# =============================================================================
# Register at https://developer.adzuna.com/admin/access_details
ADZUNA_APP_ID=
ADZUNA_APP_KEY=
# Default cap per search term (orchestrator run budget / settings can override).
# ADZUNA_MAX_JOBS_PER_TERM=50
# API page size (Adzuna max 50).
# ADZUNA_RESULTS_PER_PAGE=50
# Optional global `where` text for Adzuna. Pipeline runs usually use Settings → search cities
# instead; leave unset unless you want a fixed location for standalone extractor use.
# ADZUNA_LOCATION_QUERY=
# Only for running the extractor CLI alone; the pipeline sets country from your run (us / ca / gb / …).
# ADZUNA_COUNTRY=gb

# =============================================================================
# JobSpy - Job search configuration
# =============================================================================
# Filter for remote-only jobs (default: 0 = disabled)
# JOBSPY_IS_REMOTE=0

# =============================================================================
# USAJOBS API (US federal jobs) - optional, US-only
# =============================================================================
# Register at https://developer.usajobs.gov/APIRequest/Index
# USAJOBS requires a User-Agent that is a real contact email (per their TOS).
# Leave unset to disable the source.
# USAJOBS_API_KEY=
# USAJOBS_USER_AGENT=you@example.com
# USAJOBS_MAX_JOBS_PER_TERM=100

# =============================================================================
# Jobicy (remote jobs feed) - optional, no auth
# =============================================================================
# Public JSON endpoint, capped at 50 results per call.
# JOBICY_MAX_JOBS_PER_TERM=100

# =============================================================================
# The Muse (jobs API) - optional, API key recommended
# =============================================================================
# https://www.themuse.com/developers/api/v2 — works without a key but is
# heavily rate-limited. Set THEMUSE_API_KEY for higher quotas.
# THEMUSE_API_KEY=
# THEMUSE_MAX_JOBS_PER_TERM=100

# =============================================================================
# Jooble (aggregator API) - optional
# =============================================================================
# Sign up at https://jooble.org/api/about for an API key.
# JOOBLE_API_KEY=
# JOOBLE_MAX_JOBS_PER_TERM=100

# =============================================================================
# Careerjet (publisher API v4) - optional
# =============================================================================
# Register at https://www.careerjet.com/partners/api/ — declare API key + server IP(s).
# CAREERJET_AFFID=your_api_key
# CAREERJET_REFERER=https://your-site.com/path-to-job-search/
# CAREERJET_USER_IP=203.0.113.1
# Optional override for the required user_agent query param:
# CAREERJET_USER_AGENT=Mozilla/5.0 ...
# CAREERJET_MAX_JOBS_PER_TERM=100

# =============================================================================
# Reed.co.uk (UK jobs API) - optional, UK-only
# =============================================================================
# Register at https://www.reed.co.uk/developers/jobseeker for an API key.
# REED_API_KEY=
# REED_MAX_JOBS_PER_TERM=100

# =============================================================================
# Remote OK (remote jobs feed) - optional, no auth
# =============================================================================
# Public single-shot JSON feed at https://remoteok.com/api. We filter
# client-side by your search terms (matched against position + tags).
# Per Remote OK's TOS, link back to the original posting URLs when republishing.
# REMOTEOK_MAX_JOBS_PER_TERM=100

# =============================================================================
# Remotive (remote jobs feed) - optional, no auth
# =============================================================================
# Public JSON API at https://remotive.com/api/remote-jobs?limit=N&search=term.
# Each search term is sent as the `search` parameter.
# REMOTIVE_MAX_JOBS_PER_TERM=100

# =============================================================================
# Arbeitnow (multi-ATS aggregator) - optional, no auth
# =============================================================================
# Public JSON API at https://www.arbeitnow.com/api/job-board-api?page=N.
# Aggregates from Greenhouse, SmartRecruiters, Join, TeamTailor, Recruitee,
# and Comeet. No server-side search; filtering is done client-side.
# ARBEITNOW_MAX_JOBS_PER_TERM=100

# =============================================================================
# Himalayas (remote jobs feed) - optional, no auth
# =============================================================================
# Public JSON API at https://himalayas.app/jobs/api?limit=N&offset=M.
# No server-side search; filtering is done client-side by title + categories.
# HIMALAYAS_MAX_JOBS_PER_TERM=100

# =============================================================================
# We Work Remotely (RSS feed) - optional, no auth
# =============================================================================
# Public RSS at https://weworkremotely.com/remote-jobs.rss (all categories).
# Single fetch; filtering is done client-side by title + skills + category.
# WEWORKREMOTELY_MAX_JOBS_PER_TERM=100

# =============================================================================
# 4 Day Week (reduced-schedule jobs) - optional, no auth
# =============================================================================
# Public JSON API at https://4dayweek.io/api/jobs?page=N.
# Paginated; filtering is done client-side by title + tech stack.
# No job description in listings; links to 4dayweek.io for details.
# FOURDAYWEEK_MAX_JOBS_PER_TERM=100

# =============================================================================
# Public ATS sources (Lever / Ashby / Greenhouse) - optional
# =============================================================================
# Comma- or newline-separated company slugs. The slug is the path segment used
# in each provider's public job board, e.g. `lever.co/some-company` → "some-company".
# LEVER_COMPANIES=netflix,figma
# ASHBY_COMPANIES=ramp,linear
# GREENHOUSE_COMPANIES=stripe,airbnb

# =============================================================================
# Workday (public career sites) - optional
# =============================================================================
# Newline- or comma-separated entries. Each entry is either:
#   1) A career-site URL we'll auto-parse, e.g.
#      https://nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAExternalCareerSite
#   2) A JSON object with explicit fields:
#      {"company":"NVIDIA","tenantUrl":"https://nvidia.wd5.myworkdayjobs.com","tenant":"nvidia","site":"NVIDIAExternalCareerSite","locale":"en-US"}
# WORKDAY_TENANTS=
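
The Workday comment above allows two entry forms (a career-site URL or a JSON object) in a comma- or newline-separated list. Because a JSON object itself contains commas, a naive split on `,` would break it apart. The sketch below shows one way this could be handled; it is not the extractor's actual parser. The names `splitEntries` and `parseWorkdayEntry`, the `en-US` fallback locale, and the use of the tenant subdomain as the company name are assumptions made for illustration.

```typescript
// Hypothetical shape, mirroring the JSON fields shown in the .env comment.
interface WorkdayTenant {
  company: string;
  tenantUrl: string;
  tenant: string;
  site: string;
  locale: string;
}

// Split on commas/newlines, but never on commas inside a {...} object.
// Assumes braces do not appear inside JSON string values.
function splitEntries(raw: string): string[] {
  const entries: string[] = [];
  let depth = 0;
  let current = "";
  for (let i = 0; i < raw.length; i++) {
    const ch = raw.charAt(i);
    if (ch === "{") depth++;
    if (ch === "}") depth--;
    if ((ch === "," || ch === "\n") && depth === 0) {
      entries.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  entries.push(current);
  return entries.map((e) => e.trim()).filter((e) => e.length > 0);
}

// Accept either an explicit JSON object or a career-site URL.
function parseWorkdayEntry(entry: string): WorkdayTenant {
  if (entry.startsWith("{")) {
    return JSON.parse(entry) as WorkdayTenant;
  }
  const url = new URL(entry);
  const [first = "", second = ""] = url.pathname.split("/").filter(Boolean);
  // A leading segment such as "en-US" is treated as the locale (assumption).
  const hasLocale = /^[a-z]{2}-[A-Z]{2}$/.test(first);
  const locale = hasLocale ? first : "en-US";
  const site = hasLocale ? second : first;
  if (!site) throw new Error(`No site segment in ${entry}`);
  const [tenant = ""] = url.hostname.split(".");
  // Company name is unknown from the URL alone; fall back to the tenant.
  return { company: tenant, tenantUrl: url.origin, tenant, site, locale };
}
```

Under these assumptions, `splitEntries(process.env.WORKDAY_TENANTS ?? "").map(parseWorkdayEntry)` would yield one record per configured career site, and the same `splitEntries` helper could also split the plain slug lists used by `LEVER_COMPANIES`, `ASHBY_COMPANIES`, and `GREENHOUSE_COMPANIES`.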