Jobber/.env.example
ilia c840f289e1
feat(extractors): expand catalog, smoke coverage, and sourcing docs
Adds Arc.dev, BC T-Net, Eluta, iCIMS tenants, QAJobsBoard, and SmartRecruiters
manifests with registry/settings/UI wiring; registers full extractor list in
smoke-extractors and documents supplementary board access paths. Aligns Careerjet
v4 with the url query parameter and fixes strict typing in QAJobsBoard.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-15 22:36:23 -04:00

# =============================================================================
# Job Ops - Environment Variables
# Copy this file to .env and fill in your values
# =============================================================================
MODEL=google/gemini-3-flash-preview
# Self-hosted Ollama (e.g. 16GB GPU): use a 22B-class model for scoring/tailoring; pull the tag on the server first.
# MODEL=mistral-small:22b
# LLM_PROVIDER=ollama
# LLM_BASE_URL=http://127.0.0.1:11434
# Heavier option (may offload layers to CPU on a 16GB GPU): qwen2.5:32b
# DEPRECATED (auto-copied to LLM_API_KEY for compatibility)
# OPENROUTER_API_KEY=your_openrouter_api_key_here
# Self-hosted RxResume base URL, e.g., http://rxresume.local.net
# Defaults to https://v4.rxresu.me
# RXRESUME_URL=
# Optional: load resume JSON from disk instead of the RxResume API (scoring, tailoring, cover letters).
# Path is absolute or relative to the orchestrator process cwd (often `orchestrator/` when using `npm run dev` there).
# Takes precedence over Settings → local path. PDF export still uses RxResume when enabled.
# Example (monorepo): hand-authored v5 JSON may live under `data/resumes/` (that folder is gitignored by default).
# If you use seeded search profiles with `resumeLocalPath` + login auto-activate, leave this unset so Settings → local path wins.
# JOBOPS_LOCAL_RESUME_PATH=../data/resumes/ilia-dobkin.json
# RxResume credentials for PDF generation
# Create an account at: https://v4.rxresu.me
RXRESUME_EMAIL=your_email@example.com
RXRESUME_PASSWORD=your_password_here
# Optional: Basic Auth for write access.
# If unset (the default), the app is fully unauthenticated.
# When set, all write actions (POST/PATCH/DELETE) require Basic Auth.
# Optional second user (e.g. paired with a second search profile / `basicAuthUser` in profile JSON):
# BASIC_AUTH_USER_2=
# BASIC_AUTH_PASSWORD_2=
# Example local pairing with DB-seeded profiles (change passwords before exposing the UI):
# BASIC_AUTH_USER=ilia
# BASIC_AUTH_PASSWORD=changeme-ilia
# BASIC_AUTH_USER_2=cherepaha
# BASIC_AUTH_PASSWORD_2=changeme-cherepaha
BASIC_AUTH_USER=
BASIC_AUTH_PASSWORD=
# Optional: client build only — skip RxResume steps in the onboarding wizard (search without PDF export).
# Prefer setting `JOBOPS_LOCAL_RESUME_PATH` above: the API tells the UI to skip RxResume onboarding automatically.
# Otherwise: copy `orchestrator/.env.example` → `orchestrator/.env` and set VITE_SKIP_RXRESUME_ONBOARDING=true
# (Vite only reads `orchestrator/.env`, not this root file.)
# Docker: Vite vars must be set at image build time (Dockerfile ARG / docker-compose build args), not in the runtime .env.
# VITE_SKIP_RXRESUME_ONBOARDING=true
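# Sketch of passing the flag at image build time. Assumption: the Dockerfile declares
# `ARG VITE_SKIP_RXRESUME_ONBOARDING` before the client build step (check the repo's Dockerfile).
#   docker build --build-arg VITE_SKIP_RXRESUME_ONBOARDING=true .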
# Public base URL used to generate tracer links when PDFs are created by
# background/pipeline runs (where request host cannot be inferred).
# Example: JOBOPS_PUBLIC_BASE_URL=https://jobops.example.com
JOBOPS_PUBLIC_BASE_URL=
# =============================================================================
# Gmail OAuth (Tracking Inbox) - optional
# =============================================================================
# Required to connect Gmail from the UI.
GMAIL_OAUTH_CLIENT_ID=
GMAIL_OAUTH_CLIENT_SECRET=
# Optional override for OAuth callback URL.
# If unset, defaults to <request-origin>/oauth/gmail/callback
# GMAIL_OAUTH_REDIRECT_URI=http://localhost:3005/oauth/gmail/callback
# =============================================================================
# UKVisaJobs (UK visa sponsorship jobs) - optional
# =============================================================================
# Provide email/password for automatic login and token refresh.
# See extractors/ukvisajobs/README.md for detailed instructions.
UKVISAJOBS_EMAIL=
UKVISAJOBS_PASSWORD=
UKVISAJOBS_HEADLESS=true
# =============================================================================
# Adzuna (multi-country API source) - optional
# =============================================================================
# Register at https://developer.adzuna.com/admin/access_details
ADZUNA_APP_ID=
ADZUNA_APP_KEY=
# Default cap per search term (orchestrator run budget / settings can override).
# ADZUNA_MAX_JOBS_PER_TERM=50
# API page size (Adzuna max 50).
# ADZUNA_RESULTS_PER_PAGE=50
# Optional global `where` text for Adzuna. Pipeline runs usually use Settings → search cities
# instead; leave unset unless you want a fixed location for standalone extractor use.
# ADZUNA_LOCATION_QUERY=
# Only for running the extractor CLI alone; the pipeline sets country from your run (us / ca / gb / …).
# ADZUNA_COUNTRY=gb
# =============================================================================
# JobSpy - Job search configuration
# =============================================================================
# Filter for remote-only jobs (default: 0 = disabled)
# JOBSPY_IS_REMOTE=0
# =============================================================================
# USAJOBS API (US federal jobs) - optional, US-only
# =============================================================================
# Register at https://developer.usajobs.gov/APIRequest/Index
# USAJOBS requires a User-Agent that is a real contact email (per their TOS).
# Leave unset to disable the source.
# USAJOBS_API_KEY=
# USAJOBS_USER_AGENT=you@example.com
# USAJOBS_MAX_JOBS_PER_TERM=100
# =============================================================================
# Jobicy (remote jobs feed) - optional, no auth
# =============================================================================
# Public JSON endpoint, capped at 50 results per call.
# JOBICY_MAX_JOBS_PER_TERM=100
# =============================================================================
# The Muse (jobs API) - optional, API key recommended
# =============================================================================
# https://www.themuse.com/developers/api/v2 — works without a key but is
# heavily rate-limited. Set THEMUSE_API_KEY for higher quotas.
# THEMUSE_API_KEY=
# THEMUSE_MAX_JOBS_PER_TERM=100
# =============================================================================
# Jooble (aggregator API) - optional
# =============================================================================
# Sign up at https://jooble.org/api/about for an API key.
# JOOBLE_API_KEY=
# JOOBLE_MAX_JOBS_PER_TERM=100
# =============================================================================
# Careerjet (publisher API v4) - optional
# =============================================================================
# Register at https://www.careerjet.com/partners/api/ — declare API key + server IP(s).
# CAREERJET_AFFID=your_api_key
# CAREERJET_REFERER=https://your-site.com/path-to-job-search/
# CAREERJET_USER_IP=203.0.113.1
# Optional override for the required user_agent query param:
# CAREERJET_USER_AGENT=Mozilla/5.0 ...
# CAREERJET_MAX_JOBS_PER_TERM=100
# =============================================================================
# Reed.co.uk (UK jobs API) - optional, UK-only
# =============================================================================
# Register at https://www.reed.co.uk/developers/jobseeker for an API key.
# REED_API_KEY=
# REED_MAX_JOBS_PER_TERM=100
# =============================================================================
# Remote OK (remote jobs feed) - optional, no auth
# =============================================================================
# Public single-shot JSON feed at https://remoteok.com/api. We filter
# client-side by your search terms (matched against position + tags).
# Per Remote OK's TOS, link back to the original posting URLs when republishing.
# REMOTEOK_MAX_JOBS_PER_TERM=100
# =============================================================================
# Remotive (remote jobs feed) - optional, no auth
# =============================================================================
# Public JSON API at https://remotive.com/api/remote-jobs?limit=N&search=term.
# Each search term is sent as the `search` parameter.
# REMOTIVE_MAX_JOBS_PER_TERM=100
# =============================================================================
# Arbeitnow (multi-ATS aggregator) - optional, no auth
# =============================================================================
# Public JSON API at https://www.arbeitnow.com/api/job-board-api?page=N.
# Aggregates from Greenhouse, SmartRecruiters, Join, TeamTailor, Recruitee,
# and Comeet. No server-side search; filtering is done client-side.
# ARBEITNOW_MAX_JOBS_PER_TERM=100
# =============================================================================
# Himalayas (remote jobs feed) - optional, no auth
# =============================================================================
# Public JSON API at https://himalayas.app/jobs/api?limit=N&offset=M.
# No server-side search; filtering is done client-side by title + categories.
# HIMALAYAS_MAX_JOBS_PER_TERM=100
# =============================================================================
# We Work Remotely (RSS feed) - optional, no auth
# =============================================================================
# Public RSS at https://weworkremotely.com/remote-jobs.rss (all categories).
# Single fetch; filtering is done client-side by title + skills + category.
# WEWORKREMOTELY_MAX_JOBS_PER_TERM=100
# =============================================================================
# 4 Day Week (reduced-schedule jobs) - optional, no auth
# =============================================================================
# Public JSON API at https://4dayweek.io/api/jobs?page=N.
# Paginated; filtering is done client-side by title + tech stack.
# No job description in listings; links to 4dayweek.io for details.
# FOURDAYWEEK_MAX_JOBS_PER_TERM=100
# =============================================================================
# Public ATS sources (Lever / Ashby / Greenhouse) - optional
# =============================================================================
# Comma- or newline-separated company slugs. The slug is the path segment used
# in each provider's public job board, e.g. `lever.co/some-company` → "some-company".
# LEVER_COMPANIES=netflix,figma
# ASHBY_COMPANIES=ramp,linear
# GREENHOUSE_COMPANIES=stripe,airbnb
# Canadian QA-employer examples (full table): docs-site/docs/extractors/canadian-companies-qa-ats.md
# =============================================================================
# Workday (public career sites) - optional
# =============================================================================
# Newline- or comma-separated entries. Each entry is either:
# 1) A career-site URL we'll auto-parse, e.g.
# https://nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAExternalCareerSite
# 2) A JSON object with explicit fields:
# {"company":"NVIDIA","tenantUrl":"https://nvidia.wd5.myworkdayjobs.com","tenant":"nvidia","site":"NVIDIAExternalCareerSite","locale":"en-US"}
# WORKDAY_TENANTS=
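# Example (a sketch using the career-site URL form from option 1 above; the JSON form is
# not repeated here because its quoting depends on the dotenv loader in use):
# WORKDAY_TENANTS=https://nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAExternalCareerSite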
# =============================================================================
# SmartRecruiters (public Posting API) - optional
# =============================================================================
# Comma- or newline-separated company identifiers (API path segment), e.g.
# jobs.smartrecruiters.com/smartrecruiters/... → "smartrecruiters".
# SMARTRECRUITERS_COMPANIES=smartrecruiters
# SMARTRECRUITERS_MAX_JOBS_PER_COMPANY=100
# =============================================================================
# Eluta (Canada, RSS by location) - optional
# =============================================================================
# Pipe (|)- or newline-separated location strings for https://www.eluta.ca/rss?location=...
# (each location may itself contain a comma, as in the example below).
# Example: ELUTA_RSS_LOCATIONS=Toronto, ON|Vancouver, BC
# ELUTA_MAX_JOBS_PER_TERM=100
# =============================================================================
# BC T-Net (British Columbia tech jobs RSS) - optional
# =============================================================================
# Default feed is built into the extractor when this is unset:
# https://www.bctechnology.com/rss/jobs/tnetjobs.xml
# Override with JSON array or newline-separated URLs (custom feeds from T-Net builder).
# BCTENET_RSS_URLS=
# Prefer Settings: bctenetRssUrls (JSON array), bctenetMaxJobsPerTerm (default 400).
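# Example (a sketch of the JSON-array form, reusing the built-in default feed URL):
# BCTENET_RSS_URLS=["https://www.bctechnology.com/rss/jobs/tnetjobs.xml"]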
# =============================================================================
# iCIMS tenant portals (anonymous HTML search) - optional
# =============================================================================
# Comma- or newline-separated hosts, e.g. careers-example.icims.com
# ICIMS_TENANTS=
# Caps via Settings: icimsMaxJobsPerTenant (default 250), icimsMaxPagesPerSearch (default 10).
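# Example (a sketch of the comma-separated form; `careers-other.icims.com` is a hypothetical second host):
# ICIMS_TENANTS=careers-example.icims.com,careers-other.icims.com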
# =============================================================================
# QAJobsBoard (QA JobBoardly JSON) - optional
# =============================================================================
# Configure caps via Settings: qajobsboardMaxJobsPerTerm (default 100).
# =============================================================================
# Arc.dev remote listings - optional
# =============================================================================
# Comma-separated paths under https://arc.dev used when seeding defaults (e.g. Playwright + Cypress feeds).
# ARC_REMOTE_JOBS_PATHS=/remote-jobs/playwright,/remote-jobs/cypress
# Prefer Settings for overrides: arcRemoteJobsPaths (JSON array), arcMaxJobsPerPath (default 120).