Jobber/overview.md at 4da264eb48f5cec2056723bb4347a9b184cd410c

Shaheer Sarfaraz 390d03625e

Add documentation for undocumented features (#172)

2026-02-16 00:33:35 +00:00

---
id: overview
title: Extractors Overview
description: Technical index of supported extractors and how they work.
sidebar_position: 1
---

This page helps you choose the right extractor for your run, understand key constraints, and navigate to detailed technical guides.

## Extractor chooser

| Extractor | Best use case | Core constraints/dependencies | Notable controls | Output/behavior notes |
| --- | --- | --- | --- | --- |
| Gradcracker | UK graduate roles from Gradcracker | Crawling stability depends on page structure and anti-bot behavior; tuned for low concurrency | `GRADCRACKER_SEARCH_TERMS`, `GRADCRACKER_MAX_JOBS_PER_TERM`, `JOBOPS_SKIP_APPLY_FOR_EXISTING` | Scrapes listing metadata, then detail pages and apply URL resolution |
| JobSpy | Multi-source discovery (Indeed, LinkedIn, Glassdoor) | Requires Python wrapper execution per term; source availability and quality vary by site/location | `JOBSPY_SITES`, `JOBSPY_SEARCH_TERMS`, `JOBSPY_RESULTS_WANTED`, `JOBSPY_HOURS_OLD`, `JOBSPY_LINKEDIN_FETCH_DESCRIPTION` | Produces JSON per term, then orchestrator normalizes and de-duplicates by `jobUrl` |
| UKVisaJobs | UK visa sponsorship-focused roles | Requires authenticated session and periodic token/cookie refresh | `UKVISAJOBS_EMAIL`, `UKVISAJOBS_PASSWORD`, `UKVISAJOBS_MAX_JOBS`, `UKVISAJOBS_SEARCH_KEYWORD` | API pagination + dataset output; orchestrator de-dupes and may fetch missing descriptions |
| Manual Import | One-off jobs not covered by scrapers | Inference quality depends on model/provider and input quality; some URLs cannot be fetched reliably | App/API endpoints (`/api/manual-jobs/infer`, `/api/manual-jobs/import`) | Accepts text/HTML/URL, runs inference, then saves and scores job after review |
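The table notes that the orchestrator de-duplicates JobSpy results by `jobUrl`. The sketch below illustrates that idea only; it is not the orchestrator's actual code. The function name `dedupe_by_job_url`, the `title` field, the sample URLs, and the choice to trim whitespace before comparing are all assumptions made for illustration.

```python
from typing import Iterable


def dedupe_by_job_url(jobs: Iterable[dict]) -> list[dict]:
    """Keep the first job seen for each jobUrl, preserving input order.

    Hypothetical helper: the real normalization rules are not described
    on this page, so URLs are only stripped of surrounding whitespace.
    """
    seen: set[str] = set()
    unique: list[dict] = []
    for job in jobs:
        url = job.get("jobUrl")
        if not url:
            # Without a URL there is no key to compare on, so keep the job.
            unique.append(job)
            continue
        key = url.strip()
        if key in seen:
            # A job with the same URL was already kept; skip this one.
            continue
        seen.add(key)
        unique.append(job)
    return unique


# Illustrative data only; these records are not real extractor output.
sample_jobs = [
    {"jobUrl": "https://example.com/jobs/1", "title": "Data Engineer"},
    {"jobUrl": "https://example.com/jobs/2", "title": "Backend Developer"},
    {"jobUrl": " https://example.com/jobs/1 ", "title": "Data Engineer (repost)"},
]
unique_jobs = dedupe_by_job_url(sample_jobs)
```

Keeping the first occurrence means that results from an earlier search term win when two terms return the same posting.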

## Which extractor should I use?

- Use JobSpy for broad first-pass sourcing across common boards.
- Use Gradcracker when targeting graduate pipelines in the UK.
- Use UKVisaJobs for sponsorship-specific UK searches.
- Use Manual Import when you already have a specific posting and need direct import.

Many runs combine sources: broad discovery first, then manual import for high-priority jobs that scraping misses.
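The guidance above can be read as a small decision rule. The following sketch encodes it as a lookup, purely to make the combinations explicit. The function `choose_extractors` and its flag names are hypothetical and do not correspond to any setting or API in the application.

```python
def choose_extractors(
    *,
    broad_discovery: bool = False,
    uk_graduate: bool = False,
    uk_sponsorship: bool = False,
    specific_posting: bool = False,
) -> list[str]:
    """Return extractor names matching the stated goals, discovery first.

    Hypothetical helper that mirrors the recommendations on this page.
    """
    chosen: list[str] = []
    if broad_discovery:
        chosen.append("JobSpy")
    if uk_graduate:
        chosen.append("Gradcracker")
    if uk_sponsorship:
        chosen.append("UKVisaJobs")
    if specific_posting:
        # Manual import comes last: it covers jobs that scraping misses.
        chosen.append("Manual Import")
    return chosen


# A combined run: broad discovery, then manual import for a known posting.
combined_run = choose_extractors(broad_discovery=True, specific_posting=True)
```

Listing Manual Import last reflects the combined pattern described above, where scraping runs first and individual postings are added afterwards.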
