---
id: pipeline-run
title: Pipeline Run
description: How to use Run Mode (Automatic vs Manual), presets, source controls, and advanced run settings.
sidebar_position: 2
---

## What it is

Pipeline Run is the Jobs-page run modal for starting either:

- an **Automatic** pipeline run
- a **Manual** one-job import

For the end-to-end sequence, read [Find Jobs and Apply Workflow](/docs/next/workflows/find-jobs-and-apply-workflow).
For manual import internals, read [Manual Import Extractor](/docs/next/extractors/manual).

## Why it exists

The modal provides one place to control run volume, source compatibility, and processing aggressiveness before a run consumes compute and time.

It helps you:

- choose speed vs depth with presets
- avoid invalid source/country combinations
- understand the estimated run cost before starting

## How to use it

1. Open the Jobs page and use the top-right run control.
2. Choose the **Automatic** or **Manual** tab.
3. Configure the required inputs and start the run.

### Automatic tab

#### Presets

Three presets set defaults for run aggressiveness:

- **Fast**: lower processing volume, higher score threshold
- **Balanced**: middle-ground defaults
- **Detailed**: higher processing volume, lower score threshold

If values are edited manually, the UI shows **Custom**.

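The preset behavior above can be pictured as a lookup table plus an equality check. The following is a minimal sketch, not the app's actual code: the `RunSettings` shape, the `detectPreset` helper, and every numeric value are assumptions chosen only to show the direction of each preset (Fast is the smallest and strictest, Detailed the largest and most permissive).

```typescript
// Hypothetical names and values; the real defaults live in the app.
type PresetName = "Fast" | "Balanced" | "Detailed";

interface RunSettings {
  topN: number; // "Resumes tailored"
  minSuitabilityScore: number; // "Min suitability score"
  maxJobsDiscovered: number; // "Max jobs discovered" (run budget cap)
}

const PRESETS: Record<PresetName, RunSettings> = {
  Fast: { topN: 5, minSuitabilityScore: 75, maxJobsDiscovered: 100 },
  Balanced: { topN: 10, minSuitabilityScore: 60, maxJobsDiscovered: 250 },
  Detailed: { topN: 20, minSuitabilityScore: 45, maxJobsDiscovered: 500 },
};

// Returns the preset whose defaults match the current values exactly,
// or "Custom" once any value has been edited away from a preset.
function detectPreset(current: RunSettings): PresetName | "Custom" {
  const names = Object.keys(PRESETS) as PresetName[];
  for (const name of names) {
    const preset = PRESETS[name];
    if (
      preset.topN === current.topN &&
      preset.minSuitabilityScore === current.minSuitabilityScore &&
      preset.maxJobsDiscovered === current.maxJobsDiscovered
    ) {
      return name;
    }
  }
  return "Custom";
}
```

With these assumed values, `detectPreset(PRESETS.Balanced)` yields `"Balanced"`, and changing any single field yields `"Custom"`.
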
#### Country and source compatibility

- Country selection affects which sources are available.
- UK-only sources are disabled for non-UK countries.
- Adzuna is available only for its supported countries and when App ID/App Key are configured in Settings.
- Glassdoor can be enabled only when:
  - selected country supports Glassdoor
  - at least one **Search city** is set in Advanced settings

Incompatible sources are disabled with explanatory tooltips.

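One way to read the rules above is as a per-source check that returns either nothing (the source is selectable) or a tooltip message. This is a sketch under stated assumptions: the `SourceRule` and `RunContext` shapes, the `disabledReason` function, the country strings, and the tooltip wording are all hypothetical and are not taken from the app.

```typescript
// Hypothetical rule shape; which sources carry which flags is an assumption.
interface SourceRule {
  id: string;
  ukOnly?: boolean;
  supportedCountries?: string[]; // undefined means no country restriction
  needsCredentials?: boolean; // e.g. Adzuna App ID/App Key
  needsSearchCity?: boolean; // e.g. Glassdoor
}

interface RunContext {
  country: string;
  hasCredentials: boolean;
  searchCities: string[];
}

// Returns a tooltip explaining why the source is disabled, or null if it is usable.
function disabledReason(rule: SourceRule, ctx: RunContext): string | null {
  if (rule.ukOnly && ctx.country !== "united kingdom") {
    return "Available for UK searches only";
  }
  if (rule.supportedCountries && !rule.supportedCountries.includes(ctx.country)) {
    return "Not available for the selected country";
  }
  if (rule.needsCredentials && !ctx.hasCredentials) {
    return "Configure App ID and App Key in Settings";
  }
  if (rule.needsSearchCity && ctx.searchCities.length === 0) {
    return "Add at least one Search city in Advanced settings";
  }
  return null;
}
```

Returning the reason rather than a bare boolean keeps the disabled state and its tooltip text in one place.
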
#### Advanced settings

- **Resumes tailored** (`topN`)
- **Min suitability score**
- **Max jobs discovered** (run budget cap)
- **Search cities** (optional multi-city input; required for Glassdoor)
- **Workplace type** (`Remote`, `Hybrid`, `Onsite`)

Workplace type applies globally to the run across all search terms and locations.

Source behavior differs:

- Hiring Cafe and startup.jobs support all three workplace types directly.
- Indeed, LinkedIn, and Glassdoor are backed by JobSpy and only support strict remote filtering.
  - If workplace type is set to `Remote` only, JobSpy runs with a remote-only filter.
  - If `Hybrid` or `Onsite` is included, JobSpy sources remain enabled but may return broader results.

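The JobSpy rule above reduces to a single boolean: the remote-only filter applies only when `Remote` is the sole selection. A minimal sketch, assuming a hypothetical `jobSpyRemoteOnly` helper (the name and signature are illustrative, not the app's API):

```typescript
type WorkplaceType = "Remote" | "Hybrid" | "Onsite";

// True only when the selection is non-empty and contains nothing but "Remote".
// Any Hybrid or Onsite entry turns the strict filter off, so JobSpy-backed
// sources stay enabled but may return broader results.
function jobSpyRemoteOnly(selected: WorkplaceType[]): boolean {
  return selected.length > 0 && selected.every((type) => type === "Remote");
}
```

For example, `jobSpyRemoteOnly(["Remote"])` is `true`, while `jobSpyRemoteOnly(["Remote", "Hybrid"])` is `false`.
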
#### Search terms

- Add terms with Enter or commas.
- Multiple terms increase discovery breadth and runtime.
- At least one search term is required.

#### Estimate and run gating

The footer estimate shows the expected number of discovered jobs and the resume-processing range.

`Start run now` is disabled when:

- a run is already in progress
- a required save or run operation is still in progress
- no compatible sources are selected
- no search terms are present

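The gating conditions above can be sketched as a function that collects every blocking reason; the button is enabled only when the list is empty. The `RunGateState` shape and `startBlockers` name are assumptions made for illustration.

```typescript
// Hypothetical view of the modal state relevant to the Start button.
interface RunGateState {
  runInProgress: boolean;
  savePending: boolean; // required save/run work still in progress
  compatibleSelectedSources: string[];
  searchTerms: string[];
}

// Lists each reason the Start button would be disabled.
function startBlockers(state: RunGateState): string[] {
  const blockers: string[] = [];
  if (state.runInProgress) blockers.push("A run is already in progress");
  if (state.savePending) blockers.push("Save or run work is still in progress");
  if (state.compatibleSelectedSources.length === 0) {
    blockers.push("No compatible sources are selected");
  }
  // Whitespace-only entries are treated as missing terms in this sketch.
  if (state.searchTerms.filter((term) => term.trim() !== "").length === 0) {
    blockers.push("No search terms are present");
  }
  return blockers;
}

function canStartRun(state: RunGateState): boolean {
  return startBlockers(state).length === 0;
}
```

Collecting all reasons, rather than stopping at the first, makes it possible to show the user everything that needs fixing at once.
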
### Manual tab

Manual mode opens the direct import flow in the same modal.

Use it when you already have a specific job description or link and do not want full discovery.

For accepted input formats, inference behavior, and limits, see [Manual Import Extractor](/docs/next/extractors/manual).

## Discovery deduplication

When new listings are imported, JobOps does not create a second database row if the job is already in your workspace (any status). Matching uses:

- a **canonical job URL** (normalizes `http`/`https`, `www`, trailing slashes, common tracking query params, and sorts remaining query keys)
- the pair **`source` + `source_job_id`** when the extractor provides an external id
- a **content fingerprint** (normalized **employer + title**) so the same role from another board is not imported twice
- **skip/apply memory**: imports that match a job you already skipped or applied to are not added

See [Duplicate job detection](./duplicate-jobs) for skip cascades and description matching.

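To make the matching keys concrete, here is a minimal sketch of URL canonicalization and an employer + title fingerprint. It is not JobOps's implementation: the tracking-parameter list, the normalization rules, the `Listing` shape, and all function names are assumptions. It uses only the standard `URL` and `URLSearchParams` APIs.

```typescript
// Assumed list of tracking params; the real set may differ.
const TRACKING_PARAMS = new Set(["utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content", "gclid", "fbclid"]);

// Normalizes scheme, "www.", trailing slashes, tracking params, and query-key order.
function canonicalJobUrl(raw: string): string {
  const url = new URL(raw);
  const host = url.host.toLowerCase().replace(/^www\./, "");
  const path = url.pathname.replace(/\/+$/, "");
  const kept: Array<[string, string]> = [];
  url.searchParams.forEach((value, key) => {
    if (!TRACKING_PARAMS.has(key.toLowerCase())) kept.push([key, value]);
  });
  kept.sort((a, b) => (a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0));
  const query = new URLSearchParams();
  for (const [key, value] of kept) query.append(key, value);
  const queryString = query.toString();
  return `https://${host}${path}${queryString ? `?${queryString}` : ""}`;
}

// Lowercases and collapses punctuation/whitespace so minor formatting differences match.
function normalizeText(text: string): string {
  return text.toLowerCase().replace(/[^a-z0-9]+/g, " ").trim();
}

function contentFingerprint(employer: string, title: string): string {
  return `${normalizeText(employer)}|${normalizeText(title)}`;
}

interface Listing {
  url: string;
  source: string;
  sourceJobId?: string;
  employer: string;
  title: string;
}

// All keys a listing can match on; the source id key exists only when provided.
function dedupKeys(listing: Listing): string[] {
  const keys = [
    `url:${canonicalJobUrl(listing.url)}`,
    `fp:${contentFingerprint(listing.employer, listing.title)}`,
  ];
  if (listing.sourceJobId) keys.push(`id:${listing.source}:${listing.sourceJobId}`);
  return keys;
}

function isDuplicate(listing: Listing, knownKeys: Set<string>): boolean {
  return dedupKeys(listing).some((key) => knownKeys.has(key));
}
```

In this sketch, a listing is treated as a duplicate if any one key matches, which mirrors the "same role from another board" case: the URL and source id differ, but the fingerprint does not.
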
To drop listings before import, use **Settings > Scoring Settings** and pipeline geography:

- [Company skip list](./company-skip-list): blocked **employer** keywords
- [Blocked countries](./blocked-countries): block specific countries; when the search geography is a country (for example, Canada), the run enforces that country only

## Common problems

### Start button stays disabled

- Ensure at least one search term is present.
- Ensure at least one compatible source is selected.
- Wait for active save/run operations to finish.

### Glassdoor cannot be enabled

- Verify that the selected country supports Glassdoor.
- Set at least one Search city in Advanced settings.

### Adzuna is not selectable

- Set `Adzuna App ID` and `Adzuna App Key` in **Settings > Environment & Accounts**.
- Verify that the selected country is one of Adzuna's supported markets.

### Run takes longer than expected

- Reduce the number of search terms.
- Use the `Fast` preset or lower `Max jobs discovered`.
- Disable high-cost source combinations where acceptable.

### JobSpy results are broader than the selected workplace type

- Indeed, LinkedIn, and Glassdoor only support strict remote filtering in this flow.
- Select `Remote` only when you need JobSpy sources filtered tightly.
- Hybrid or onsite selections are honored by Hiring Cafe and startup.jobs, but JobSpy-backed sources may still include broader results.

## Related pages

- [Company skip list](./company-skip-list)
- [Blocked countries](./blocked-countries)
- [Find Jobs and Apply Workflow](/docs/next/workflows/find-jobs-and-apply-workflow)
- [Manual Import Extractor](/docs/next/extractors/manual)
- [Orchestrator](/docs/next/features/orchestrator)
- [Overview](/docs/next/features/overview)