20 Commits

Author SHA1 Message Date
0a63316100 fix(discovery): block countries in vague locations via job description
Some checks failed
CI / Linting (Biome) (push) Failing after 41s
CI / Tests (push) Successful in 5m22s
CI / Type Check (adzuna-extractor) (push) Successful in 1m9s
CI / Type Check (gradcracker-extractor) (push) Successful in 1m14s
CI / Type Check (hiringcafe-extractor) (push) Successful in 1m11s
CI / Type Check (orchestrator) (push) Successful in 1m28s
CI / Type Check (startupjobs-extractor) (push) Successful in 1m13s
CI / Type Check (ukvisajobs-extractor) (push) Successful in 1m12s
CI / Documentation (push) Successful in 2m0s
QAJobsBoard and similar feeds often store Worldwide/Remote while the real
country is only in the description. Scan title and description when location
is vague, and prefer concrete locations from QAJobsBoard postings.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-16 17:15:18 -04:00
f5179304c1 feat(discovery): blocked countries filter and smoke subprocess fixes
Some checks failed
CI / Linting (Biome) (push) Failing after 41s
CI / Tests (push) Successful in 5m27s
CI / Type Check (adzuna-extractor) (push) Successful in 1m9s
CI / Type Check (gradcracker-extractor) (push) Successful in 1m13s
CI / Type Check (hiringcafe-extractor) (push) Successful in 1m9s
CI / Type Check (orchestrator) (push) Successful in 1m24s
CI / Type Check (startupjobs-extractor) (push) Successful in 1m8s
CI / Type Check (ukvisajobs-extractor) (push) Successful in 1m9s
CI / Documentation (push) Successful in 1m59s
Add blockedCountries in Settings so pipeline discovery drops jobs whose
location mentions listed countries (existing discovered rows are kept).
Document the feature, fix smoke tsconfig inheritance for nested extractors,
and run smoke via an absolute-tsconfig wrapper.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-16 11:41:29 -04:00
c840f289e1 feat(extractors): expand catalog, smoke coverage, and sourcing docs
Some checks failed
CI / Linting (Biome) (push) Failing after 40s
CI / Tests (push) Successful in 5m54s
CI / Type Check (adzuna-extractor) (push) Successful in 1m8s
CI / Type Check (gradcracker-extractor) (push) Successful in 1m11s
CI / Type Check (hiringcafe-extractor) (push) Successful in 1m8s
CI / Type Check (orchestrator) (push) Successful in 1m23s
CI / Type Check (startupjobs-extractor) (push) Successful in 1m6s
CI / Type Check (ukvisajobs-extractor) (push) Successful in 1m7s
CI / Documentation (push) Successful in 1m54s
Adds Arc.dev, BC T-Net, Eluta, iCIMS tenants, QAJobsBoard, and SmartRecruiters
manifests with registry/settings/UI wiring; registers full extractor list in
smoke-extractors and documents supplementary board access paths. Aligns Careerjet
v4 with the url query parameter and fixes strict typing in QAJobsBoard.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-15 22:36:23 -04:00
7b3dfb002a feat(extractors): add 17 job source extractors and cross-source dedup
Some checks failed
CI / Linting (Biome) (push) Failing after 36s
CI / Tests (push) Successful in 5m54s
CI / Type Check (adzuna-extractor) (push) Successful in 1m6s
CI / Type Check (gradcracker-extractor) (push) Successful in 1m9s
CI / Type Check (hiringcafe-extractor) (push) Successful in 1m5s
CI / Type Check (orchestrator) (push) Successful in 1m21s
CI / Type Check (startupjobs-extractor) (push) Successful in 1m4s
CI / Type Check (ukvisajobs-extractor) (push) Successful in 1m4s
CI / Documentation (push) Successful in 1m52s
Adds extractor packages: arbeitnow, ashby, careerjet, fourdayweek,
greenhouse, himalayas, jobicy, jooble, lever, reed, remoteok, remotive,
themuse, usajobs, weworkremotely, workday — each with manifest, package
metadata and README.

Pipeline / shared:
- shared/job-fingerprint: stable hash for cross-source dedup, with tests
- discover-jobs: dedup via fingerprint and richer per-source merging
- jobs repository: fingerprint-aware upsert / lookup
- settings-registry, settings types/routes, demo-defaults: knobs for the
  new sources
- shared extractors index: register the new manifests
- location-support, profiles route: small fixes for the new sources

Tooling:
- scripts/smoke-extractors.ts to sanity-check each source locally
- scripts/jobber-cron-{cherepaha,dobkin}.env.example: per-host cron
  templates (CHANGEME placeholders only)
- .env.example: documented env vars for the new extractors
- .gitignore: ignore extractors/*/storage/ runtime caches (was ukvisajobs only)

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 20:17:52 -04:00
b72612fd06 Add Basic Auth gate, job owner context, and multi-tenant job isolation.
- Server: auth routes, owner profile on requests/jobs/pipeline, migrations
- Client: BasicAuthAppGate, SSE/API session handling, profile quick switch
- Tests: tracer-links, ghostwriter request-context mock, pipeline coverage
- Env examples for cron and optional basic auth credentials

Made-with: Cursor
2026-04-20 21:34:42 -04:00
e39341258a feat(orchestrator): blocked company filters and keyword parsing
- Add shared helper to parse blocked company keywords from JSON or legacy text
- Wire discover-jobs to resolved keywords for employer blocking; extend filters UI
- Include vite/client in TS types for import.meta.env; fix JobListItem test fixture

Made-with: Cursor
2026-04-11 12:04:38 -04:00
fea00ae656 feat: search profiles, cover letters, discovery fixes
- Add search profiles (DB, API, settings UI) and wire into scorer/pipeline search terms.
- Add cover letter generation (service, job action, JobDetail UI).
- Align JobSpy Indeed country with country-level search geography when settings conflict; warn in logs.
- Infer country from search cities via inferCountryKeyFromSearchGeography (shared).
- Ignore extractor venv/storage and local data in Biome; ignore orchestrator/storage and JobSpy .venv in git.
- Vite: do not watch orchestrator/storage (prevents reloads during startup.jobs pipeline).
- JobSpy: document Python 3.10+ and venv setup in README/requirements.
- Onboarding and settings: local resume path handling, orchestrator .env.example for Vite.

Made-with: Cursor
2026-04-05 19:35:14 -04:00
Shaheer Sarfaraz
3da5ea35b4
Deduplicate shared helpers and enforce aliased imports (#228)
* Deduplicate string cleanup helpers and not-found responses

* Enforce aliased imports for infra and shared modules

* Enforce @client/@server aliases for deep relative imports

* Deduplicate visa sponsor and location filter definitions

* Use shared city filter export in extractor location checks
2026-02-22 16:13:52 +00:00
Shaheer Sarfaraz
82e142a8a8
Auto-Registering Extractor System (#223)
* initial commit?

* Address PR feedback on extractor discovery and startup resilience

* Address latest PR review comments

* fix city resolution fallback when input parses empty

* address PR feedback on extractor registry and pipeline validation

* address copilot comments on manifests and registry startup

* fix extractor discovery export handling and env isolation in tests

* enforce duplicate manifest id failures in strict mode

* Fix remaining extractor registry and runtime review comments

* docs

* docs

* test all, logic remains in extractors

* Address PR review feedback on extractor registry and validation

* Revert extractor moduleResolution to bundler

* Enforce shared city filtering across all discovery sources

* Deduplicate extractor strict city post-filtering
2026-02-21 17:44:07 +00:00
Shaheer Sarfaraz
cc7cacd7f5
Feat/company blacklist tokenized input (#219)
* initial commit

* docs mention!

* Update orchestrator/src/server/pipeline/steps/discover-jobs.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* normalizeStringArray

* poppier orange

* comments

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-02-21 04:07:06 +00:00
Shaheer Sarfaraz
19266fe5eb
City search (#217)
* wave 1, jobspy only

* combine usa/ca to united states

* strict city location filter

* hide and show based on focus

* UI changes

* allow clicking cross!

* pill animate in

* animate out, uggo fix

* animate out

* framer motion

* animate component height

* adzuna

* hiring cafe implementation

* refactor: centralize shared search-city parsing and matching

* feat: migrate city setting to searchCities with legacy fallback

* docs: update pipeline and extractor city-search wording

* fix(orchestrator): normalize tokenized paste behavior

* fix(shared): tighten city matching semantics

* docs(extractors): document city-location knobs and geocoding note
2026-02-21 00:42:09 +00:00
Shaheer Sarfaraz
f3c164d252
feat(pipeline): parallelize discovery/process via evolved asyncPool (#211)
* feat(pipeline): centralize concurrency hooks and parallelize discovery/process steps

* feat(orchestrator): unify single and bulk job actions API

* job actions de-bulk-ified

* application inbox section debulk

* chore(orchestrator): remove remaining bulk wording from job action flow

* select multiple to skip with shortcut

* comments

* coomeents

* fix progress ordinal and add jobs actions payload examples
2026-02-20 16:49:13 +00:00
Shaheer Sarfaraz
d34a9f041b
Hiring cafe extractor (#192)
* feat(hiringcafe): register new source across shared/server/client enums

* feat(hiringcafe-extractor): add browser-backed Hiring Cafe dataset extractor

* feat(orchestrator): integrate Hiring Cafe discovery service into pipeline

* feat(orchestrator-ui): add Hiring Cafe to source availability and run estimates

* chore(hiringcafe): wire CI/docker and add extractor documentation

* chore(format): apply biome formatting for Hiring Cafe integration

* add original websites

* coomints

* number or null
2026-02-19 12:51:55 +00:00
Shaheer Sarfaraz
c5c6675f04
feat: add Adzuna extractor with orchestrator integration (#177)
* feat(settings): add adzuna source fields and country compatibility

* feat(discovery): integrate adzuna extractor into pipeline

* feat(client): wire adzuna in source selection and run budgeting

* docs(extractors): add adzuna guide and configuration notes

* chore(workspaces): register adzuna extractor in lockfile

* fix(adzuna): run extractor via npm script instead of npx

* fix(adzuna): execute extractor via node+tsx without shell

* fix(adzuna): prefer npm run start without shell, fallback to tsx

* fix(docker): include adzuna extractor workspace in image

* chore(adzuna): reuse shared type-conversion utilities

* type-check adzuna

* formatting

* deeedooop

* better instructions
2026-02-17 16:49:42 +00:00
Shaheer Sarfaraz
fe0aebe01a
Small bits and bobs, codebase quality (#129)
* initial change

* nav highlighting

* icon change

* deeeedoooop

* text

* show version number on all pages

* icon

* remove unused code

* add knip

* formatting

* remove unused code

* types fix

* remove notion completely from the codebase.

* update test for new url structure

* clean up the fucking shop boys

* make a "create job" factory and use that

* moar factories

* formatting
2026-02-10 20:01:58 +00:00
Shaheer Sarfaraz
4e1ea28301
Enable Glassdoor as a JobSpy source (#126)
* feat(shared): add glassdoor to job source model

* feat(jobspy): support glassdoor site in scraper and discovery

* feat(pipeline): include glassdoor in source selection and API schema

* feat(ui): add glassdoor toggle to jobspy settings and run estimates

* test/docs: cover glassdoor jobspy integration end-to-end

* fix(jobspy): make glassdoor always-on without settings toggle

* fix(jobspy): fallback glassdoor when location is country-level

* refactor(jobspy): drop direct pandas usage in wrapper

* feat(pipeline): gate glassdoor by supported countries

* fix(jobspy): restore pandas output and keep glassdoor disable copy

* fix(jobspy): map country-level glassdoor searches to city fallbacks

* feat(ui): require glassdoor city for country-level runs
2026-02-10 17:57:49 +00:00
Shaheer Sarfaraz
bd6834f99e
Hotfix location in pipeline search (#108)
* feat(shared): centralize supported country list and source-country rules

* feat(orchestrator): add country selector and UK-only source gating in automatic run modal

* feat(orchestrator): persist country selection and run only compatible extractors

* fix(pipeline): enforce country-source compatibility during discovery

* test(orchestrator): cover country-based source gating and pipeline enforcement

* formatting

* test fix

* lint

* comments

* prevent auto focus grab

* verification

* command and popover

* make sure scroll is working
2026-02-08 13:02:52 +00:00
Shaheer Sarfaraz
a409aa5ee0
Live scraping updates in pipeline UI (#100)
* initial commit

* fix clear script

* cancelling pipelines

* formatting
2026-02-07 22:44:00 +00:00
Shaheer Sarfaraz
16a8f1d15a
Use logger! add shim to convert backend responses to same format (#84)
* chore(orchestrator): add @infra import alias

* feat(server): add error/http/context/logger/sanitize infrastructure

* refactor(core): propagate request context, structured logs, and sanitization

* test/docs: update API contract assertions and contributor standards

* all pages working

* normalizing
2026-02-04 23:07:24 +00:00
Shaheer Sarfaraz
82b261c7bc
Refactor LLM service into modular adapters and policies (#83)
* llm migration

* orchestrator runer

* Decompose runPipeline steps

* dedupe

* refactor(settings): unify settings conversion metadata and round-trip tests

* refactor(llm): extract shared provider strategy factory

* refactor(settings-ui): add reusable numeric setting section

* test(orchestrator): stabilize usePipelineSources localStorage setup

* comments
2026-02-04 21:48:28 +00:00