fix(docker): copy full extractors tree into image for runtime manifests

The Dockerfile copied only a fixed list of extractor directories. New
sources were listed in shared, but their manifest.ts files were absent
from the container, so discovery logged missingManifest in production.

Copy extractors/ once before npm install in both the builder and
production stages, and remove the now-redundant per-extractor COPY
lines. Add extractors/*/storage/ to .dockerignore so local caches are
not sent in the build context.

Co-authored-by: Cursor <cursoragent@cursor.com>
ilia 2026-05-12 20:36:27 -04:00
parent 7b3dfb002a
commit 67508d56ea
2 changed files with 8 additions and 22 deletions

@@ -10,6 +10,9 @@
 # Data (mounted as volume)
 data/
+# Extractor runtime caches (optional local dirs; never needed in image)
+extractors/*/storage/
 # Environment files (passed via docker-compose)
 .env
 **/.env
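
To illustrate what the new `extractors/*/storage/` entry is meant to exclude, here is a minimal TypeScript sketch. It is only an approximation of Docker's ignore matching for this single pattern, on the assumption that `*` matches exactly one path segment and that everything under a matched directory is excluded. The names `STORAGE_PATTERN` and `isExcludedStoragePath` are hypothetical and do not come from the repository.

```typescript
// Simplified stand-in for Docker's ignore matching, covering only the one
// pattern added in this commit. Hypothetical helper for illustration.
// Assumption: `*` matches a single path segment (it does not cross "/").
const STORAGE_PATTERN = /^extractors\/[^/]+\/storage(\/|$)/;

// Returns true when a build-context path (forward slashes, relative to the
// context root) falls under extractors/<name>/storage.
function isExcludedStoragePath(contextPath: string): boolean {
  return STORAGE_PATTERN.test(contextPath);
}
```

Under these assumptions, `isExcludedStoragePath("extractors/adzuna/storage/cache.json")` returns `true`, while `isExcludedStoragePath("extractors/adzuna/manifest.ts")` returns `false`, so manifests still reach the image.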

@@ -35,11 +35,8 @@ COPY package*.json ./
 COPY docs-site/package*.json ./docs-site/
 COPY shared/package*.json ./shared/
 COPY orchestrator/package*.json ./orchestrator/
-COPY extractors/adzuna/package*.json ./extractors/adzuna/
-COPY extractors/hiringcafe/package*.json ./extractors/hiringcafe/
-COPY extractors/gradcracker/package*.json ./extractors/gradcracker/
-COPY extractors/startupjobs/package*.json ./extractors/startupjobs/
-COPY extractors/ukvisajobs/package*.json ./extractors/ukvisajobs/
+# All npm workspaces under extractors/* (manifests + package.json per extractor)
+COPY extractors ./extractors
 # Install Node dependencies with npm cache (dev deps needed for build)
 RUN --mount=type=cache,target=/root/.npm \
@@ -56,12 +53,7 @@ COPY shared ./shared
 COPY docs-site ./docs-site
 COPY orchestrator ./orchestrator
 COPY visa-sponsor-providers ./visa-sponsor-providers
-COPY extractors/adzuna ./extractors/adzuna
-COPY extractors/hiringcafe ./extractors/hiringcafe
-COPY extractors/gradcracker ./extractors/gradcracker
-COPY extractors/jobspy ./extractors/jobspy
-COPY extractors/startupjobs ./extractors/startupjobs
-COPY extractors/ukvisajobs ./extractors/ukvisajobs
+# extractors/ already copied before npm install (full tree for manifests at runtime)
 # Build documentation site bundle
 WORKDIR /app/docs-site
@@ -107,11 +99,7 @@ COPY package*.json ./
 COPY docs-site/package*.json ./docs-site/
 COPY shared/package*.json ./shared/
 COPY orchestrator/package*.json ./orchestrator/
-COPY extractors/adzuna/package*.json ./extractors/adzuna/
-COPY extractors/hiringcafe/package*.json ./extractors/hiringcafe/
-COPY extractors/gradcracker/package*.json ./extractors/gradcracker/
-COPY extractors/startupjobs/package*.json ./extractors/startupjobs/
-COPY extractors/ukvisajobs/package*.json ./extractors/ukvisajobs/
+COPY extractors ./extractors
 # Install production Node dependencies only
 RUN --mount=type=cache,target=/root/.npm \
@@ -124,12 +112,7 @@ COPY --from=builder /app/docs-site/build ./orchestrator/dist/docs
 COPY shared ./shared
 COPY orchestrator ./orchestrator
 COPY visa-sponsor-providers ./visa-sponsor-providers
-COPY extractors/adzuna ./extractors/adzuna
-COPY extractors/hiringcafe ./extractors/hiringcafe
-COPY extractors/gradcracker ./extractors/gradcracker
-COPY extractors/jobspy ./extractors/jobspy
-COPY extractors/startupjobs ./extractors/startupjobs
-COPY extractors/ukvisajobs ./extractors/ukvisajobs
+# extractors/ already copied before npm install
 # Reuse Camoufox binaries from builder instead of fetching again
 COPY --from=builder /root/.cache/camoufox /root/.cache/camoufox
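
The failure mode described in the commit message (a source listed in shared but with no manifest.ts in the container) could be caught with a small check like the one below. This is a hedged sketch, not the project's actual discovery code. The function name `findMissingManifests` and its parameters are hypothetical, and it assumes each extractor keeps its manifest at `extractors/<name>/manifest.ts`.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper, not the project's discovery implementation.
// Returns the names from `expectedSources` whose directory under
// `extractorsRoot` does not contain a manifest.ts file.
function findMissingManifests(
  extractorsRoot: string,
  expectedSources: readonly string[],
): string[] {
  return expectedSources.filter((name) => {
    // Assumed layout: <extractorsRoot>/<name>/manifest.ts
    const manifestPath = path.join(extractorsRoot, name, "manifest.ts");
    return !fs.existsSync(manifestPath);
  });
}
```

Run inside the built image against the copied `extractors` directory, an empty result would suggest that every listed source has its manifest available at runtime. A non-empty result names the sources that would otherwise surface as missingManifest.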