# Deploy on a VM, run the pipeline on a schedule, notify Telegram

End-to-end checklist:

1. Push this repo to your Git host (Gitea, GitHub, etc.).
2. On the server: install Docker + Compose v2, clone the repo, copy `.env.example` → `.env`, run `docker compose up -d --build`.
3. Confirm the UI/API on the mapped host port (default **3005** → container **3001**).
4. Add a cron job that `POST`s `/api/pipeline/run` (see §3).
5. Optional: Telegram via `scripts/jobber-pipeline-telegram.sh`, pipeline webhook relay, or n8n (see §4).

---

## Git remote (example)

Replace host, user/org, and repo name with yours:

```bash
git remote add gitea git@GITEA_HOST:YOUR_USER/Jobber.git
# or: git remote set-url gitea ...
git push -u gitea main
```

If you have uncommitted changes:

```bash
git add -A && git commit -m "Your message" && git push gitea main
```

---

## 1. Deploy on a Linux VM (bare metal or cloud)

1. Install **Docker** and **Docker Compose** (plugin v2).
2. Clone from your Git server (SSH or HTTPS):

   ```bash
   git clone git@GITEA_HOST:YOUR_USER/Jobber.git
   cd Jobber
   ```

3. Environment:

   ```bash
   cp .env.example .env
   # Edit .env: MODEL / LLM keys, RXRESUME_*, search settings, etc.
   ```

4. Start:

   ```bash
   docker compose up -d --build
   ```

   The image build sets `VITE_SKIP_RXRESUME_ONBOARDING=true` by default, so the first-run wizard asks only for the **LLM** settings. The Reactive Resume and PDF template steps are skipped; configure those in **Settings** if needed. Rebuild the image after changing that build arg.

5. Open the UI: `http://<VM-IP>:3005` (port mapped in `docker-compose.yml`).

6. Persist data: compose mounts `./data` — back up that directory.
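
A minimal sketch of such a backup, assuming `tar` and `date` are available on the host. `backup_data_dir` and the archive name `jobber-data-<timestamp>.tar.gz` are hypothetical, not part of the repo. For a consistent copy of the SQLite files, stop the stack (`docker compose stop`) before archiving.

```bash
#!/usr/bin/env bash
# Hypothetical helper: archive a data directory into a timestamped tarball
# and print the path of the archive that was created.
backup_data_dir() {
  local src_dir="$1" dest_dir="$2"
  local stamp archive
  stamp="$(date +%Y%m%d-%H%M%S)"
  archive="${dest_dir}/jobber-data-${stamp}.tar.gz"
  mkdir -p "$dest_dir"
  # -C keeps paths inside the archive relative to the parent of the data directory.
  tar -czf "$archive" -C "$(dirname "$src_dir")" "$(basename "$src_dir")"
  printf '%s\n' "$archive"
}

# Example (destination path is an assumption):
# backup_data_dir ./data /var/backups/jobber
```

The function only reads the source directory, so it can run from cron alongside the pipeline job if a stopped-stack backup is not practical; in that case prefer the SQLite `.backup` approach for the database file itself.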

---

## 2. Deploy as a container (same image, any host)

Same as the VM path: only Docker is required.

- Ensure port **3005** (or your chosen host port) is reachable if you use the UI from another machine.
- For **only** API/cron from localhost, bind to `127.0.0.1:3005` by changing the `ports:` line in `docker-compose.yml` (e.g. `"127.0.0.1:3005:3001"`).

Inside the container the app listens on **3001**; the default host map is **3005 → 3001**.

**Cron on the host** should call the API through the published host port:

- UI (browser): `http://127.0.0.1:3005`
- **API**: same origin, under `/api/...` (for example `http://127.0.0.1:3005/api/pipeline/run`).

If the API is only on a Docker network, use the container name and port `3001` from another container, or publish `3005` on the host and use `127.0.0.1:3005` from cron.

---

## 3. Run the pipeline on a schedule (cron)

`POST /api/pipeline/run` **starts** the pipeline in the **background** and returns quickly (`{ ok: true, data: { message: "Pipeline started" } }`). That is enough for scheduling.
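
To make a cron wrapper fail loudly when the run was not accepted, the response can be checked for `ok: true`. The sketch below assumes the body is JSON with a top-level `ok` field as shown above. `pipeline_accepted` is a hypothetical helper, and the `grep` pattern is a crude text match rather than real JSON parsing (where `jq` is installed, `jq -e '.ok == true'` is the more robust check).

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the response body contains "ok": true.
pipeline_accepted() {
  local body="$1"
  grep -Eq '"ok"[[:space:]]*:[[:space:]]*true' <<<"$body"
}

# Example wiring (commented out so the sketch has no side effects):
# body="$(curl -sS -X POST http://127.0.0.1:3005/api/pipeline/run \
#   -H 'Content-Type: application/json' -d '{}')"
# pipeline_accepted "$body" || { echo "pipeline not started: $body" >&2; exit 1; }
```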

Example crontab (host timezone — adjust hours):

```cron
# 08:00, 14:00, 20:00 daily
0 8,14,20 * * * /usr/local/bin/jobops-pipeline-run.sh >> /var/log/jobops-pipeline.log 2>&1
```

Create `/usr/local/bin/jobops-pipeline-run.sh`:

```bash
#!/usr/bin/env bash
set -euo pipefail
BASE_URL="${JOBOPS_URL:-http://127.0.0.1:3005}"
# Extra curl arguments; stays empty when Basic auth is not configured.
AUTH=()
# If BASIC_AUTH_USER / BASIC_AUTH_PASSWORD are set in .env, make them available here and uncomment:
# AUTH=(-u "${BASIC_AUTH_USER:?}:${BASIC_AUTH_PASSWORD:?}")

# Output goes to stdout/stderr; the crontab entry redirects both to the log file.
curl -sS -X POST "${BASE_URL}/api/pipeline/run" \
  -H "Content-Type: application/json" \
  -d '{}' \
  ${AUTH[@]+"${AUTH[@]}"}
echo
```

```bash
sudo chmod +x /usr/local/bin/jobops-pipeline-run.sh
```

Set `JOBOPS_URL` in root’s crontab or `/etc/environment` if the app is on another host.

**Basic auth:** When `BASIC_AUTH_USER` and `BASIC_AUTH_PASSWORD` are in `.env`, non-GET API calls need Basic auth — use `curl -u user:pass` as above.
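
Cron does not read the app's `.env`, so the script needs its own way to obtain the credentials. One option, sketched below, is to read the two values straight from the file. `read_env_var` is a hypothetical helper and assumes simple `KEY=value` lines (optionally double-quoted, no `export` prefix, no inline comments); the `/opt/Jobber/.env` path in the usage comment is also an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the value of KEY from a dotenv-style file.
# Uses the last matching line and fails if the key is absent.
read_env_var() {
  local file="$1" key="$2" line
  line="$(grep -E "^${key}=" "$file" | tail -n 1 || true)"
  [[ -n "$line" ]] || return 1
  line="${line#*=}"   # drop "KEY="
  line="${line%\"}"   # strip one trailing double quote, if any
  line="${line#\"}"   # strip one leading double quote, if any
  printf '%s\n' "$line"
}

# Example wiring inside the cron script:
# ENV_FILE=/opt/Jobber/.env
# AUTH=(-u "$(read_env_var "$ENV_FILE" BASIC_AUTH_USER):$(read_env_var "$ENV_FILE" BASIC_AUTH_PASSWORD)")
```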

---

## 4. Telegram notifications

The app does **not** send Telegram messages by itself. Practical options:

### Option A — Pipeline webhook (recommended)

1. In the app, under **Settings → Webhooks** (or via the env vars `PIPELINE_WEBHOOK_URL` / `WEBHOOK_SECRET`), set a URL that receives JSON when a run **completes or fails**.
2. Point that URL to a small relay that maps the JSON to Telegram `sendMessage`.

Telegram HTTP API:

```text
https://api.telegram.org/bot<BOT_TOKEN>/sendMessage
```

Body (JSON):

```json
{
  "chat_id": "<YOUR_CHAT_ID>",
  "text": "Pipeline finished: ..."
}
```

Host the relay on the same VM (Flask/FastAPI/Node, or n8n). Keep **bot token** and **chat id** in environment variables.

Payload shape (sanitized) includes fields like `event`, `pipelineRunId`, `jobsDiscovered`, `jobsProcessed`, `error` — see `orchestrator/src/server/pipeline/steps/notify-webhook.ts`.
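
As a sketch of what such a relay does with the payload, the helpers below turn the listed fields into a Telegram message text. `format_summary` and `summary_from_payload` are hypothetical names. The field names come from the list above, but their exact values (for example what `event` contains) are assumptions, so check `notify-webhook.ts` before relying on them. `summary_from_payload` needs `jq`.

```bash
#!/usr/bin/env bash
# Build a one- or two-line summary from already-extracted fields.
format_summary() {
  local event="$1" run_id="$2" discovered="$3" processed="$4" error="${5:-}"
  local text="Pipeline ${event} (run ${run_id}): ${discovered} discovered, ${processed} processed"
  if [[ -n "$error" ]]; then
    text+=$'\n'"Error: ${error}"
  fi
  printf '%s\n' "$text"
}

# Extract the fields from a webhook payload (JSON string) with jq.
# Missing fields fall back to placeholders instead of the literal "null".
summary_from_payload() {
  local payload="$1"
  format_summary \
    "$(jq -r '.event // "unknown"' <<<"$payload")" \
    "$(jq -r '.pipelineRunId // "?"' <<<"$payload")" \
    "$(jq -r '.jobsDiscovered // 0' <<<"$payload")" \
    "$(jq -r '.jobsProcessed // 0' <<<"$payload")" \
    "$(jq -r '.error // ""' <<<"$payload")"
}
```

The resulting string can be sent as the `text` field of the `sendMessage` body shown above.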

### Option B — Shipped script: run pipeline + Telegram summary (cron-friendly)

The repo includes `scripts/jobber-pipeline-telegram.sh`: it `POST`s `/api/pipeline/run`, polls `GET /api/pipeline/status` until the run ends, then sends one Telegram message with **status**, **jobsDiscovered**, and **jobsProcessed** (and **errorMessage** if the run failed).

**1. Dependencies on the host** (LXC/VM that runs cron):

```bash
apt-get update && apt-get install -y jq curl
```

**2. Install script and secrets** (after `git pull` in `/opt/Jobber` or your clone path):

```bash
install -m 755 /opt/Jobber/scripts/jobber-pipeline-telegram.sh /usr/local/bin/jobber-pipeline-telegram.sh
cp /opt/Jobber/scripts/jobber-cron.env.example /root/.jobber-cron.env
chmod 600 /root/.jobber-cron.env
nano /root/.jobber-cron.env
```

Fill in **`TELEGRAM_BOT_TOKEN`** (from @BotFather) and **`TELEGRAM_CHAT_ID`**. For a **private** chat with your bot, use `message.chat.id` from `getUpdates`; in a private chat it equals your Telegram user id. **`JOBOPS_URL`** defaults to `http://127.0.0.1:3005` when Jobber runs on the same host.

**3. Manual test** (before cron):

```bash
/usr/local/bin/jobber-pipeline-telegram.sh
```

You should get one Telegram message when the pipeline finishes. To keep a log, append `>> /var/log/jobber-pipeline.log 2>&1` to the cron line.

**4. Cron** (runs in the host's **local** timezone; check it with `timedatectl`, then edit with `crontab -e` as root):

```cron
# 09:00, 13:00, 18:00 daily — pipeline + Telegram summary
0 9,13,18 * * * /usr/local/bin/jobber-pipeline-telegram.sh >> /var/log/jobber-pipeline.log 2>&1
```

Other examples: `0 8,14,20 * * *` for 08:00 / 14:00 / 20:00.

**5. Pull latest code and redeploy** (on the VM, from the repo root, e.g. `/opt/Jobber`):

```bash
cd /opt/Jobber
git fetch origin && git pull --ff-only
install -m 755 scripts/jobber-pipeline-telegram.sh /usr/local/bin/jobber-pipeline-telegram.sh
docker compose up -d --build
```

Wait until `curl -sf http://127.0.0.1:3005/health` succeeds before relying on cron (container needs a few seconds after start).
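
That wait can be scripted. A minimal sketch: `wait_until` is a hypothetical helper that retries any command until it succeeds or a timeout passes. The 60-second timeout and 2-second interval in the usage comment are arbitrary choices.

```bash
#!/usr/bin/env bash
# Retry a command every INTERVAL seconds until it succeeds or TIMEOUT seconds pass.
# Usage: wait_until TIMEOUT INTERVAL COMMAND [ARGS...]
wait_until() {
  local timeout_s="$1" interval_s="$2"
  shift 2
  local deadline=$(( SECONDS + timeout_s ))
  until "$@" >/dev/null 2>&1; do
    if (( SECONDS >= deadline )); then
      return 1
    fi
    sleep "$interval_s"
  done
}

# Example (commented out so the sketch has no side effects):
# wait_until 60 2 curl -sf http://127.0.0.1:3005/health \
#   || echo "app not healthy after 60s" >&2
```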

**5b. Copy your local SQLite to the VM (profiles, settings, jobs)** — optional; use when you want the same **search profile**, `activeProfileId`, and job rows as on your laptop.

1. **Stop** every process that holds the DB open: the local `npm run dev` (Ctrl+C) and, on the VM, the container (`docker compose stop` or `docker stop job-ops`).
2. **Checkpoint WAL** on the machine that owns the canonical DB (usually your laptop), so a copy is self-contained:

   ```bash
   cd /path/to/Jobber
   sqlite3 data/jobs.db "PRAGMA wal_checkpoint(FULL);"
   ```

3. **Copy** a consistent DB file to the VM’s `./data/` (the path Docker mounts). Prefer a **SQLite backup** via `.backup`, which produces a consistent snapshot even if something is still writing to the DB:

   ```bash
   sqlite3 data/jobs.db ".backup 'data/jobs.deploy.db'"
   scp ./data/jobs.deploy.db YOUR_USER@VM_IP:/opt/Jobber/data/jobs.db
   rm -f data/jobs.deploy.db
   ```

   On the VM, with the container **stopped**, **delete stale sidecar files**; otherwise SQLite may report a corrupt database:

   ```bash
   rm -f /opt/Jobber/data/jobs.db-wal /opt/Jobber/data/jobs.db-shm
   ```

   Alternatively, `rsync` only `jobs.db` after running `PRAGMA wal_checkpoint(FULL)` on the laptop, with nothing else writing to the file.

4. On the VM: `docker compose up -d` and verify `GET /api/settings` / the Settings UI shows your profile.
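
To confirm the file arrived intact, checksums can be compared on both machines. A minimal sketch, assuming `sha256sum` (GNU coreutils) is installed; on macOS, `shasum -a 256` prints the same format. `sha256_of` and `checksum_matches` are hypothetical helpers, and the host and paths in the usage comment are placeholders. Compute the laptop-side hash before deleting `jobs.deploy.db`.

```bash
#!/usr/bin/env bash
# Print only the hex digest of a file.
sha256_of() {
  sha256sum "$1" | awk '{ print $1 }'
}

# Succeed when a local file matches an expected digest
# (for example one computed on the other machine).
checksum_matches() {
  local file="$1" expected="$2"
  [[ "$(sha256_of "$file")" == "$expected" ]]
}

# Example (commented out; needs SSH access to the VM):
# local_sum="$(sha256_of data/jobs.deploy.db)"
# ssh YOUR_USER@VM_IP "sha256sum /opt/Jobber/data/jobs.db" | grep -q "^${local_sum} "
```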

**Default pipeline sources:** with an empty JSON body to `POST /api/pipeline/run` (as the cron script sends), the defaults include **Glassdoor** via JobSpy, together with Indeed and LinkedIn. Glassdoor requests often fail and show up as errors in the logs; LinkedIn and Indeed can still produce rows. To force an explicit list from cron, set `JOBBER_PIPELINE_SOURCES` in `/root/.jobber-cron.env` (see `scripts/jobber-cron.env.example`).

**Security:** Never commit `/root/.jobber-cron.env` or paste bot tokens in Git. Revoke the token in BotFather if it was exposed.

### Option B2 — Minimal curl-only (no wait-for-finish)

If you only want to **trigger** the pipeline from cron without this script, use §3. For a one-off Telegram message without polling:

```bash
TELEGRAM_BOT_TOKEN="123456:ABC..."
CHAT_ID="your_numeric_chat_id"
MSG="Pipeline finished. Check dashboard."
# Build the JSON body with jq so the message text is escaped correctly.
curl -sS -X POST "https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/sendMessage" \
  -H "Content-Type: application/json" \
  -d "$(jq -n --arg chat_id "$CHAT_ID" --arg text "$MSG" '{chat_id: $chat_id, text: $text}')"
```

**chat_id:** Message your bot, then open `https://api.telegram.org/bot<TOKEN>/getUpdates` and read `message.chat.id` (if `result` is empty, send **Start** to the bot first, or call `deleteWebhook` if a webhook was set).
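
The same lookup can be done from the shell. A sketch assuming `jq` is installed; `latest_chat_id` is a hypothetical helper that reads a `getUpdates` response on stdin and only considers updates that carry a `message`.

```bash
#!/usr/bin/env bash
# Print chat.id of the most recent update that carries a message.
# Prints nothing and returns non-zero if there is no such update.
latest_chat_id() {
  jq -er '[.result[] | select(.message != null) | .message.chat.id] | last // empty'
}

# Example (commented out; needs a real token):
# curl -sS "https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/getUpdates" | latest_chat_id
```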

### Option C — External automation

Use n8n, Grafana OnCall, or similar: schedule → `POST /api/pipeline/run` → wait/poll → Telegram node.

---

## 5. Security notes

- Do not commit `.env` or Telegram tokens.
- Prefer Basic Auth if the instance is reachable from the internet.
- Restrict firewall so only your IP (or VPN) can reach the published port if exposed.

---

## 6. Git remotes (reference)

```bash
git remote -v
git push origin main    # or: git push gitea main — whatever you configured
```

---

## Related

- Env knobs: `PIPELINE_WEBHOOK_URL`, `WEBHOOK_SECRET`, `BASIC_AUTH_USER`, `BASIC_AUTH_PASSWORD` in `.env.example`.
- Local docs: `npm run docs:dev` from the repo root.

---

## 7. Proxmox: VM vs LXC, sizing, fast setup

### VM or container (LXC)?

| | **QEMU VM (recommended)** | **LXC** |
|---|---------------------------|--------|
| **Docker** | Works the same as on any Linux server. | Possible with `nesting=1` (and sometimes `keyctl`); more Proxmox/Docker footguns. |
| **This app** | Playwright/Firefox + Node inside Docker — predictable. | Same stack *can* work in nested Docker, but troubleshooting is harder. |
| **Overhead** | Slightly higher RAM for a full kernel. | Lower overhead per CT. |

**Choose a VM** unless you already run Docker in LXC on this cluster and know the knobs. For speed and fewer surprises: **Ubuntu 24.04 LTS cloud image**, 2–4 vCPU, 4–8 GB RAM, **≥ 40 GB** disk (Docker layers + `./data`).

**Rough sizing**

- **Light personal use:** 2 vCPU, **4 GB RAM**, 40 GB disk — often enough.
- **Comfortable (pipelines + browsers + headroom):** 4 vCPU, **6–8 GB RAM**, 64 GB disk.
- **Tight:** 2 GB RAM can work for idle UI only; **scraping/LLM runs will swap or OOM** — avoid.
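
A quick preflight check against these numbers, as a sketch for Linux guests. `mem_total_mb` and `meets_min_ram` are hypothetical helpers that read `MemTotal` from `/proc/meminfo`. Because the kernel reserves some memory, a 4 GB guest reports slightly less than 4096 MB, so the 3500 MB threshold in the usage comment is an arbitrary margin.

```bash
#!/usr/bin/env bash
# Print total RAM in MB (integer, rounded down) from a meminfo-style file.
mem_total_mb() {
  local meminfo="${1:-/proc/meminfo}"
  awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 }' "$meminfo"
}

# Succeed when total RAM is at least MIN_MB.
meets_min_ram() {
  local min_mb="$1" meminfo="${2:-/proc/meminfo}"
  (( $(mem_total_mb "$meminfo") >= min_mb ))
}

# Example:
# meets_min_ram 3500 || echo "Less than ~4 GB RAM: scraping/LLM runs may swap or OOM" >&2
```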

### Proxmox UI (once per guest)

1. **Create VM** → ISO or cloud-init image (e.g. Ubuntu 24.04).
2. **Network**: bridge (e.g. `vmbr0`) so the guest gets a LAN IP.
3. **Disk**: virtio, discard on if SSD.
4. **CPU type:** `host` if single-node and you want a tiny perf edge; `kvm64` is fine.
5. After install: **Guest agent** optional but handy for IP in Proxmox UI.

### One-shot shell setup (inside the Ubuntu VM)

Run as a user with `sudo`. Set `REPO_URL` to your Git remote (HTTPS or SSH). First build can take several minutes.

```bash
set -euo pipefail
REPO_URL="${REPO_URL:-https://github.com/YOUR_USER/Jobber.git}"  # change
APP_DIR="${APP_DIR:-$HOME/Jobber}"

sudo apt-get update
sudo apt-get install -y ca-certificates curl git

# Docker Engine + Compose plugin (official convenience script; review if you prefer manual repo install)
curl -fsSL https://get.docker.com | sudo sh
sudo usermod -aG docker "$USER"
# Log out and back in so `docker` works without sudo. Do not run `newgrp docker` here:
# it starts a new shell that does not see the unexported REPO_URL / APP_DIR set above.

git clone "$REPO_URL" "$APP_DIR"
cd "$APP_DIR"
cp .env.example .env
echo "Edit .env now (LLM keys, RXRESUME, etc.), then run: docker compose up -d --build"
```

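Then edit `.env`, log out and back in so the `docker` group membership applies, and start the stack. `APP_DIR` is not set in a fresh login shell, so the command falls back to the default path:

```bash
cd "${APP_DIR:-$HOME/Jobber}"
docker compose up -d --build
```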

Open `http://<VM-IP>:3005`. Keep backups of `$APP_DIR/data` and your `.env`.