# Deploy on a VM, run the pipeline on a schedule, notify Telegram
End-to-end checklist:
1. Push this repo to your Git host (Gitea, GitHub, etc.).
2. On the server: install Docker + Compose v2, clone the repo, copy `.env.example` to `.env`, run `docker compose up -d`.
3. Confirm the UI/API on the mapped host port (default **3005** → container **3001**).
4. Add a cron job that `POST`s `/api/pipeline/run` (see §3).
5. Optional: Telegram via `scripts/jobber-pipeline-telegram.sh`, pipeline webhook relay, or n8n (see §4).
---
## Git remote (example)
Replace host, user/org, and repo name with yours:
```bash
git remote add gitea git@GITEA_HOST:YOUR_USER/Jobber.git
# or: git remote set-url gitea ...
git push -u gitea main
```
If you have uncommitted changes:
```bash
git add -A && git commit -m "Your message" && git push gitea main
```
---
## 1. Deploy on a Linux VM (bare metal or cloud)
1. Install **Docker** and **Docker Compose** (plugin v2).
2. Clone from your Git server (SSH or HTTPS):
```bash
git clone git@GITEA_HOST:YOUR_USER/Jobber.git
cd Jobber
```
3. Environment:
```bash
cp .env.example .env
# Edit .env: MODEL / LLM keys, RXRESUME_*, search settings, etc.
```
4. Start:
```bash
docker compose up -d
```
5. Open the UI: `http://<VM-IP>:3005` (port mapped in `docker-compose.yml`).
6. Persist data: compose mounts `./data` — back up that directory.
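Before opening the browser, it can help to confirm from the VM itself that the published port answers. The following is a minimal sketch, assuming the UI returns a successful response on `GET /` at `http://127.0.0.1:3005`; `wait_for_http` is a hypothetical helper name, not something shipped in the repo.

```bash
#!/usr/bin/env bash
# wait_for_http URL [TRIES] [DELAY_SECONDS]
# Polls URL until curl gets a successful (non-error) HTTP response.
# Hypothetical helper: not part of the Jobber repo.
wait_for_http() {
  local url=$1 tries=${2:-30} delay=${3:-2} i
  for ((i = 1; i <= tries; i++)); do
    # -f: treat HTTP >= 400 as failure; --max-time: do not hang on a dead port.
    if curl -fsS -o /dev/null --max-time 5 "$url"; then
      echo "up after ${i} attempt(s): ${url}"
      return 0
    fi
    sleep "$delay"
  done
  echo "no response from ${url} after ${tries} attempts" >&2
  return 1
}

# Usage (after `docker compose up -d`):
#   wait_for_http http://127.0.0.1:3005
```

The first start can be slow while the image builds or the app initializes, so a retry loop tends to be more forgiving than a single `curl` call.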
---
## 2. Deploy as a container (same image, any host)
Same as the VM path: only Docker is required.
- Ensure port **3005** (or your chosen host port) is reachable if you use the UI from another machine.
- If only cron or other local API clients need access, bind to `127.0.0.1:3005` by changing the `ports:` line in `docker-compose.yml` (e.g. `"127.0.0.1:3005:3001"`).
Inside the container the app listens on **3001**; the default host map is **3005 → 3001**.
**Cron on the host** should call the API through the published host port:
- Browser (UI): `http://127.0.0.1:3005`
- **API**: same origin; `/api/...` is served by the same app.
If the port is not published and the app is reachable only on a Docker network, call it from another container by container name on port `3001`. Otherwise publish `3005` on the host and use `127.0.0.1:3005` from cron.
---
## 3. Run the pipeline on a schedule (cron)
`POST /api/pipeline/run` **starts** the pipeline in the **background** and returns quickly (`{ ok: true, data: { message: "Pipeline started" } }`). That is enough for scheduling.
Example crontab (host timezone — adjust hours):
```cron
# 08:00, 14:00, 20:00 daily
0 8,14,20 * * * /usr/local/bin/jobops-pipeline-run.sh >> /var/log/jobops-pipeline.log 2>&1
```
Create `/usr/local/bin/jobops-pipeline-run.sh`:
```bash
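#!/usr/bin/env bash
set -euo pipefail
BASE_URL="${JOBOPS_URL:-http://127.0.0.1:3005}"
# Optional Basic auth arguments; the array stays empty unless the line below is uncommented.
AUTH=()
# If BASIC_AUTH_USER / BASIC_AUTH_PASSWORD are set in .env, export them for cron and uncomment:
# AUTH=(-u "${BASIC_AUTH_USER:?}:${BASIC_AUTH_PASSWORD:?}")
# Output goes to stdout; the crontab line above appends it to /var/log/jobops-pipeline.log.
# The ${AUTH[@]+...} form expands to nothing when AUTH is empty (safe under `set -u`).
curl -sS -X POST "${BASE_URL}/api/pipeline/run" \
  -H "Content-Type: application/json" \
  -d '{}' \
  ${AUTH[@]+"${AUTH[@]}"}
# Terminate the JSON response with a newline so log entries do not run together.
echo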
```
```bash
sudo chmod +x /usr/local/bin/jobops-pipeline-run.sh
```
Set `JOBOPS_URL` in root's crontab or `/etc/environment` if the app is on another host.
**Basic auth:** When `BASIC_AUTH_USER` and `BASIC_AUTH_PASSWORD` are in `.env`, non-GET API calls need Basic auth — use `curl -u user:pass` as above.
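Because the endpoint only acknowledges that a run has started, a cron wrapper can at least verify that acknowledgement. A rough sketch, assuming the response body is JSON with a top-level `ok` flag as described in this section; `pipeline_started` is a hypothetical helper, and the `grep` match is a deliberately crude check that avoids a `jq` dependency.

```bash
#!/usr/bin/env bash
# pipeline_started: reads the POST /api/pipeline/run response on stdin and
# succeeds only if it contains "ok": true. Hypothetical helper; crude match.
pipeline_started() {
  grep -Eq '"ok"[[:space:]]*:[[:space:]]*true'
}

# Usage in a cron wrapper (BASE_URL as in the script above):
#   if curl -sS -X POST "${BASE_URL}/api/pipeline/run" \
#        -H "Content-Type: application/json" -d '{}' | pipeline_started; then
#     echo "$(date -Is) pipeline started"
#   else
#     echo "$(date -Is) pipeline did NOT start" >&2
#   fi
```

This confirms only that the run was accepted; whether it later completes or fails has to come from the webhook or status polling described in §4.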
---
## 4. Telegram notifications
The app does **not** send Telegram messages by itself. Practical options:
### Option A — Pipeline webhook (recommended)
1. In the app, under **Settings → Webhooks** (or via the env vars `PIPELINE_WEBHOOK_URL` / `WEBHOOK_SECRET`), set a URL that receives JSON when a run **completes or fails**.
2. Point that URL to a small relay that maps the JSON to Telegram `sendMessage`.
Telegram HTTP API:
```text
https://api.telegram.org/bot<BOT_TOKEN>/sendMessage
```
Body (JSON):
```json
{
"chat_id": "<YOUR_CHAT_ID>",
"text": "Pipeline finished: ..."
}
```
Host the relay on the same VM (Flask/FastAPI/Node, or n8n). Keep **bot token** and **chat id** in environment variables.
Payload shape (sanitized) includes fields like `event`, `pipelineRunId`, `jobsDiscovered`, `jobsProcessed`, `error` — see `orchestrator/src/server/pipeline/steps/notify-webhook.ts`.
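The mapping step inside such a relay is small. The sketch below uses `jq` to turn a webhook payload (read from stdin) into a `sendMessage` body. It assumes the field names listed above, treats every field as optional, and uses a hypothetical function name and message wording; the authoritative payload shape is in `notify-webhook.ts`.

```bash
#!/usr/bin/env bash
# payload_to_telegram_body CHAT_ID
# stdin: pipeline webhook JSON; stdout: JSON body for Telegram sendMessage.
# Hypothetical helper; field names are taken from the list above.
payload_to_telegram_body() {
  local chat_id=$1
  jq -c --arg chat_id "$chat_id" '{
    chat_id: $chat_id,
    text: (
      "Jobber pipeline: \(.event // "unknown event")\n"
      + "run: \(.pipelineRunId // "n/a")\n"
      + "discovered: \(.jobsDiscovered // 0), processed: \(.jobsProcessed // 0)"
      + (if .error then "\nerror: \(.error)" else "" end)
    )
  }'
}

# Usage (payload.json is a saved webhook body):
#   payload_to_telegram_body "$TELEGRAM_CHAT_ID" < payload.json |
#     curl -sS -X POST "https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/sendMessage" \
#       -H "Content-Type: application/json" -d @-
```

Building the body with `jq --arg` rather than string concatenation keeps quotes and newlines in error messages from breaking the JSON.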
### Option B — Shipped script: run pipeline + Telegram summary (cron-friendly)
The repo includes `scripts/jobber-pipeline-telegram.sh`: it `POST`s `/api/pipeline/run`, polls `GET /api/pipeline/status` until the run ends, then sends one Telegram message with **status**, **jobsDiscovered**, and **jobsProcessed** (and **errorMessage** if the run failed).
**1. Dependencies on the host** (LXC/VM that runs cron):
```bash
apt-get update && apt-get install -y jq curl
```
**2. Install script and secrets** (after `git pull` in `/opt/Jobber` or your clone path):
```bash
install -m 755 /opt/Jobber/scripts/jobber-pipeline-telegram.sh /usr/local/bin/jobber-pipeline-telegram.sh
cp /opt/Jobber/scripts/jobber-cron.env.example /root/.jobber-cron.env
chmod 600 /root/.jobber-cron.env
nano /root/.jobber-cron.env
```
Fill **`TELEGRAM_BOT_TOKEN`** (from @BotFather) and **`TELEGRAM_CHAT_ID`**. For a **private** chat with your bot, use `message.chat.id` from `getUpdates` (same as your Telegram user id in the JSON). **`JOBOPS_URL`** defaults to `http://127.0.0.1:3005` when Jobber runs on the same host.
**3. Manual test** (before cron):
```bash
/usr/local/bin/jobber-pipeline-telegram.sh
```
You should get one Telegram message when the pipeline finishes. Optional log: append `>> /var/log/jobber-pipeline.log 2>&1` on the cron line.
**4. Cron** (example: 08:00, 14:00, 20:00 host local time — `crontab -e`):
```cron
0 8,14,20 * * * /usr/local/bin/jobber-pipeline-telegram.sh >> /var/log/jobber-pipeline.log 2>&1
```
**Security:** Never commit `/root/.jobber-cron.env` or paste bot tokens in Git. Revoke the token in BotFather if it was exposed.
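For readers who want to see the shape of the wait-for-finish logic without opening the script, here is a generic polling sketch. It is not the shipped script: the terminal state names (`completed`, `failed`) and the `poll_until_done` helper are assumptions, and the status-fetching command is passed in so it can be adapted to whatever `GET /api/pipeline/status` actually returns.

```bash
#!/usr/bin/env bash
# poll_until_done FETCH_CMD [MAX_POLLS] [DELAY_SECONDS]
# Calls FETCH_CMD (a command or function that prints the current state) until
# it prints a terminal state, then echoes that state. Returns 1 on timeout.
# ASSUMPTION: "completed" and "failed" are the terminal state names.
poll_until_done() {
  local fetch_status=$1 max_polls=${2:-120} delay=${3:-30} status i
  for ((i = 0; i < max_polls; i++)); do
    # A failed fetch (e.g. app restarting) is treated as "not finished yet".
    status=$("$fetch_status") || status=""
    case "$status" in
      completed|failed)
        printf '%s\n' "$status"
        return 0
        ;;
    esac
    sleep "$delay"
  done
  echo "timed out after ${max_polls} polls" >&2
  return 1
}

# Usage sketch; the jq path below is a guess at the status response shape:
#   fetch() {
#     curl -sS "${JOBOPS_URL:-http://127.0.0.1:3005}/api/pipeline/status" |
#       jq -r '.data.status // empty'
#   }
#   final=$(poll_until_done fetch 120 30) && echo "run ended: $final"
```

A bounded number of polls keeps a hung run from leaving cron jobs piling up behind it.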
### Option B2 — Minimal curl-only (no wait-for-finish)
If you only want to **trigger** the pipeline from cron without this script, use §3. For a one-off Telegram without polling:
```bash
TELEGRAM_BOT_TOKEN="123456:ABC..."
CHAT_ID="your_numeric_chat_id"
MSG="Pipeline finished. Check dashboard."
# Build the JSON body with jq so quotes and newlines in MSG are escaped correctly.
curl -sS -X POST "https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/sendMessage" \
  -H "Content-Type: application/json" \
  -d "$(jq -n --arg chat_id "$CHAT_ID" --arg text "$MSG" '{chat_id: $chat_id, text: $text}')"
```
**chat_id:** Message your bot, then open `https://api.telegram.org/bot<TOKEN>/getUpdates` and read `message.chat.id` (if `result` is empty, send **Start** to the bot first, or call `deleteWebhook` if a webhook was set).
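Reading the id out of the raw `getUpdates` JSON by eye is error-prone, so a small `jq` filter can help. This is a sketch assuming the standard Bot API response layout (`result[].message.chat.id`); `chat_ids_from_updates` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# chat_ids_from_updates: reads a getUpdates response on stdin and prints each
# distinct message.chat.id, one per line. Updates without a "message" field
# (for example edited messages) are skipped.
chat_ids_from_updates() {
  jq -r '[.result[]? | .message.chat.id // empty] | unique | .[]'
}

# Usage:
#   curl -sS "https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/getUpdates" |
#     chat_ids_from_updates
```

If nothing is printed, the `result` array is empty or contains no plain messages, which matches the troubleshooting hints above.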
### Option C — External automation
Use n8n, Grafana OnCall, or similar: schedule → `POST /api/pipeline/run` → wait/poll → Telegram node.
---
## 5. Security notes
- Do not commit `.env` or Telegram tokens.
- Prefer Basic Auth if the instance is reachable from the internet.
- Restrict firewall so only your IP (or VPN) can reach the published port if exposed.
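The first point can be checked mechanically before a push. A minimal sketch, assuming it is run from inside the clone; `env_file_is_ignored` is a hypothetical helper that relies only on `git ls-files` and `git check-ignore`.

```bash
#!/usr/bin/env bash
# env_file_is_ignored [FILE]
# Succeeds only if FILE (default: .env) is NOT tracked by Git and IS matched
# by an ignore rule. Run from inside the repository.
env_file_is_ignored() {
  local file=${1:-.env}
  # An ignore rule does not protect a file that was already committed.
  if git ls-files --error-unmatch -- "$file" >/dev/null 2>&1; then
    echo "WARNING: $file is tracked by Git" >&2
    return 1
  fi
  # Exit status 0 means at least one ignore rule matches the path.
  git check-ignore -q -- "$file"
}

# Usage:
#   cd /opt/Jobber && env_file_is_ignored .env || echo "fix .gitignore before pushing"
```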
---
## 6. Git remotes (reference)
```bash
git remote -v
git push origin main # or: git push gitea main — whatever you configured
```
---
## Related
- Env knobs: `PIPELINE_WEBHOOK_URL`, `WEBHOOK_SECRET`, `BASIC_AUTH_USER`, `BASIC_AUTH_PASSWORD` in `.env.example`.
- Local docs: `npm run docs:dev` from the repo root.
---
## 7. Proxmox: VM vs LXC, sizing, fast setup
### VM or container (LXC)?
| | **QEMU VM (recommended)** | **LXC** |
|---|---------------------------|--------|
| **Docker** | Works the same as on any Linux server. | Possible with `nesting=1` (and sometimes `keyctl`); more Proxmox/Docker footguns. |
| **This app** | Playwright/Firefox + Node inside Docker — predictable. | Same stack *can* work in nested Docker, but troubleshooting is harder. |
| **Overhead** | Slightly higher RAM for a full kernel. | Lower overhead per CT. |
**Choose a VM** unless you already run Docker in LXC on this cluster and know the knobs. For speed and fewer surprises: **Ubuntu 24.04 LTS cloud image**, 2 to 4 vCPU, 4 to 8 GB RAM, **≥ 40 GB** disk (Docker layers + `./data`).
**Rough sizing**
- **Light personal use:** 2 vCPU, **4 GB RAM**, 40 GB disk — often enough.
- **Comfortable (pipelines + browsers + headroom):** 4 vCPU, **6 to 8 GB RAM**, 64 GB disk.
- **Tight:** 2 GB RAM can work for idle UI only; **scraping/LLM runs will swap or OOM** — avoid.
### Proxmox UI (once per guest)
1. **Create VM** → ISO or cloud-init image (e.g. Ubuntu 24.04).
2. **Network**: bridge (e.g. `vmbr0`) so the guest gets a LAN IP.
3. **Disk**: virtio, discard on if SSD.
4. **CPU type:** `host` if single-node and you want a tiny perf edge; `kvm64` is fine.
5. After install: **Guest agent** optional but handy for IP in Proxmox UI.
### One-shot shell setup (inside the Ubuntu VM)
Run as a user with `sudo`. Set `REPO_URL` to your Git remote (HTTPS or SSH). First build can take several minutes.
```bash
set -euo pipefail
REPO_URL="${REPO_URL:-https://github.com/YOUR_USER/Jobber.git}" # change
APP_DIR="${APP_DIR:-$HOME/Jobber}"
sudo apt-get update
sudo apt-get install -y ca-certificates curl git
# Docker Engine + Compose plugin (official convenience script; review if you prefer manual repo install)
curl -fsSL https://get.docker.com | sudo sh
sudo usermod -aG docker "$USER"
# Group membership takes effect at the next login: log out and back in before running `docker` without sudo.
# (`newgrp docker` also works interactively, but it starts a new shell, so it is left out of this script.)
git clone "$REPO_URL" "$APP_DIR"
cd "$APP_DIR"
cp .env.example .env
echo "Edit .env now (LLM keys, RXRESUME, etc.), then run: docker compose up -d --build"
```
Then edit `.env`, then:
```bash
cd "$APP_DIR"
docker compose up -d --build
```
Open `http://<VM-IP>:3005`. Persist backups of `$APP_DIR/data` and your `.env`.
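The backup itself can be a single `tar` call. The sketch below is one possible approach, not a shipped script: `backup_jobber` and the archive naming are hypothetical, and it assumes `data/` and `.env` both exist directly under the app directory.

```bash
#!/usr/bin/env bash
# backup_jobber APP_DIR DEST_DIR
# Writes DEST_DIR/jobber-backup-<timestamp>.tar.gz containing data/ and .env,
# then prints the archive path. Hypothetical helper, not part of the repo.
backup_jobber() {
  local app_dir=$1 dest_dir=$2 archive
  archive="${dest_dir}/jobber-backup-$(date +%Y%m%d-%H%M%S).tar.gz"
  mkdir -p "$dest_dir"
  # -C keeps paths inside the archive relative to the app directory.
  tar -czf "$archive" -C "$app_dir" data .env
  # The archive contains the secrets from .env, so restrict it to the owner.
  chmod 600 "$archive"
  printf '%s\n' "$archive"
}

# Usage:
#   backup_jobber "$HOME/Jobber" /var/backups/jobber
```

If the app is writing to `./data` while the archive is created, the copy may be inconsistent; stopping the stack with `docker compose stop` first is the cautious choice.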