ansible/docs/guides/monitoring-stack.md
ilia de49b34cdc
Add homelab monitoring, portfolio site, and vault tooling.
Document pve10 static IPs, monitoring stack, and site LXCs; add portfolio
to inventory; Mailcow mailbox automation; vault import/export scripts;
security audit guides and UniFi DHCP reference.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-22 16:25:07 -04:00


# Monitoring stack (LXC 218)
- **Host:** `monitoring` @ `10.0.10.22` (PVENAS pve10, VMID **218**)
- **Compose:** `/opt/monitoring/compose.yml`
- **Stacks dir (Dockge):** `/opt/stacks`

All admin UIs are **LAN-only** (no public Caddy blocks). Use Tailscale or the local network.

| Service | URL | Port | Notes |
|---------|-----|------|-------|
| **Uptime Kuma** | http://10.0.10.22:3001 | 3001 | Admin + monitors configured ✅ (replaces pve201 LXC **305** @ `.197`, stopped) |
| **Dockge** | http://10.0.10.22:5001 | 5001 | Manage compose on **this LXC only** |
| **Umami** | http://10.0.10.22:3000 | 3000 | Password changed ✅; caseware + auto tracked |
Secrets: `/opt/monitoring/.env` on the LXC (mode 600). Not in git.
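
The mode claim above can be spot-checked from a shell on the LXC. This is a minimal sketch, assuming GNU `stat` (the usual default on Debian/Ubuntu containers); `check_env_mode` is a hypothetical helper name, not something that exists in the repo.

```bash
# Hypothetical helper: succeed only if the file's octal mode is exactly 600.
check_env_mode() {
  local file="$1" mode
  mode="$(stat -c '%a' "$file")" || return 2   # GNU stat; prints e.g. 600
  if [ "$mode" = "600" ]; then
    echo "ok: $file is mode 600"
  else
    echo "warn: $file is mode $mode (expected 600)" >&2
    return 1
  fi
}
```

Usage on the LXC would look like `check_env_mode /opt/monitoring/.env`; if it warns, `chmod 600 /opt/monitoring/.env` restores the intended mode.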
---
## Backups (pve10)
| Guest | VMID | Snapshot | Date |
|-------|------|----------|------|
| identity | 217 | `backup-20260522` | 2026-05-22 |
| monitoring | 218 | `backup-20260522` | 2026-05-22 |
On pve10:
```bash
pct listsnapshot 217
pct listsnapshot 218
# Rollback if needed:
# pct rollback 217 backup-20260522
```
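Optional vzdump copy (when the NAS is healthy): `vzdump 217 218 --storage local --mode snapshot --compress zstd`. Note that `--storage local` normally writes to the node's own disk; for a true off-node copy, substitute the ID of a NAS-backed storage.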
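
Before relying on a rollback, it can help to confirm that the named snapshot is actually listed. The sketch below is an assumption-laden helper (`has_snapshot` is a hypothetical name): it reads `pct listsnapshot` output on stdin and does not depend on the exact column layout, only on the snapshot name appearing as its own whitespace-separated field.

```bash
# Hypothetical helper: read `pct listsnapshot <vmid>` output on stdin and
# succeed if any whitespace-separated field equals the snapshot name.
has_snapshot() {
  awk -v name="$1" '
    { for (i = 1; i <= NF; i++) if ($i == name) found = 1 }
    END { exit found ? 0 : 1 }
  '
}
```

On pve10 this might be used as `pct listsnapshot 218 | has_snapshot backup-20260522 && echo present`.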
---
## Uptime Kuma — monitors
Configured in UI (all green). **Remove** the Nextcloud monitor when VM 201 is retired.
| Name | URL |
|------|-----|
| Authentik | https://auth.levkin.ca |
| Cal.com | https://cal.levkin.ca |
| Caseware / Auto | marketing sites |
| Mailcow | https://mail.levkine.ca |
| Listmonk, Gitea, Vault, Todo, PVE nodes | per your dashboard |
---
## Uptime Kuma — email alerts (Mailcow)
Mail domain is **`levkine.ca`** (with **e**). Cal.com already sends via Mailcow as `cal@levkine.ca`.
### Which email to use
| Role | Address | Notes |
|------|---------|-------|
| **SMTP server** | `mail.levkine.ca` | Mailcow host |
| **SMTP port** | `587` | STARTTLS (not 465 unless you prefer SMTPS) |
| **From (sender)** | `alerts@levkine.ca` | Create mailbox in Mailcow if it does not exist |
| **To (you)** | `idobkin@gmail.com` or `ilia@levkine.ca` | Use whichever you read; Gmail is fine for alerts |
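
To check that the submission port in the table accepts STARTTLS before configuring Kuma, a rough probe with `openssl s_client` is usually enough. This is a sketch, not part of the repo: `smtp_starttls_ok` is a hypothetical helper, it only tests that a TLS session can be established, and it does not enforce certificate validity or attempt a login.

```bash
# Hypothetical helper: attempt a STARTTLS handshake against an SMTP submission port.
# Success only means the TLS session could be established; no login is attempted
# and certificate errors are not treated as failures.
smtp_starttls_ok() {
  local host="$1" port="${2:-587}"
  if openssl s_client -starttls smtp -connect "${host}:${port}" \
       -servername "$host" </dev/null >/dev/null 2>&1; then
    echo "ok: STARTTLS handshake with ${host}:${port}"
  else
    echo "fail: no STARTTLS handshake with ${host}:${port}" >&2
    return 1
  fi
}
```

For example, `smtp_starttls_ok mail.levkine.ca 587` run from the monitoring LXC shows whether Kuma will be able to reach Mailcow at all.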
### 1. Create mailbox in Mailcow (if needed)
**Automated (needs Mailcow API key):**
```bash
# Define the mailbox in group_vars/all/mailcow.yml and its password in the vault, then:
make mailcow-mailbox MAILBOX=alerts
# (alias: make mailcow-create-alerts)

# First-time setup: import secrets from .env into the vault once, then delete .env:
cp .env.example .env   # fill in MAILCOW_API_KEY=... ALERTS_PASSWORD=...
make vault-import-env
make mailcow-mailbox MAILBOX=alerts
rm .env
To add another mailbox later: edit `mailcow.yml`, add `vault_mailcow_mailbox_passwords.<name>` to the vault, then run `make mailcow-mailbox MAILBOX=<name>`.
**Manual UI:**
1. https://mail.levkine.ca → admin login
2. **Email → Mailboxes → Add** → `alerts@levkine.ca` (strong password → store in Vaultwarden)
3. Optional: alias `monitoring@levkine.ca` → same inbox
### 2. Add notification in Kuma
**Automated (from your Mac, after mailbox exists):**
```bash
cd /path/to/ansible
pip install uptime-kuma-api # or: .venv/bin/pip install uptime-kuma-api
export KUMA_URL=http://10.0.10.22:3001 KUMA_USER=admin KUMA_PASSWORD='...'
export SMTP_USER=alerts@levkine.ca SMTP_PASS='...' SMTP_TO=idobkin@gmail.com
./scripts/kuma-setup-smtp.sh
```
**Manual UI:**
1. http://10.0.10.22:3001 → **Settings** → **Notifications** → **Setup Notification**
2. Type: **Email (SMTP)**
3. Fill in:
| Field | Value |
|-------|--------|
| SMTP Host | `mail.levkine.ca` |
| SMTP Port | `587` |
| Security | TLS / STARTTLS |
| Username | `alerts@levkine.ca` |
| Password | mailbox password |
| From Email | `alerts@levkine.ca` |
| To Email | `idobkin@gmail.com` (or your `@levkine.ca`) |
4. **Test** → save
5. Edit each monitor (or default) → **Notifications** → enable this channel
**Alternative:** Mattermost webhook (`slack.levkin.ca`) if you prefer chat over email.
---
## Dockge — what to do after login
**On server today:**
| Path | Contents |
|------|----------|
| `/opt/monitoring/compose.yml` | **Live** stack (Docker project `monitoring`, 4 containers running) |
| `/opt/stacks/monitoring/compose.yaml` | Copy for Dockge (same services) |
| `/opt/stacks/authentik-ref/`, `cal-ref/` | README only — **no** compose file (ignore) |
**Why “Scan Stacks Folder” looks empty**
- Scan only picks up folders under **`/opt/stacks`** that contain `compose.yaml` / `compose.yml`.
- Your containers were started from **`/opt/monitoring`**, so Docker does not automatically link them to `/opt/stacks/monitoring` until you register that folder in Dockge.
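
The scan rule described above can be reproduced from the shell to see which folders would qualify. This is a minimal sketch of that rule only (a subfolder counts if it holds `compose.yaml` or `compose.yml`); `list_stack_candidates` is a hypothetical helper and does not talk to Dockge.

```bash
# Hypothetical helper: print each direct subfolder of the stacks directory and
# whether it holds a compose file that a scan could pick up.
list_stack_candidates() {
  local root="${1:-/opt/stacks}" dir
  for dir in "$root"/*/; do
    [ -d "$dir" ] || continue            # no subfolders: glob stays literal
    dir="${dir%/}"
    if [ -f "$dir/compose.yaml" ] || [ -f "$dir/compose.yml" ]; then
      echo "stack: ${dir##*/}"
    else
      echo "skip:  ${dir##*/} (no compose file)"
    fi
  done
}
```

Running `list_stack_candidates /opt/stacks` on the LXC should report `monitoring` as a stack and the `-ref` folders as skipped, matching the table above.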
**Fix (pick one):**
### Dockge UI note (your version)
**Settings → General** only has hostname — there is **no “Stacks directory” field**. That path is fixed at deploy time:
`DOCKGE_STACKS_DIR=/opt/stacks` (already set in `/opt/monitoring/compose.yml`).
Stacks are managed from the **home / dashboard** page, not Settings.
### Option 1 — Add stack manually (recommended)
1. http://10.0.10.22:5001 → **home** (logo / dashboard, not Settings)
2. **+ Create Stack** (or **Compose** → new stack)
3. Name: `monitoring`
4. Path: `/opt/stacks/monitoring` (must contain `compose.yaml`)
5. Open stack → review compose → **do not Start** until old project is stopped (below)
### Option 2 — Scan from dashboard menu
1. Stay on **dashboard** (not Settings)
2. Top-right **⋮** → **Scan Stacks Folder**
3. Pick **`monitoring`** if it appears (`authentik-ref` / `cal-ref` have no compose — ignore)
**Avoid duplicate containers**
Before starting from Dockge:
```bash
ssh root@10.0.10.22
cd /opt/monitoring && docker compose down
# Then start from the Dockge UI on stack monitoring, OR from the shell
# (needs a .env in this folder; the live one is /opt/monitoring/.env):
cd /opt/stacks/monitoring && docker compose --env-file .env up -d
```
Until you do that, Kuma/Dockge/Umami keep running from `/opt/monitoring`; Dockge is optional for edits until cutover.
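
To see which folder the running containers currently belong to, the labels that Docker Compose attaches to each container can be queried. A small sketch, assuming Compose v2 labels (`com.docker.compose.project` and `com.docker.compose.project.working_dir`); `compose_origin` is a hypothetical helper name.

```bash
# Hypothetical helper: list running containers of a Compose project together
# with the directory the project was started from.
compose_origin() {
  local project="${1:-monitoring}"
  docker ps \
    --filter "label=com.docker.compose.project=${project}" \
    --format '{{.Names}}\t{{.Label "com.docker.compose.project.working_dir"}}'
}
```

Before cutover, `compose_origin monitoring` would be expected to show `/opt/monitoring` in the second column; after starting from Dockge it should show `/opt/stacks/monitoring`.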
### Optional reference stacks (read-only)
Create empty stacks under `/opt/stacks/` only if you want a UI placeholder:
```bash
ssh root@10.0.10.22
mkdir -p /opt/stacks/authentik /opt/stacks/cal
# Copy compose for reference (does NOT control remote host):
scp root@10.0.10.21:/opt/authentik/compose.yml /opt/stacks/authentik/
```
To **manage** Authentik or Cal from Dockge long term, either move compose to 218 (not recommended) or install Dockge on each LXC later.
### Retire Portainer
When you are comfortable with Dockge, stop VM **109** (portainer) on pve10 and use Dockge on 218 instead.
---
## Umami
- ✅ Running at http://10.0.10.22:3000 (LAN / Tailscale only)
- **Public tracking** via `https://stats.levkin.ca/script.js` on caseware, auto, and **iliadobkin.com** (portfolio LXC 219)

**Three options (choose A or B; C is the follow-up to B; none block the sites):**
| Option | Effort | Notes |
|--------|--------|--------|
| **A — Skip public analytics** | 0 | Use Umami dashboard on `:3000` when you care; no DNS/Caddy |
| **B — One DNS + Caddy block** | ~10 min | A record → home IP + Caddy `reverse_proxy 10.0.10.22:3000` on caddy VM |
| **C — Re-add script tags** | 2 min | After B works, insert script before `</head>` on 215/216 |
**Suggested public hostname (instead of `analytics`):** `stats.levkin.ca` (short, clear). Alternatives: `umami.levkin.ca`, `metrics.levkin.ca`.
```caddy
stats.levkin.ca {
import security-headers
encode gzip
reverse_proxy 10.0.10.22:3000
}
```
Script tag then: `https://stats.levkin.ca/script.js`
We are **not stuck** — marketing sites do not need Umami to render. Option A is fine for now.
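
If option B is in place, a quick check that the tracker script is reachable through Caddy might look like the sketch below. Both function names are hypothetical, and the check only looks at the HTTP status code, not at the script's contents.

```bash
# Hypothetical helper: print the HTTP status returned for a URL
# (curl prints 000 when no response was received).
http_status() {
  curl -sS -o /dev/null -m 10 -w '%{http_code}' "$1" 2>/dev/null || true
}

# Succeeds when the tracker script is served with HTTP 200.
tracker_script_ok() {
  [ "$(http_status "$1")" = "200" ]
}
```

For example: `tracker_script_ok https://stats.levkin.ca/script.js && echo "tracker reachable"`.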
---
## Maintenance
```bash
ssh root@10.0.10.22
cd /opt/monitoring
docker compose --env-file .env pull
docker compose --env-file .env up -d
docker compose ps
```