Levkin self-hosted stack — plan & decisions
Reference doc for the Proxmox homelab. Lives alongside the Cursor project that has the Proxmox info.
Conventions:
- All groups run inside an LXC unless marked VM.
- Inside each LXC: one `docker-compose.yml`, managed by Dockge where applicable.
- Caddy on the `edge` LXC is the only thing exposed to the internet.
- Authentik on the `identity` LXC is the source of truth for who you are.
- Vaultwarden stays standalone (it's the break-glass path if Authentik dies).
Progress summary (updated 2026-05-23)
| Area | Status |
|---|---|
| Phase 0 Foundation | ✅ Mostly done — static IPs on pve10 LXCs; Caddy still on VM 106 |
| Phase 1 Identity (Authentik) | ✅ LXC 217 @ 10.0.10.21 |
| Phase 2 Monitoring (Kuma, Dockge, Umami) | ✅ LXC 218 @ 10.0.10.22 |
| Phase 3 Cal.com | ✅ LXC 210 — OIDC + auto site button still open |
| Phase 4 SSO migration | ⏳ Not started (Cal → Authentik first) |
| Phase 5–8 Immich, Crater, Outline, etc. | ⏳ Deferred |
| Site consolidation | ⏳ Partial — levkin.ca on LXC 220 @ 10.0.10.60 ✅; caseware/auto/portfolio on 215/216/219 (site-lxc-git.md); moving all static to Caddy VM is optional later |
| dev-apps (punim/pote/mirrormatch) | ⏳ Not started — punimTag 9101 still on pve201 (active testing; do not migrate yet) |
| Nextcloud retire | ⏳ VM 201 is running again on pve10 — finish decommission |
| Portainer retire | ⏳ VM 109 still running (16 GB maxmem) on pve10 — stop after Dockge confirmed |
Capacity headroom (live check 2026-05-23)
Use this before adding LXCs/VMs. Re-check with pvesm status and free -h on each node.
pve10 (PVENAS) — primary place for new homelab services
| Resource | Total | Used | Available | Notes |
|---|---|---|---|---|
| local-lvm (thin) | ~1.67 TiB | ~22% | ~1.30 TiB | Plenty of disk for new LXCs |
| RAM (host) | 62 GiB | ~44 GiB | ~17 GiB | Enough for 2–3 small LXCs (2 GB each) as-is |
Realistic new capacity on pve10 (without stopping anything): ~4–6 GiB RAM + 100–200 GiB disk for one productivity/media LXC (Outline, Mealie, Immich-lite).
If you free RAM first (recommended):
| Stop / retire | Frees (maxmem) |
|---|---|
| Portainer VM 109 | 16 GiB |
| Nextcloud VM 201 | 8 GiB |
| Hermes VM 117 (if not needed) | 16 GiB |
| Site LXCs 215/216 → Caddy static (future) | ~1 GiB |
After Portainer + Nextcloud off: ~41 GiB effective headroom on pve10 — room for Immich, Crater, Beszel, or a dev-apps LXC (6–8 GiB).
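The ~41 GiB figure is the RAM available today plus the maxmem of the two guests being retired. A minimal sketch of that arithmetic, using the numbers from the tables above (the variable names are illustrative only):

```shell
# Headroom arithmetic for pve10, using the figures from the tables above.
# All values are GiB; variable names are illustrative, not from any tool.
available_now=17      # RAM available on pve10 today
portainer_maxmem=16   # VM 109
nextcloud_maxmem=8    # VM 201

freed=$((portainer_maxmem + nextcloud_maxmem))
headroom=$((available_now + freed))

echo "freed by retiring 109 + 201: ${freed} GiB"
echo "effective headroom afterwards: ${headroom} GiB"
```

Since maxmem is a ceiling rather than measured usage, treat the result as an upper bound and confirm with `free -h` after the guests are stopped.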
pve201 (pve) — do not add new services
| Resource | Total | Used | Available | Notes |
|---|---|---|---|---|
| local-lvm | ~1.67 TiB | ~46% | ~922 GiB | Disk OK |
| RAM | 125 GiB | ~122 GiB | ~3 GiB | Saturated; GPU VM 104 (73 GB), punimTag 9101 (16 GB) |
Verdict: New stacks belong on pve10. pve201 only benefits from stopping/migrating guests (punim after testing, GPU resize, old Kuma already stopped).
Current state (May 2026)
Already running:
- Caddy reverse proxy: currently on a VM (should migrate to LXC, see "Caddy migration" section)
- Mailcow: VM, mail domain is `levkine.ca` (with e)
- Vaultwarden, Vikunja, n8n, Listmonk, Mattermost, Nextcloud: across various LXCs
- Cal.com: LXC id `210`, `cal.levkin.ca`, Postgres included, admin user `ilia`, 15-min consult event live at `cal.levkin.ca/ilia/consult` with Jitsi link
- Caddy entries live for: `levkin.ca`, `caseware.levkin.ca`, `auto.levkin.ca`, `iliadobkin.com`, `cal.levkin.ca`, `listmonk.levkin.ca`, `pdf.levkin.ca`, `search.levkin.ca`, `auth.levkin.ca`, `stats.levkin.ca`
- Authentik: LXC 217 @ `10.0.10.21`, `https://auth.levkin.ca`, admin + TOTP enrolled
- Monitoring (LXC 218 @ `10.0.10.22`): Uptime Kuma `:3001`, Dockge `:5001`, Umami `:3000` (LAN-only); see monitoring-stack.md
- Umami + Authentik admin/TOTP/backup codes: done
- Uptime Kuma: monitors live; email alerts via Mailcow (see monitoring-stack.md)
- Dockge on 218: manages local `/opt/monitoring` stack
- Snapshots `backup-20260522` on LXCs 217, 218
- Jellyfin (VM 101): stopped
- LXC 210, 215–218, 219: static via `pct set`; Caddy VM 106: static in-guest `.50`
- Nextcloud VM 201: export done; VM still running on pve10; retire next (frees 8 GB RAM)
- Portainer VM 109: still running on pve10 (16 GB); retire, since Dockge on 218 replaces it
- Marketing sites: LXC 220 (`levkin.ca`), 215/216/219 (git deploy), not yet on Caddy VM static roots
- punimTag dev: pve201 LXC 9101 @ `10.0.10.121` (16 GB); leave until testing is done, then move to `dev-apps` on pve10
Decisions locked in:
- Container manager: Dockge (not Portainer, not Coolify/Dokploy/CapRover)
- Chat: Mattermost only — no Matrix/Synapse
- Knowledge tool: Outline for client-facing, SiYuan if/when PhD work picks up (don't run Affine + Trilium too)
- Bookmark manager: Linkwarden (full-page archive is the killer feature)
- Authentik is the SSO target; Vaultwarden stays standalone
LXC / VM grouping table
| Group | What's inside | Why grouped | LXC or VM |
|---|---|---|---|
| edge | Caddy reverse proxy, Crowdsec/Fail2ban | The front door — small, stable, restart rarely | LXC, 1 vCPU, 1GB RAM |
| identity | Authentik (+ Postgres + Redis), Vaultwarden | Auth-critical — touch rarely, back up religiously | LXC, 2 vCPU, 2GB RAM |
| comms | Mailcow | Mailcow's compose is huge (15+ containers) and self-contained — wants its own host | VM, 4GB RAM |
| automation | n8n, Windmill (later), Huginn (later) | Active workloads, frequent updates, you'll touch these a lot | LXC, 2–4 vCPU, 4GB RAM |
| productivity | Vikunja, Listmonk, Outline, Mealie, Linkwarden | Personal/team productivity, low-resource | LXC, 2 vCPU, 4GB RAM |
| media | Immich, Nextcloud, Paperless-ngx | Large storage, GPU passthrough useful for Immich ML | VM if GPU passthrough, else LXC. Lots of disk. |
| business | Cal.com ✅, Crater | Client-facing, financial — back up often | LXC, 2 vCPU, 2GB RAM |
| monitoring | Uptime Kuma ✅, Dockge ✅, Umami ✅, Beszel (later) | Ops stack on LXC 218 | LXC, 2 vCPU, 2GB RAM |
| labs | Anything experimental — Flowise, Trigger.dev | Things you're trying out, can be wiped | LXC, scratch space |
Why this grouping (cheat sheet)
- One service goes bad → only its group restarts.
- Need a kernel upgrade for one stack → snapshot the LXC, upgrade, roll back if broken.
- Mailcow's huge surface area is isolated in its own VM.
- Edge LXC is tiny and stable → perfect for the layer everything depends on.
- Backup cadence per group (see Backups section).
- Resource limits per LXC mean a runaway container can't eat n8n's RAM.
Subdomains
Only expose what actually needs to be public. Internal services use Tailscale/Wireguard for remote access.
Expose publicly
| Subdomain | Service | Group | Why public | Status |
|---|---|---|---|---|
| `levkin.ca` | Company site (spec + /folders) | edge | Main brand | ✅ LXC 220; DNS must point to home IP (was parked elsewhere) |
| `caseware.levkin.ca` | Static site | edge | Marketing | ✅ live |
| `auto.levkin.ca` | Static site | edge | Marketing | ✅ live |
| `iliadobkin.com` | Portfolio (SDET) | edge | Personal site | ✅ live (pve10 LXC 219) |
| `cal.levkin.ca` | Cal.com | business | Clients book on it | ✅ live |
| `listmonk.levkin.ca` | Listmonk | productivity | Unsubscribe URLs must resolve | ✅ live |
| `mail.levkine.ca` | Mailcow | comms | Mail server | ✅ live |
| `auth.levkin.ca` | Authentik | identity | OIDC redirect URLs need external resolution | ✅ live |
| `bill.levkin.ca` | Crater | business | Clients view invoices | ⏳ Phase 6 |
| `cloud.levkin.ca` | Nextcloud | media | Retiring; decommission VM 201 after cutover | 🗑️ |
| `photos.levkin.ca` | Immich | media | Mobile apps need public hostname | ⏳ Phase 5 |
| `vault.levkin.ca` | Vaultwarden | identity | Mobile clients need public hostname | ⏳ |
| `notes.levkin.ca` | Outline | productivity | Sharing docs with clients | ⏳ |
| `chat.levkin.ca` | Mattermost | comms | Only if inviting outside users | ⏳ optional |
Keep internal only (no public DNS, no Caddy block)
Reachable only via local network or Tailscale/Wireguard:
| Service | Reason |
|---|---|
| Umami admin UI | Only you need the dashboard. Tracking endpoint can be public, dashboard isn't. |
| Uptime Kuma | Status dashboard is for you. Don't advertise infrastructure. |
| Beszel | Metrics are admin-only. |
| Dockge | Admin UI — local only. |
| n8n editor | UI shouldn't be exposed. Webhooks go on hooks.levkin.ca if needed. |
| Huginn / Windmill / Flowise | Admin tools. |
| Vikunja | Personal task manager. |
| Mealie | Family recipes. |
| Trigger.dev | Internal automation. |
| Paperless-ngx | Personal documents. Never expose. |
| SiYuan | Personal knowledge. |
| Linkwarden | Personal bookmarks. |
Borderline (decide per service)
| Subdomain | Service | Notes |
|---|---|---|
| `stats.levkin.ca` | Umami collector | Only the tracking script endpoint needs to be public; admin UI stays internal |
| `status.levkin.ca` | Uptime Kuma | Kuma supports a separate public status page URL; that one can be public |
Phased rollout
Phase 0 — Foundation
- ✅ Caddy running (on VM; migrate to LXC in Phase 1.5)
- ✅ Static IP audit (partial): all LXCs on pve10 pinned; Caddy VM static `.50`; remaining VMs on stable DHCP (see host-list.md)
- ✅ DNS for `auth.levkin.ca` → home IP (verified 2026-05-22)
- ✅ `identity` LXC 217 @ `10.0.10.21` (2 vCPU, 2GB RAM, 20GB `local-lvm`, Debian 12 + Docker Compose)
Phase 1 — Identity ✅
- ✅ Deploy Authentik in `identity` LXC (Authentik + Postgres + Redis, official compose at `/opt/authentik`)
- ✅ Caddy: `auth.levkin.ca` → `10.0.10.21:9000` (simple passthrough, no forward-auth)
- ✅ Admin user (`admin`), TOTP enrolled
- ✅ `authentik Admins` group (skip custom `users` group until more accounts)
- ✅ Static backup codes; don't OIDC other apps until Cal.com test
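For reference, the passthrough described above could look roughly like the site block below. This is a sketch rather than the live configuration: it writes the block to a scratch file (the path is an assumption) so it can be reviewed before being merged into the real `Caddyfile`.

```shell
# Sketch: write the auth.levkin.ca site block to a scratch file for review.
# The output path is an assumption; merge it into the real Caddyfile by hand.
snippet="${TMPDIR:-/tmp}/auth.levkin.ca.caddy"

cat > "$snippet" <<'EOF'
auth.levkin.ca {
    # Plain passthrough to Authentik on the identity LXC.
    # Deliberately no forward_auth here: Authentik must not sit behind itself.
    reverse_proxy 10.0.10.21:9000
}
EOF

cat "$snippet"
```

After merging, `caddy validate --config /etc/caddy/Caddyfile` is a cheap check before reloading the service (the config path assumes the Debian package default).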
Phase 1.5 — Caddy migration to LXC (~30 min)
Why now (after Phase 1, before bulk SSO work in Phase 4): Authentik is stable enough to absorb a small change, but you haven't yet built the dependency web of OIDC integrations that would make a Caddy reload risky.
Why Caddy belongs in an LXC, not a VM:
- ~50MB OS overhead vs ~512MB for a VM
- Boot/restart in 2-5s vs 20-40s (matters when reloading config)
- Snapshot/backup is faster
- Caddy is a Go binary doing reverse-proxy work — no need for kernel isolation
- Near-native network performance
Steps:
- Create `edge` LXC: Debian 12, 1 vCPU, 512MB RAM, 8GB disk, static IP from host list
- Install Caddy via official Debian repo:

      apt install -y debian-keyring debian-archive-keyring apt-transport-https curl
      curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/gpg.key' | gpg --dearmor -o /usr/share/keyrings/caddy-stable-archive-keyring.gpg
      curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/debian.deb.txt' | tee /etc/apt/sources.list.d/caddy-stable.list
      apt update && apt install caddy

- Copy `Caddyfile` + custom snippets (`(security-headers)` etc.) from the VM
- Add a test subdomain (e.g. `test.levkin.ca`) pointing at the new LXC, then verify that a TLS certificate is issued and routing works
- Cut over: update router port-forward (80/443) to the new LXC IP. DNS A records don't need to change if they point to your home IP.
- Watch Mailcow, Cal.com, Listmonk, the marketing sites for ~24h
- Keep the old VM snapshot for a week, then delete
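The first step above could be scripted roughly as follows. This is a dry-run sketch that only prints the command. The VMID `221` and the template path are placeholders (assumptions); the sizing, bridge, gateway and the `.20` address are taken from elsewhere in this plan.

```shell
# Dry-run sketch: print the pct create command for the edge LXC instead of
# running it. VMID and template are placeholders (assumptions); the sizing
# and the .20 address come from this plan.
edge_create_cmd() (
  vmid="$1"
  template="$2"
  printf 'pct create %s %s --hostname edge --cores 1 --memory 512 --rootfs local-lvm:8 --net0 name=eth0,bridge=vmbr0,ip=10.0.10.20/24,gw=10.0.10.1 --unprivileged 1 --onboot 1\n' \
    "$vmid" "$template"
)

# Example with placeholder values; review the output, then run it on pve10.
edge_create_cmd 221 'local:vztmpl/<debian-12-template>.tar.zst'
```

Printing the command first keeps the step reviewable, which matters for the one LXC every other service will depend on.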
Phase 2 — Quick wins ✅
- ✅ Umami: tracking on levkin.ca, caseware, auto, and iliadobkin.com (portfolio)
- ✅ Uptime Kuma: monitors in UI
- ✅ Dockge: logged in; register `/opt/monitoring` stack (see monitoring-stack.md)
- ✅ Kuma email alerts: SMTP via Mailcow (see homelab-status-2026-05-22.md)
Phase 3 — Cal.com (mostly done) ✅
- ✅ Cal.com deployed in `business` LXC (id 210, Postgres included)
- ✅ `cal.levkin.ca` proxied via Caddy
- ✅ Booking link live at `cal.levkin.ca/ilia/consult` with Jitsi location
- ✅ Email working via `cal@levkine.ca` SMTP through Mailcow
- ⏳ Wire Cal.com to Authentik via OIDC (first real SSO connection; do this after Phase 1)
- ⏳ Update `auto.levkin.ca` button → `cal.levkin.ca/ilia/consult` (currently points to placeholder)
Phase 4 — SSO migration (~half a day, staged)
Wire each to Authentik, least-risky first:
- Vikunja (OIDC native): easy, single-user impact
- ~~Nextcloud~~: skipped (VM 201 retiring)
- Listmonk (OIDC native, admin only): easy
- Mattermost (SAML or OIDC native): moderate
- Mailcow (OIDC): last, because mail-critical
For each: keep a local admin password as a break-glass account.
Phase 5 — Family / personal wins (~1 evening)
- Immich in `media` VM: install mobile apps for you and family, enable auto-upload. Face recognition runs in the background; searching for "my kids 2024" should work within a couple of days.
- Skip PhotoPrism; Immich covers it.
Phase 6 — Business / consulting (~1–2 evenings)
- Crater in `business` LXC: tax rates, company info, Stripe integration if you want online payment
- Beszel hub in `monitoring` LXC + agents on each LXC: one dashboard for resource usage
Phase 7 — Automation depth (ongoing)
Only when you have a real use case:
- Huginn in `automation`. First agent: competitor pages, kosher product availability, grant deadlines
- Windmill in `automation`. First script: rewrite an n8n flow with too many code nodes
- Flowise in `labs`. First flow: chat-with-docs against your consulting notes
Phase 8 — Knowledge / research
- Outline in `productivity` LXC: client-facing wiki + your notes
- Linkwarden in `productivity` LXC: bookmarks with full-page archive
- Paperless-ngx in `media`: scan and OCR the paper that's accumulating
- SiYuan: only if/when PhD or long-form research becomes relevant
Static IP audit
Maintain a host-list.md file (in this Cursor project, alongside this plan) with every LXC/VM, its current IP, its target static IP, and DHCP/static status. Cursor will use this as the source of truth when scripting changes.
Suggested format:
| LXC/VM ID | Name | Role | Current IP | Target static IP | DHCP/Static | Notes |
|---|---|---|---|---|---|---|
| 210 | cal | Cal.com | 10.0.10.228/24 (DHCP) | 10.0.10.228/24 | ⏳ static | Convert ASAP |
| ... | ... | ... | ... | ... | ... | ... |
Recommended IP plan
Use role-based ranges within 10.0.10.0/24 (or whatever your LAN is) so the address alone tells you what kind of host it is:
| Range | Reserved for |
|---|---|
| `.1 - .9` | Network gear (router, switches, APs) |
| `.10 - .19` | Proxmox host(s) + PBS |
| `.20 - .39` | Edge / identity / comms (critical infra) |
| `.40 - .79` | Application LXCs (productivity, automation, business, monitoring) |
| `.80 - .99` | Media VM(s) |
| `.100 - .199` | DHCP pool (clients, phones, laptops) |
| `.200 - .249` | Labs / experimental |
| `.250 - .254` | Reserved |
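When picking or auditing addresses, the range table can be turned into a small lookup. The sketch below assumes the ranges exactly as listed above; the function name `role_for_octet` is illustrative.

```shell
# Map the last octet of a 10.0.10.0/24 address to its reserved role,
# following the range table above. Function name is illustrative.
role_for_octet() (
  o="$1"
  case "$o" in
    ''|*[!0-9]*) echo "invalid"; exit 1 ;;
  esac
  if   [ "$o" -ge 1 ]   && [ "$o" -le 9 ];   then echo "network gear"
  elif [ "$o" -ge 10 ]  && [ "$o" -le 19 ];  then echo "proxmox hosts + PBS"
  elif [ "$o" -ge 20 ]  && [ "$o" -le 39 ];  then echo "edge / identity / comms"
  elif [ "$o" -ge 40 ]  && [ "$o" -le 79 ];  then echo "application LXCs"
  elif [ "$o" -ge 80 ]  && [ "$o" -le 99 ];  then echo "media VMs"
  elif [ "$o" -ge 100 ] && [ "$o" -le 199 ]; then echo "DHCP pool"
  elif [ "$o" -ge 200 ] && [ "$o" -le 249 ]; then echo "labs / experimental"
  elif [ "$o" -ge 250 ] && [ "$o" -le 254 ]; then echo "reserved"
  else echo "invalid"; exit 1
  fi
)

role_for_octet 21    # Authentik (LXC 217)
role_for_octet 60    # levkin.ca (LXC 220)
role_for_octet 228   # Cal.com (LXC 210)
```

Running it against addresses already in use is a quick way to spot hosts that sit outside their intended range; for example, `.228` resolves to the labs range under this table.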
How to set static on a Proxmox LXC
Two methods — pick one and stick with it:
Method A — Proxmox CLI (recommended, survives reboots cleanly):
pct set <ID> -net0 name=eth0,bridge=vmbr0,ip=10.0.10.X/24,gw=10.0.10.1
pct reboot <ID>
Method B — Router DHCP reservation:
- Reserve the IP in your router's DHCP table by MAC address. LXC stays "DHCP" technically, but always gets the same IP.
- Easier if you have many hosts and one router.
- Risk: if the LXC's MAC changes (rebuild from snapshot to new ID), reservation breaks.
Recommendation: Method A (pct set) for everything critical (edge, identity, comms, business). Method B is fine for labs/experimental LXCs.
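To avoid typos when converting several containers, Method A can be wrapped in a helper that prints the commands for review. This is a dry-run sketch: `static_ip_cmds` is a hypothetical name, and the bridge and gateway are the ones used in the example above.

```shell
# Dry-run helper: print the Method A commands for one container.
# Takes the LXC ID and the last octet; refuses octets outside 1-254.
static_ip_cmds() (
  id="$1"
  octet="$2"
  case "$octet" in
    ''|*[!0-9]*) echo "octet must be a number" >&2; exit 1 ;;
  esac
  if [ "$octet" -lt 1 ] || [ "$octet" -gt 254 ]; then
    echo "octet out of range: $octet" >&2
    exit 1
  fi
  echo "pct set $id -net0 name=eth0,bridge=vmbr0,ip=10.0.10.$octet/24,gw=10.0.10.1"
  echo "pct reboot $id"
)

static_ip_cmds 218 22
```

The output can be pasted into a shell on the Proxmox node once it has been checked against `host-list.md`.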
Audit checklist
- List every LXC: `pct list`
- List every VM: `qm list`
- For each, run `pct exec <ID> -- ip a` (or `qm guest exec <ID> -- ip a` for VMs, which requires the QEMU guest agent) and check whether the IP came from DHCP
- Fill in `host-list.md`
- Pick target IPs from the range plan above
- Convert one at a time, lowest-risk first (labs → productivity → business → comms → identity → edge)
- After each conversion, verify the Caddy reverse-proxy entry still works (curl from outside)
- Update `host-list.md` status column
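For containers, the DHCP-or-static question can also be answered from the Proxmox side by reading the `net0` line of `pct config <ID>`. A minimal sketch, assuming the usual `ip=dhcp` / `ip=<address>/<prefix>` forms; the sample lines are made up for illustration.

```shell
# Classify a net0 line from `pct config <ID>` as dhcp or static.
# The sample lines below are made up for illustration.
net0_mode() (
  line="$1"
  case "$line" in
    *ip=dhcp*)  echo "dhcp" ;;
    *ip=[0-9]*) echo "static" ;;
    *)          echo "unknown" ;;
  esac
)

net0_mode "net0: name=eth0,bridge=vmbr0,ip=dhcp,type=veth"
net0_mode "net0: name=eth0,bridge=vmbr0,gw=10.0.10.1,ip=10.0.10.22/24,type=veth"

# On a Proxmox node, the same check over every container would look like:
#   for id in $(pct list | awk 'NR>1 {print $1}'); do
#     echo "$id $(net0_mode "$(pct config "$id" | grep '^net0:')")"
#   done
```

This only covers LXCs; VMs configured in-guest (like the Caddy VM) still need the `ip a` check from the list above.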
Hosts known to need conversion right now
- ~~LXC 210 (cal)~~: static at `10.0.10.228` ✅
- Site LXCs 220, 215/216/219: static; served via Caddy → nginx on each LXC (git deploy). Optional future: static files on Caddy VM only.
Backlog (priority order)
P0 — next (Phase 1–2 largely ✅)
- ~~Umami~~ ✅
- ~~Uptime Kuma~~ ✅
- ~~Dockge~~ ✅
- Cal.com → Authentik OIDC: first SSO
- Retire Nextcloud VM 201 + Portainer VM 109: frees ~24 GiB on pve10
- Beszel: fits on monitoring LXC 218 or small agent LXCs
- Mealie: new small LXC on pve10 (~2 GB)
P1 — when ready
- Outline — wiki for client docs
- Linkwarden — bookmarks with full-page archive
- Plane — Jira-lite project management (pair with Mattermost)
P2 — when you have a real need
- Crater — invoicing (Phase 6)
- Immich — photos (Phase 5)
- Paperless-ngx — document scanning (Phase 8)
- Huginn — first when you have a monitoring use case
- Windmill — when n8n hits limits
- Trigger.dev — durable background jobs in code (better fit than Windmill for QA work)
- PrivateBin — encrypted paste for sharing secrets with contractors
- Addy.io — email aliases
- SiYuan — if PhD work picks up
- Flowise — labs only, when LLM workflow use case appears
Skip / declined
- ~~PhotoPrism~~: Immich covers it
- ~~Activepieces~~: you already have n8n
- ~~Affine / Trilium~~: picked Outline + SiYuan instead
- ~~Matrix/Synapse + Element~~: staying on Mattermost
- ~~Coolify / Dokploy / CapRover~~: Dockge is enough; revisit only if writing many custom apps
Backup strategy
- Proxmox Backup Server (PBS) or `vzdump` to a NAS: snapshot each LXC/VM nightly
- Critical groups (`identity`, `comms`, `business`): 7 daily + 4 weekly + 12 monthly
- Productivity/automation: 7 daily + 4 weekly
- Labs: 3 daily, no long retention
- Off-site copy of `identity` and `business` LXCs, since these contain auth and billing data. Encrypted copy to Wasabi or Backblaze B2.
The whole LXC gets snapshotted — much simpler than file-level container backup.
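The retention tiers above map naturally onto the `--prune-backups` option of `vzdump`. The following is a dry-run sketch under assumptions: the storage name `nas-backups` is a placeholder, and groups without a stated policy (for example `monitoring` or `media`) are rejected rather than guessed.

```shell
# Map a group name to a vzdump --prune-backups retention string,
# following the tiers listed above.
retention_for_group() (
  case "$1" in
    identity|comms|business) echo "keep-daily=7,keep-weekly=4,keep-monthly=12" ;;
    productivity|automation) echo "keep-daily=7,keep-weekly=4" ;;
    labs)                    echo "keep-daily=3" ;;
    *) echo "no retention policy for group: $1" >&2; exit 1 ;;
  esac
)

# Dry run: print the vzdump command for one guest. The storage name
# "nas-backups" is a placeholder, not a storage defined in this plan.
backup_cmd() (
  vmid="$1"
  group="$2"
  keep="$(retention_for_group "$group")" || exit 1
  echo "vzdump $vmid --mode snapshot --storage nas-backups --prune-backups $keep"
)

backup_cmd 217 identity
backup_cmd 210 business
```

If PBS is used instead of plain `vzdump` to a NAS, the same tiers would more likely be set as a prune job on the PBS datastore.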
Done on pve10 (2026-05-22): pct snapshot backup-20260522 on LXCs 217 (identity) and 218 (monitoring).
Next steps (priority order)
See homelab-status-2026-05-22.md for automation checklist.
| # | Task | Status | Effort | Frees / unlocks |
|---|---|---|---|---|
| 1 | Kuma SMTP | ✅ done | — | — |
| 2 | Cal.com → Authentik OIDC | ⏳ next | 1–2 h | First SSO; test before Vikunja/Listmonk |
| 3 | auto.levkin.ca → Cal booking link | ⏳ | 15 min | Phase 3 item 6 |
| 4 | Stop Portainer VM 109 | ⏳ | 10 min | ~16 GiB RAM on pve10 |
| 5 | Retire Nextcloud VM 201 | ⏳ | 30 min | ~8 GiB RAM; remove Caddy + Kuma monitor |
| 6 | UniFi DHCP reservations | ⏳ | 20 min | unifi-static-dhcp.md |
| 7 | Beszel on 218 or agents | ⏳ | 1 h | Capacity visibility before Immich |
| 8 | NAS.SP00 disk → Jellyfin | ⏳ hardware | — | VM 101 |
| 9 | Caddy → edge LXC `.20` | ⏳ defer | ~30 min | Phase 1.5 |
| 10 | dev-apps LXC (pote, mirrormatch, then punim) | ⏳ defer | half day | pve201 RAM; punim last |
| 11 | Static sites → Caddy VM (optional) | ⏳ defer | 1 h | ~1 GiB; breaks git-on-LXC workflow unless you move deploy to Caddy |
Defer: Immich, Crater, Outline, Plane, SSO for Vikunja/Listmonk/Mailcow until rows 2–5 done.
Adding a new service — quick rule
| Want to add… | Node | RAM budget | Prerequisite |
|---|---|---|---|
| Small app (Mealie, Linkwarden) | pve10 | 2 GB LXC | Stop 109 and/or 201 first if host feels tight |
| Medium (Outline, Crater) | pve10 | 4 GB LXC | Free ~24 GiB via Portainer + Nextcloud retire |
| Heavy (Immich + ML) | pve10 or pve201 GPU | 4–8 GB+ | NAS healthy; pve201 only after GPU/punim sized down |
| Dev sandbox | pve10 `dev-apps` | 6–8 GB | punim 9101 migration only after testing |
Nextcloud decommission (VM 201)
- Confirm export in `exports/nextcloud-2026-05-21/` is complete
- Delete Nextcloud monitor in Kuma
- Remove `nextcloud.levkin.ca` from Caddy VM
- Stop VM 201; update host-list.md
- After NAS healthy: optional `vzdump` archive then delete disk
Important rules
- Never put Authentik behind itself. `auth.levkin.ca` is a simple Caddy passthrough: no forward-auth, no fancy dependencies. If it were protected by Authentik and Authentik went down, you'd lose access to Authentik itself.
- Vaultwarden stays standalone. It's your break-glass path if Authentik dies. Don't OIDC it.
- Keep a local admin password on every SSO-wired app. OIDC integrations break during upgrades, and you need to log in to fix them.
- Keep a local admin account on the Proxmox host. Independent of Authentik and Vaultwarden. Written down somewhere physical.
- Don't expose admin UIs publicly. Dockge, Beszel, Uptime Kuma admin, n8n editor: use Tailscale or Wireguard for remote access.
- Static IPs for every LXC. DHCP will eventually move them and Caddy will break. Set via `pct set <id> -net0 ...ip=10.0.10.X/24,gw=...` or a router reservation.
- Cal.com LXC (210): static at `.228` ✅.
- Maintain `host-list.md` as the single source of truth for IPs. Update it whenever a new LXC/VM is created or migrated.