## Summary

This PR adds comprehensive support for deploying the **POTE** application project via Ansible, along with improvements to IP conflict detection and a new app stack provisioning system for Proxmox-managed LXC containers.

## Key Features

### 🆕 New Roles

- **`roles/pote`**: Python/venv deployment role for POTE (PostgreSQL, cron jobs, Alembic migrations)
- **`roles/app_setup`**: Generic app deployment role (Node.js/systemd)
- **`roles/base_os`**: Base OS hardening role

### 🛡️ Safety Improvements

- IP uniqueness validation within projects
- Proxmox-side IP conflict detection
- Enhanced error messages for IP conflicts

### 📦 New Playbooks

- `playbooks/app/site.yml`: End-to-end app stack deployment
- `playbooks/app/provision_vms.yml`: Proxmox guest provisioning
- `playbooks/app/configure_app.yml`: OS + application configuration

## Security

- ✅ All secrets stored in encrypted vault.yml
- ✅ Deploy keys excluded via .gitignore
- ✅ No plaintext secrets committed

## Testing

- ✅ POTE successfully deployed to dev/qa/prod environments
- ✅ All components validated (Git, PostgreSQL, cron, migrations)

Co-authored-by: ilia <ilia@levkin.ca>
Reviewed-on: #3

# App stack execution flow (what happens when you run it)

This document describes **exactly** what Ansible runs and what it changes when you execute the Proxmox app stack playbooks.

## Entry points

- Recommended end-to-end run:
  - `playbooks/app/site.yml`
- Repo-root wrappers (equivalent):
  - `site.yml` (imports `playbooks/site.yml`, and you can pass `--tags app`)
  - `provision_vms.yml` (imports `playbooks/app/provision_vms.yml`)
  - `configure_app.yml` (imports `playbooks/app/configure_app.yml`)

## High-level flow

When you run `playbooks/app/site.yml`, it imports two playbooks in order:

1. `playbooks/app/provision_vms.yml` (**Proxmox API changes happen here**)
2. `playbooks/app/configure_app.yml` (**SSH into guests and configure OS/app**)

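
As a rough sketch, the import order described above could look like the following in `playbooks/app/site.yml`. The play names are illustrative, and the real file may carry extra tags or variables that are not shown here.

```yaml
---
# Sketch of playbooks/app/site.yml (assumed layout, not copied from the repo).
# import_playbook paths are resolved relative to the importing playbook.

# Phase 1: talks to the Proxmox API.
- name: Provision guests through the Proxmox API
  ansible.builtin.import_playbook: provision_vms.yml

# Phase 2: connects to the guests over SSH.
- name: Configure OS and application on the guests
  ansible.builtin.import_playbook: configure_app.yml
```

Because `import_playbook` is static, the two playbooks always run in this order within a single invocation.
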
## Variables that drive everything

All per-project/per-env inputs come from:

- `inventories/production/group_vars/all/main.yml` → `app_projects`

Each `app_projects.<project>.envs.<env>` contains:

- `name` (container hostname / inventory host name)
- `vmid` (Proxmox CTID)
- `ip` (static IP in CIDR form, e.g. `10.0.10.101/24`)
- `gateway` (e.g. `10.0.10.1`)
- `branch` (`dev`, `qa`, `main`)
- `env_vars` (key/value map written to `/srv/app/.env.<env>`)

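
To make the shape concrete, here is a minimal sketch of one project with one environment. The project key `projectA`, the hostname, the CTID, and the `env_vars` key are placeholder values chosen for illustration; only the field names and the example IP and gateway come from the list above.

```yaml
# inventories/production/group_vars/all/main.yml (illustrative excerpt)
app_projects:
  projectA:                    # hypothetical project key
    envs:
      dev:                     # environment key, used in /srv/app/.env.<env>
        name: projecta-dev     # hypothetical container hostname / inventory name
        vmid: 9101             # hypothetical Proxmox CTID
        ip: 10.0.10.101/24     # static IP in CIDR form
        gateway: 10.0.10.1
        branch: dev
        env_vars:
          APP_PORT: "3001"     # hypothetical key written to the env file
```
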
Proxmox connection variables are also read from `inventories/production/group_vars/all/main.yml` but are usually vault-backed:

- `proxmox_host: "{{ vault_proxmox_host }}"`
- `proxmox_user: "{{ vault_proxmox_user }}"`
- `proxmox_node: "{{ vault_proxmox_node | default('pve') }}"`

## Phase 1: Provisioning via Proxmox API

### File chain

`playbooks/app/site.yml` imports `playbooks/app/provision_vms.yml`, which does:

- Validates `app_project` exists (if you passed one)
- Loops projects → includes `playbooks/app/provision_one_guest.yml`
  - Loops envs inside the project → includes `playbooks/app/provision_one_env.yml`

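
The nested loop above can be sketched roughly as two `include_tasks` calls. This assumes both included files are task files, and the loop variable names `project_item` and `env_item` are hypothetical. Filtering by `app_project` is omitted for brevity.

```yaml
# Sketch only: task shapes and loop variable names are assumptions.

# In playbooks/app/provision_vms.yml: one include per project.
- name: Provision each project
  ansible.builtin.include_tasks: provision_one_guest.yml
  loop: "{{ app_projects | dict2items }}"
  loop_control:
    loop_var: project_item

# In playbooks/app/provision_one_guest.yml: one include per environment.
- name: Provision each environment of the project
  ansible.builtin.include_tasks: provision_one_env.yml
  loop: "{{ project_item.value.envs | dict2items }}"
  loop_control:
    loop_var: env_item
```

Distinct `loop_var` names keep the inner loop from overwriting the outer loop's `item`.
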
### Preflight IP safety check

In `playbooks/app/provision_one_env.yml`:

- It runs `ping` against the target IP.
- If the IP responds, the play **fails** to prevent accidental duplicate-IP provisioning.
- You can override the guard (not recommended) with `-e allow_ip_conflicts=true`.

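
A guard of this kind might be written as below. This is a sketch under assumptions: it supposes the tasks run on the control node (a Linux host with `ping` available), and `env_item` is a hypothetical loop variable holding one environment entry.

```yaml
# Sketch of the preflight guard (assumed; env_item is a hypothetical loop variable).
- name: Ping the target IP once
  ansible.builtin.command:
    argv:
      - ping
      - -c
      - "1"
      - -W
      - "1"
      - "{{ env_item.value.ip.split('/')[0] }}"   # strip the CIDR suffix
  register: ip_ping
  changed_when: false   # a probe never changes anything
  failed_when: false    # a non-zero exit only means "no reply"

- name: Fail if something already answers on that IP
  ansible.builtin.fail:
    msg: "IP {{ env_item.value.ip }} already responds to ping; refusing to provision."
  when:
    - ip_ping.rc == 0
    - not (allow_ip_conflicts | default(false) | bool)
```

Note that a ping-based check cannot see hosts that are powered off or that drop ICMP, so it reduces the risk of duplicates rather than eliminating it.
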
### What it creates/updates in Proxmox

In `playbooks/app/provision_one_env.yml` it calls role `roles/proxmox_vm` with LXC variables.

`roles/proxmox_vm/tasks/main.yml` dispatches:

- If `proxmox_guest_type == 'lxc'` → includes `roles/proxmox_vm/tasks/lxc.yml`

`roles/proxmox_vm/tasks/lxc.yml` performs:

1. **Build CT network config**
   - Produces a `netif` dict like:
     - `net0: name=eth0,bridge=vmbr0,firewall=1,ip=<CIDR>,gw=<GW>`

2. **Create/update the container**
   - Uses `community.proxmox.proxmox` with:
     - `state: present`
     - `update: true` (so re-runs reconcile config)
     - `vmid`, `hostname`, `ostemplate`, CPU/mem/swap, rootfs sizing, `netif`
     - `pubkey` and optionally `password` for initial root access

3. **Start the container**
   - Ensures `state: started` (if `lxc_start_after_create: true`)

4. **Wait for SSH**
   - `wait_for: host=<ip> port=22`

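
The four steps above might look roughly like this as tasks. It is a sketch, not the role's actual contents: every `lxc_*` variable except `lxc_start_after_create`, as well as `proxmox_password`, is a hypothetical name, and the real role may authenticate with an API token instead of a password.

```yaml
# Sketch of roles/proxmox_vm/tasks/lxc.yml (assumed variable names).
- name: Build CT network config
  ansible.builtin.set_fact:
    lxc_netif:
      net0: "name=eth0,bridge=vmbr0,firewall=1,ip={{ lxc_ip }},gw={{ lxc_gateway }}"

- name: Create or update the container
  community.proxmox.proxmox:
    api_host: "{{ proxmox_host }}"
    api_user: "{{ proxmox_user }}"
    api_password: "{{ proxmox_password }}"   # hypothetical variable
    node: "{{ proxmox_node }}"
    vmid: "{{ lxc_vmid }}"
    hostname: "{{ lxc_hostname }}"
    ostemplate: "{{ lxc_ostemplate }}"
    cores: "{{ lxc_cores }}"
    memory: "{{ lxc_memory_mb }}"
    swap: "{{ lxc_swap_mb }}"
    disk: "{{ lxc_rootfs }}"                 # rootfs sizing, e.g. storage:size
    netif: "{{ lxc_netif }}"
    pubkey: "{{ lxc_pubkey }}"
    state: present
    update: true                             # reconcile config on re-runs

- name: Start the container
  community.proxmox.proxmox:
    api_host: "{{ proxmox_host }}"
    api_user: "{{ proxmox_user }}"
    api_password: "{{ proxmox_password }}"
    node: "{{ proxmox_node }}"
    vmid: "{{ lxc_vmid }}"
    state: started
  when: lxc_start_after_create | default(true) | bool

- name: Wait for SSH
  ansible.builtin.wait_for:
    host: "{{ lxc_ip.split('/')[0] }}"       # IP without the CIDR suffix
    port: 22
    timeout: 300
```
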
### Dynamic inventory creation

Still in `playbooks/app/provision_one_env.yml`, it calls `ansible.builtin.add_host` so the guests become available to later plays:

- Adds the guest to groups:
  - `app_all`
  - `app_<project>_all`
  - `app_<project>_<env>`
- Sets:
  - `ansible_host` to the IP (without CIDR)
  - `ansible_user: root` (bootstrap user for first config)
  - `app_project`, `app_env` facts

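
An `add_host` call matching the description above could be sketched as follows. The loop variables `project_item` and `env_item` are hypothetical; the group names and host variables are the ones listed above.

```yaml
# Sketch of the add_host call (assumed; loop variable names are hypothetical).
- name: Register the guest in the in-memory inventory
  ansible.builtin.add_host:
    name: "{{ env_item.value.name }}"
    groups:
      - app_all
      - "app_{{ project_item.key }}_all"
      - "app_{{ project_item.key }}_{{ env_item.key }}"
    ansible_host: "{{ env_item.value.ip.split('/')[0] }}"   # IP without CIDR
    ansible_user: root                                      # bootstrap user
    app_project: "{{ project_item.key }}"
    app_env: "{{ env_item.key }}"
```

Hosts added this way exist only for the current Ansible invocation, which is why the configuration playbook rebuilds the same inventory on its own.
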
## Phase 2: Configure OS + app on the guests

`playbooks/app/configure_app.yml` contains two plays:

### Play A: Build dynamic inventory (localhost)

This play exists so you can run `configure_app.yml` even if you didn’t run provisioning in the same Ansible invocation.

- It loops over projects/envs from `app_projects`
- Adds hosts to:
  - `app_all`, `app_<project>_all`, `app_<project>_<env>`
- Uses:
  - `ansible_user: "{{ app_bootstrap_user | default('root') }}"`

### Play B: Configure the hosts (SSH + sudo)

Targets:

- If you pass `-e app_project=projectA` → `hosts: app_projectA_all`
- Otherwise → `hosts: app_all`

Tasks executed on each guest:

1. **Resolve effective project/env variables**
   - `project_def = app_projects[app_project]`
   - `env_def = app_projects[app_project].envs[app_env]`

2. **Role: `base_os`** (`roles/base_os/tasks/main.yml`)
   - Updates apt cache
   - Installs baseline packages (git/curl/nodejs/npm/ufw/etc.)
   - Creates `appuser` (passwordless sudo)
   - Adds your SSH public key to `appuser`
   - Enables UFW and allows:
     - SSH (22)
     - backend port (default `3001`, overridable per project)
     - frontend port (default `3000`, overridable per project)

3. **Role: `app_setup`** (`roles/app_setup/tasks/main.yml`)
   - Creates:
     - `/srv/app`
     - `/srv/app/backend`
     - `/srv/app/frontend`
   - Writes the env file:
     - `/srv/app/.env.<dev|qa|prod>` from template `roles/app_setup/templates/env.j2`
   - Writes the deploy script:
     - `/usr/local/bin/deploy_app.sh` from `roles/app_setup/templates/deploy_app.sh.j2`
     - Script does:
       - `git clone` if missing
       - `git checkout/pull` correct branch
       - runs backend install + migrations
       - runs frontend install + build
       - restarts systemd services
   - Writes systemd units:
     - `/etc/systemd/system/app-backend.service` from `app-backend.service.j2`
     - `/etc/systemd/system/app-frontend.service` from `app-frontend.service.j2`
   - Reloads systemd and enables/starts both services

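
Putting the target selection and the three steps together, Play B might have a skeleton like the one below. This is a sketch of one plausible structure: the exact `hosts` expression, the play name, and the use of `pre_tasks` for variable resolution are assumptions.

```yaml
# Sketch of Play B in playbooks/app/configure_app.yml (assumed structure).
- name: Configure app guests
  # app_project is the optional extra var passed with -e; without it, target everything.
  hosts: "{{ ('app_' ~ app_project ~ '_all') if app_project is defined else 'app_all' }}"
  become: true
  pre_tasks:
    # On each guest, app_project and app_env come from the add_host variables.
    - name: Resolve effective project/env variables
      ansible.builtin.set_fact:
        project_def: "{{ app_projects[app_project] }}"
        env_def: "{{ app_projects[app_project].envs[app_env] }}"
  roles:
    - role: base_os      # packages, appuser, SSH key, UFW
    - role: app_setup    # directories, env file, deploy script, systemd units
```

Using `pre_tasks` matters in this layout because roles listed under `roles:` run before ordinary `tasks:`, so the resolved variables would otherwise not exist when the roles start.
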
## What changes on first run vs re-run

- **Provisioning**:
  - First run: creates CTs in Proxmox, sets static IP config, starts them.
  - Re-run: reconciles settings because `update: true` is used.
- **Configuration**:
  - Mostly idempotent (directories/templates/users/firewall/services converge).

## Common “before you run” checklist

- Confirm `app_projects` has correct IPs/CTIDs/branches:
  - `inventories/production/group_vars/all/main.yml`
- Ensure vault has Proxmox + SSH key material:
  - `inventories/production/group_vars/all/vault.yml`
  - Reference template: `inventories/production/group_vars/all/vault.example.yml`