

# App stack execution flow (what happens when you run it)

This document describes exactly what Ansible runs and what it changes when you execute the Proxmox app stack playbooks.

## Entry points

- Recommended end-to-end run:
  - `playbooks/app/site.yml`
- Repo-root wrappers (equivalent):
  - `site.yml` (imports `playbooks/site.yml`; you can pass `--tags app`)
  - `provision_vms.yml` (imports `playbooks/app/provision_vms.yml`)
  - `configure_app.yml` (imports `playbooks/app/configure_app.yml`)
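
Assuming a standard inventory layout, typical invocations might look like this (the `-i` path shown here is an assumption; the `--tags app`, `app_project`, and wrapper names come from this document):

```shell
# Full end-to-end run (provision + configure) for every project/env:
ansible-playbook -i inventories/production playbooks/app/site.yml

# Same thing via the repo-root wrapper, restricted to the app stack:
ansible-playbook -i inventories/production site.yml --tags app

# Re-run only the configuration phase for a single project
# (the app_project extra-var is described later in this document):
ansible-playbook -i inventories/production configure_app.yml -e app_project=projectA
```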

## High-level flow

When you run `playbooks/app/site.yml`, it imports two playbooks in order:

1. `playbooks/app/provision_vms.yml` (Proxmox API changes happen here)
2. `playbooks/app/configure_app.yml` (SSH into guests and configure OS/app)

## Variables that drive everything

All per-project/per-env inputs come from:

- `inventories/production/group_vars/all/main.yml` → `app_projects`

Each `app_projects.<project>.envs.<env>` contains:

- `name` (container hostname / inventory host name)
- `vmid` (Proxmox CTID)
- `ip` (static IP in CIDR form, e.g. `10.0.10.101/24`)
- `gateway` (e.g. `10.0.10.1`)
- `branch` (`dev`, `qa`, or `main`)
- `env_vars` (key/value map written to `/srv/app/.env.<env>`)
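
A minimal sketch of such a definition, using only the keys listed above (the project key, hostname, and concrete values are illustrative, not taken from the real inventory):

```yaml
app_projects:
  projectA:                         # hypothetical project key
    envs:
      dev:
        name: projecta-dev          # container hostname / inventory host name
        vmid: 201                   # Proxmox CTID
        ip: 10.0.10.101/24          # static IP in CIDR form
        gateway: 10.0.10.1
        branch: dev
        env_vars:                   # written to /srv/app/.env.dev
          NODE_ENV: development
```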

Proxmox connection variables are also read from `inventories/production/group_vars/all/main.yml` but are usually vault-backed:

- `proxmox_host: "{{ vault_proxmox_host }}"`
- `proxmox_user: "{{ vault_proxmox_user }}"`
- `proxmox_node: "{{ vault_proxmox_node | default('pve') }}"`

## Phase 1: Provisioning via Proxmox API

### File chain

`playbooks/app/site.yml` imports `playbooks/app/provision_vms.yml`, which:

- Validates that `app_project` exists (if you passed one)
- Loops over projects → includes `playbooks/app/provision_one_guest.yml`
- Loops over envs inside the project → includes `playbooks/app/provision_one_env.yml`

### Preflight IP safety check

In `playbooks/app/provision_one_env.yml`:

- It pings the target IP.
- If the IP responds, the play fails to prevent accidental duplicate-IP provisioning.
- You can override the guard (not recommended) with `-e allow_ip_conflicts=true`.
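
A minimal sketch of how such a guard can be written (task names and the `env_def` variable are placeholders; the real playbook may differ in detail):

```yaml
- name: Preflight | ping the target IP
  ansible.builtin.command: "ping -c 2 -W 1 {{ env_def.ip.split('/') | first }}"
  delegate_to: localhost
  register: ip_ping
  changed_when: false
  failed_when: false   # the result is inspected explicitly below

- name: Preflight | abort on IP conflict
  ansible.builtin.fail:
    msg: "{{ env_def.ip }} already answers ping; refusing to provision a duplicate IP"
  when:
    - ip_ping.rc == 0
    - not (allow_ip_conflicts | default(false) | bool)
```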

### What it creates/updates in Proxmox

`playbooks/app/provision_one_env.yml` calls the role `roles/proxmox_vm` with LXC variables.

`roles/proxmox_vm/tasks/main.yml` dispatches:

- If `proxmox_guest_type == 'lxc'` → includes `roles/proxmox_vm/tasks/lxc.yml`

`roles/proxmox_vm/tasks/lxc.yml` performs:

1. Build the CT network config
   - Produces a `netif` dict like:
     - `net0: name=eth0,bridge=vmbr0,firewall=1,ip=<CIDR>,gw=<GW>`
2. Create/update the container
   - Uses `community.proxmox.proxmox` with:
     - `state: present`
     - `update: true` (so re-runs reconcile config)
     - `vmid`, `hostname`, `ostemplate`, CPU/mem/swap, rootfs sizing, `netif`
     - `pubkey` and optionally `password` for initial root access
3. Start the container
   - Ensures `state: started` (if `lxc_start_after_create: true`)
4. Wait for SSH
   - `wait_for: host=<ip> port=22`
### Dynamic inventory creation

Still in `playbooks/app/provision_one_env.yml`, it calls `ansible.builtin.add_host` so the guests become available to later plays:

- Adds the guest to groups:
  - `app_all`
  - `app_<project>_all`
  - `app_<project>_<env>`
- Sets:
  - `ansible_host` to the IP (without the CIDR suffix)
  - `ansible_user: root` (bootstrap user for first config)
  - `app_project`, `app_env` facts
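
In sketch form (the `project_name`/`env_name` loop variables are placeholders for whatever the real loop exposes):

```yaml
- name: Register the new guest in the in-memory inventory
  ansible.builtin.add_host:
    name: "{{ env_def.name }}"
    groups:
      - app_all
      - "app_{{ project_name }}_all"
      - "app_{{ project_name }}_{{ env_name }}"
    ansible_host: "{{ env_def.ip.split('/') | first }}"   # IP without the CIDR suffix
    ansible_user: root                                    # bootstrap user for first config
    app_project: "{{ project_name }}"
    app_env: "{{ env_name }}"
```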

## Phase 2: Configure OS + app on the guests

`playbooks/app/configure_app.yml` contains two plays:

### Play A: Build dynamic inventory (localhost)

This play exists so you can run `configure_app.yml` even if you didn't run provisioning in the same Ansible invocation.

- It loops over projects/envs from `app_projects`
- Adds hosts to:
  - `app_all`, `app_<project>_all`, `app_<project>_<env>`
- Uses:
  - `ansible_user: "{{ app_bootstrap_user | default('root') }}"`

### Play B: Configure the hosts (SSH + sudo)

Targets:

- If you pass `-e app_project=projectA` → `hosts: app_projectA_all`
- Otherwise → `hosts: app_all`

Tasks executed on each guest:

1. Resolve effective project/env variables
   - `project_def = app_projects[app_project]`
   - `env_def = app_projects[app_project].envs[app_env]`
2. Role: `base_os` (`roles/base_os/tasks/main.yml`)
   - Updates the apt cache
   - Installs baseline packages (git/curl/nodejs/npm/ufw/etc.)
   - Creates `appuser` (passwordless sudo)
   - Adds your SSH public key to `appuser`
   - Enables UFW and allows:
     - SSH (22)
     - backend port (default 3001, overridable per project)
     - frontend port (default 3000, overridable per project)
3. Role: `app_setup` (`roles/app_setup/tasks/main.yml`)
   - Creates:
     - `/srv/app`
     - `/srv/app/backend`
     - `/srv/app/frontend`
   - Writes the env file:
     - `/srv/app/.env.<dev|qa|prod>` from template `roles/app_setup/templates/env.j2`
   - Writes the deploy script:
     - `/usr/local/bin/deploy_app.sh` from `roles/app_setup/templates/deploy_app.sh.j2`
     - The script:
       - clones the repo if missing
       - checks out / pulls the correct branch
       - runs backend install + migrations
       - runs frontend install + build
       - restarts the systemd services
   - Writes systemd units:
     - `/etc/systemd/system/app-backend.service` from `app-backend.service.j2`
     - `/etc/systemd/system/app-frontend.service` from `app-frontend.service.j2`
   - Reloads systemd and enables/starts both services
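
The rendered deploy script follows roughly this shape. This is a hedged sketch: the directory layout, branch, and service names follow the conventions above, while the repo URL placeholder and the `npm` entry points (`migrate`, `build`) are assumptions about the template:

```shell
#!/usr/bin/env bash
# Sketch of /usr/local/bin/deploy_app.sh (rendered from deploy_app.sh.j2)
set -euo pipefail

APP_DIR=/srv/app
BRANCH="dev"                            # rendered from env_def.branch
REPO_URL="git@example.com:org/app.git"  # illustrative placeholder

# 1. Clone on first run, otherwise fetch/checkout/pull the right branch
if [ ! -d "$APP_DIR/.git" ]; then
  git clone "$REPO_URL" "$APP_DIR"
fi
cd "$APP_DIR"
git fetch origin
git checkout "$BRANCH"
git pull --ff-only origin "$BRANCH"

# 2. Backend: install dependencies and run migrations
cd "$APP_DIR/backend"
npm ci
npm run migrate                         # assumed migration entry point

# 3. Frontend: install dependencies and build
cd "$APP_DIR/frontend"
npm ci
npm run build

# 4. Restart the systemd services
systemctl restart app-backend.service app-frontend.service
```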

## What changes on first run vs re-run

- Provisioning:
  - First run: creates CTs in Proxmox, sets static IP config, starts them.
  - Re-run: reconciles settings because `update: true` is used.
- Configuration:
  - Mostly idempotent (directories/templates/users/firewall/services converge).

## Common “before you run” checklist

- Confirm `app_projects` has correct IPs/CTIDs/branches:
  - `inventories/production/group_vars/all/main.yml`
- Ensure the vault has Proxmox + SSH key material:
  - `inventories/production/group_vars/all/vault.yml`
  - Reference template: `inventories/production/group_vars/all/vault.example.yml`
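
For example, to inspect or edit the vault with the standard `ansible-vault` commands:

```shell
# View the encrypted vault without writing a decrypted copy to disk
ansible-vault view inventories/production/group_vars/all/vault.yml

# Edit it in place (prompts for the vault password)
ansible-vault edit inventories/production/group_vars/all/vault.yml
```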