# App stack execution flow (what happens when you run it)
This document describes **exactly** what Ansible runs and what it changes when you execute the Proxmox app stack playbooks.
## Entry points
- Recommended end-to-end run:
  - `playbooks/app/site.yml`
- Repo-root wrappers (equivalent):
  - `site.yml` (imports `playbooks/site.yml`; run it with `--tags app` to select the app stack)
  - `provision_vms.yml` (imports `playbooks/app/provision_vms.yml`)
  - `configure_app.yml` (imports `playbooks/app/configure_app.yml`)
## High-level flow
When you run `playbooks/app/site.yml`, it imports two playbooks in order:
1. `playbooks/app/provision_vms.yml` (**Proxmox API changes happen here**)
2. `playbooks/app/configure_app.yml` (**SSH into guests and configure OS/app**)
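
As a rough sketch, a wrapper like this can be as small as two imports. The play names are illustrative, and the real file may carry extra tags or variables:

```yaml
# Sketch of playbooks/app/site.yml; play names are illustrative.
# import_playbook paths resolve relative to this file.
- name: Phase 1 - provision guests via the Proxmox API
  ansible.builtin.import_playbook: provision_vms.yml

- name: Phase 2 - configure OS and app over SSH
  ansible.builtin.import_playbook: configure_app.yml
```
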
## Variables that drive everything
All per-project/per-env inputs come from:
- `inventories/production/group_vars/all/main.yml` → `app_projects`
Each `app_projects.<project>.envs.<env>` contains:
- `name` (container hostname / inventory host name)
- `vmid` (Proxmox CTID)
- `ip` (static IP in CIDR form, e.g. `10.0.10.101/24`)
- `gateway` (e.g. `10.0.10.1`)
- `branch` (`dev`, `qa`, `main`)
- `env_vars` (key/value map written to `/srv/app/.env.<env>`)
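
To make the shape concrete, one project with a single environment might look like the sketch below. The project name `projectA`, the `vmid`, and the `env_vars` keys are made-up placeholders, not values from the real inventory:

```yaml
# Illustrative shape only: projectA, the vmid, and the env_vars keys are
# placeholders, not taken from the real inventory.
app_projects:
  projectA:
    envs:
      dev:
        name: projectA-dev        # container hostname / inventory host name
        vmid: 9101                # Proxmox CTID (hypothetical)
        ip: 10.0.10.101/24        # static IP in CIDR form
        gateway: 10.0.10.1
        branch: dev
        env_vars:                 # written to /srv/app/.env.dev
          NODE_ENV: development
          LOG_LEVEL: debug
```
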
Proxmox connection variables are also read from `inventories/production/group_vars/all/main.yml` but are usually vault-backed:
- `proxmox_host: "{{ vault_proxmox_host }}"`
- `proxmox_user: "{{ vault_proxmox_user }}"`
- `proxmox_node: "{{ vault_proxmox_node | default('pve') }}"`
## Phase 1: Provisioning via Proxmox API
### File chain
`playbooks/app/site.yml` imports `playbooks/app/provision_vms.yml`, which:
- Validates `app_project` exists (if you passed one)
- Loops projects → includes `playbooks/app/provision_one_guest.yml`
- Loops envs inside the project → includes `playbooks/app/provision_one_env.yml`
### Preflight IP safety check
In `playbooks/app/provision_one_env.yml`:
- It runs `ping` against the target IP.
- If the IP responds, the play **fails** to prevent accidental duplicate-IP provisioning.
- You can override the guard (not recommended) with `-e allow_ip_conflicts=true`.
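
A guard like this could be written roughly as follows. This is a sketch, not the playbook's actual tasks: `env_def` and `ip_probe` are assumed names, and the `ping` flags assume Linux iputils on the control node.

```yaml
# Sketch of the ping guard; env_def and ip_probe are assumed names.
- name: Probe the target IP once
  ansible.builtin.command:
    argv:
      - ping
      - -c
      - "1"
      - -W
      - "1"
      - "{{ env_def.ip.split('/')[0] }}"   # strip the CIDR suffix
  register: ip_probe
  changed_when: false
  failed_when: false      # a non-zero rc just means nothing answered
  delegate_to: localhost

- name: Refuse to provision onto an IP that is already in use
  ansible.builtin.fail:
    msg: "{{ env_def.ip }} answered ping; pass -e allow_ip_conflicts=true to override."
  when:
    - ip_probe.rc == 0
    - not (allow_ip_conflicts | default(false) | bool)
```

A silent host (powered off, or dropping ICMP) passes this check, so a ping guard reduces the risk of duplicate IPs without eliminating it.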
### What it creates/updates in Proxmox
`playbooks/app/provision_one_env.yml` calls the `roles/proxmox_vm` role with LXC variables.
`roles/proxmox_vm/tasks/main.yml` dispatches:
- If `proxmox_guest_type == 'lxc'` → includes `roles/proxmox_vm/tasks/lxc.yml`

`roles/proxmox_vm/tasks/lxc.yml` performs:
1. **Build CT network config**
   - Produces a `netif` dict like:
     - `net0: name=eth0,bridge=vmbr0,firewall=1,ip=<CIDR>,gw=<GW>`
2. **Create/update the container**
   - Uses `community.proxmox.proxmox` with:
     - `state: present`
     - `update: true` (so re-runs reconcile config)
     - `vmid`, `hostname`, `ostemplate`, CPU/mem/swap, rootfs sizing, `netif`
     - `pubkey` and optionally `password` for initial root access
3. **Start the container**
   - Ensures `state: started` (if `lxc_start_after_create: true`)
4. **Wait for SSH**
   - `wait_for: host=<ip> port=22`
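
The four steps above can be sketched as tasks. The module name and the `state`, `update`, `netif`, and `pubkey` options come from the description; every `lxc_*`, `env_def`, and `vault_*` variable name is an assumption, and option names such as `disk` should be checked against the installed collection version.

```yaml
# Sketch only; variable names are assumed, not copied from the role.
- name: Build CT network config
  ansible.builtin.set_fact:
    lxc_netif:
      net0: "name=eth0,bridge=vmbr0,firewall=1,ip={{ env_def.ip }},gw={{ env_def.gateway }}"

- name: Create or update the container
  community.proxmox.proxmox:
    api_host: "{{ proxmox_host }}"
    api_user: "{{ proxmox_user }}"
    api_password: "{{ vault_proxmox_password }}"   # hypothetical vault key
    node: "{{ proxmox_node }}"
    vmid: "{{ env_def.vmid }}"
    hostname: "{{ env_def.name }}"
    ostemplate: "{{ lxc_ostemplate }}"
    cores: "{{ lxc_cores }}"
    memory: "{{ lxc_memory_mb }}"
    swap: "{{ lxc_swap_mb }}"
    disk: "{{ lxc_rootfs }}"          # rootfs sizing, e.g. "local-lvm:8"
    netif: "{{ lxc_netif }}"
    pubkey: "{{ lxc_pubkey }}"
    state: present
    update: true                      # re-runs reconcile config

- name: Start the container
  community.proxmox.proxmox:
    api_host: "{{ proxmox_host }}"
    api_user: "{{ proxmox_user }}"
    api_password: "{{ vault_proxmox_password }}"
    node: "{{ proxmox_node }}"
    vmid: "{{ env_def.vmid }}"
    state: started
  when: lxc_start_after_create | bool   # assumed to be set in role defaults

- name: Wait for SSH
  ansible.builtin.wait_for:
    host: "{{ env_def.ip.split('/')[0] }}"
    port: 22
    timeout: 300
```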
### Dynamic inventory creation
Still in `playbooks/app/provision_one_env.yml`, it calls `ansible.builtin.add_host` so the guests become available to later plays:
- Adds the guest to groups:
  - `app_all`
  - `app_<project>_all`
  - `app_<project>_<env>`
- Sets:
  - `ansible_host` to the IP (without CIDR)
  - `ansible_user: root` (bootstrap user for first config)
  - `app_project`, `app_env` facts
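
A minimal sketch of such an `add_host` task, assuming loop variables named `project_name`, `env_name`, and `env_def` (these names are illustrative):

```yaml
# Sketch; project_name, env_name, and env_def are assumed loop variables.
- name: Register the guest in the in-memory inventory
  ansible.builtin.add_host:
    name: "{{ env_def.name }}"
    groups:
      - app_all
      - "app_{{ project_name }}_all"
      - "app_{{ project_name }}_{{ env_name }}"
    ansible_host: "{{ env_def.ip.split('/')[0] }}"   # IP without the CIDR suffix
    ansible_user: root                               # bootstrap user
    app_project: "{{ project_name }}"
    app_env: "{{ env_name }}"
```

Hosts added this way exist only in memory for the current `ansible-playbook` run; they are not written back to the inventory files.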
## Phase 2: Configure OS + app on the guests
`playbooks/app/configure_app.yml` contains two plays:
### Play A: Build dynamic inventory (localhost)
This play exists so you can run `configure_app.yml` even if you didn't run provisioning in the same Ansible invocation.
- It loops over projects/envs from `app_projects`
- Adds hosts to:
  - `app_all`, `app_<project>_all`, `app_<project>_<env>`
- Uses:
  - `ansible_user: "{{ app_bootstrap_user | default('root') }}"`
### Play B: Configure the hosts (SSH + sudo)
Targets:
- If you pass `-e app_project=projectA` → `hosts: app_projectA_all`
- Otherwise → `hosts: app_all`
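
One way to express that targeting rule in a play header is sketched below. The exact expression in the repo may differ, and `become: true` is an assumption based on the "SSH + sudo" wording:

```yaml
# Sketch of the Play B header; the real expression may differ.
- name: Configure app guests
  hosts: "{{ ('app_' ~ app_project ~ '_all') if app_project is defined else 'app_all' }}"
  become: true
  roles:
    - base_os
    - app_setup
```
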
Tasks executed on each guest:
1. **Resolve effective project/env variables**
   - `project_def = app_projects[app_project]`
   - `env_def = app_projects[app_project].envs[app_env]`
2. **Role: `base_os`** (`roles/base_os/tasks/main.yml`)
   - Updates apt cache
   - Installs baseline packages (git/curl/nodejs/npm/ufw/etc.)
   - Creates `appuser` (passwordless sudo)
   - Adds your SSH public key to `appuser`
   - Enables UFW and allows:
     - SSH (22)
     - backend port (default `3001`, overridable per project)
     - frontend port (default `3000`, overridable per project)
3. **Role: `app_setup`** (`roles/app_setup/tasks/main.yml`)
   - Creates:
     - `/srv/app`
     - `/srv/app/backend`
     - `/srv/app/frontend`
   - Writes the env file:
     - `/srv/app/.env.<dev|qa|prod>` from template `roles/app_setup/templates/env.j2`
   - Writes the deploy script:
     - `/usr/local/bin/deploy_app.sh` from `roles/app_setup/templates/deploy_app.sh.j2`
     - Script does:
       - `git clone` if missing
       - `git checkout/pull` correct branch
       - runs backend install + migrations
       - runs frontend install + build
       - restarts systemd services
   - Writes systemd units:
     - `/etc/systemd/system/app-backend.service` from `app-backend.service.j2`
     - `/etc/systemd/system/app-frontend.service` from `app-frontend.service.j2`
   - Reloads systemd and enables/starts both services
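
The firewall and service steps from the two roles can be sketched as tasks like these. `app_backend_port` and `app_frontend_port` are hypothetical variable names; the defaults and unit names come from the list above.

```yaml
# Sketch; app_backend_port / app_frontend_port are hypothetical names.
- name: Allow SSH and the app ports through UFW
  community.general.ufw:
    rule: allow
    port: "{{ item }}"
    proto: tcp
  loop:
    - "22"
    - "{{ app_backend_port | default(3001) }}"
    - "{{ app_frontend_port | default(3000) }}"

- name: Enable UFW
  community.general.ufw:
    state: enabled

- name: Reload systemd and enable/start both services
  ansible.builtin.systemd:
    name: "{{ item }}"
    daemon_reload: true
    enabled: true
    state: started
  loop:
    - app-backend.service
    - app-frontend.service
```

Adding the SSH rule before enabling UFW keeps the Ansible connection from being cut off partway through the run.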
## What changes on first run vs re-run
- **Provisioning**:
  - First run: creates CTs in Proxmox, sets static IP config, starts them.
  - Re-run: reconciles settings because `update: true` is used.
- **Configuration**:
  - Mostly idempotent: directories, templates, users, firewall rules, and services converge to the declared state.
## Common “before you run” checklist
- Confirm `app_projects` has correct IPs/CTIDs/branches:
  - `inventories/production/group_vars/all/main.yml`
- Ensure vault has Proxmox + SSH key material:
  - `inventories/production/group_vars/all/vault.yml`
  - Reference template: `inventories/production/group_vars/all/vault.example.yml`