Lessons Learned

  • The network role in this repository sets up a complete network stack, including Nginx Proxy Manager for reverse proxying and wireguard-easy as a web UI for WireGuard.

  • The gitea and postgres roles use Docker Compose to deploy their respective services.

  • Properly managing variables, especially secrets like passwords and API keys, is crucial. Using group_vars and a .gitignored secrets directory is a good practice.

  • It's important to have a clear plan and get user feedback before making any changes. The "planning mode" and "acting mode" paradigm is a good way to structure the workflow.

  • The docker role proved problematic on Ubuntu 24.04 (noble) due to repository issues.

  • Podman is a viable and simpler alternative to Docker for container management.

  • Ansible modules designed for Docker (e.g., community.docker.docker_compose_v2, docker_container) are not directly compatible with Podman.

  • podman-compose can be used with ansible.builtin.shell for managing docker-compose.yml files with Podman.

  • containers.podman.podman_container is the closest equivalent of docker_container for managing individual Podman containers.

  • Ansible Vault is crucial for securely managing sensitive data like passwords in version control.
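
The points above about secrets, Ansible Vault, and podman-compose can be combined in one play. The following is a minimal sketch, not the repository's actual playbook: the file secrets/vault.yml, the variables compose_project_dir and vault_postgres_password, and the project path are hypothetical names chosen for illustration.

```yaml
# Sketch only: file names, paths, and variable names are assumptions.
- name: Deploy a compose project with Podman
  hosts: all
  become: false
  vars_files:
    # Hypothetical vars file, encrypted beforehand with ansible-vault.
    - secrets/vault.yml
  vars:
    # Hypothetical directory that contains docker-compose.yml.
    compose_project_dir: ~/compose/postgres
  tasks:
    - name: Bring the compose project up with podman-compose
      ansible.builtin.shell:
        cmd: podman-compose up -d
        chdir: "{{ compose_project_dir }}"
      environment:
        # Hypothetical secret defined inside the encrypted vars file.
        POSTGRES_PASSWORD: "{{ vault_postgres_password }}"
      # A shell task cannot detect whether anything changed,
      # so it is reported as changed on every run.
      changed_when: true
      # Keep the secret out of task output and logs.
      no_log: true
```

A play like this would be run with ansible-playbook and the --ask-vault-pass option so that the encrypted file can be decrypted at runtime.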

  • General Debugging Principles:

    • Always trust the user's direct experience and observations, even if they initially contradict assumptions or playbook output.
    • When a playbook reports success but the desired state isn't met, investigate deeper. Ansible's changed status can be misleading if the underlying application fails after the module reports success.
    • Use increased verbosity (-vvv) for detailed debugging output from Ansible.
    • Systematically verify each layer of the stack (container logs, host processes, host firewall, cloud firewall).
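
One way to act on these principles is to verify the application after the deployment task instead of trusting its reported status. The sketch below assumes a container named portainer listening on port 9000; both values are placeholders, and the play is an illustration rather than part of the repository.

```yaml
# Sketch only: container name and port are assumptions.
- name: Verify the service after deployment
  hosts: all
  become: false
  vars:
    app_container_name: portainer  # hypothetical
    app_port: 9000                 # hypothetical
  tasks:
    - name: Check the application and collect diagnostics on failure
      block:
        - name: Wait until the application port accepts connections
          ansible.builtin.wait_for:
            host: 127.0.0.1
            port: "{{ app_port }}"
            timeout: 60
      rescue:
        - name: List all containers, including exited ones
          ansible.builtin.command:
            argv: [podman, ps, -a]
          register: podman_ps
          changed_when: false

        - name: Fetch the container logs
          ansible.builtin.command:
            argv: [podman, logs, "{{ app_container_name }}"]
          register: podman_logs
          changed_when: false
          failed_when: false

        - name: Show what was collected
          ansible.builtin.debug:
            msg:
              - "{{ podman_ps.stdout_lines }}"
              - "{{ podman_logs.stdout_lines }}"
              - "{{ podman_logs.stderr_lines }}"

        - name: Fail the play so the problem is not hidden
          ansible.builtin.fail:
            msg: "Port {{ app_port }} never opened; see the diagnostics above."
```
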
  • Podman Specifics & Rootless Containers:

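    • With rootless Podman, tasks that manage user-specific files and containers must run as the connecting user. If the play defaults to become: true, set become: false explicitly on those tasks.
    • Using ~ (tilde) in paths for user home directories proved more robust than relying on ansible_user_dir. That fact reflects the user who gathered facts, so it can resolve unexpectedly (for example to /root when facts are gathered with become: true).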
    • Fully qualifying image names (e.g., docker.io/portainer/portainer-ce) prevents registry ambiguity issues and avoids interactive prompts.
    • Debugging container startup issues requires checking:
      • podman ps -a (to see all containers, running or exited).
      • podman logs <container_name> (to get application logs).
      • sudo podman ps (to check for rootful containers that might be interfering).
    • Orphaned conmon processes from failed container startups can block ports and require manual cleanup (sudo kill <PID>, podman stop/rm).
    • Ensure registries.conf is correctly templated (ansible.builtin.template, not ansible.builtin.copy) and placed in ~/.config/containers/registries.conf for rootless Podman.
    • Verify Podman's actual listening port on the host with sudo ss -tulnp | grep <port> (or lsof).
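
The rootless points above can be sketched as a short play. This is an illustration under assumptions, not the repository's role: the template name registries.conf.j2, the container name, the latest tag, and the port mapping are hypothetical.

```yaml
# Sketch only: template name, container name, tag, and ports are assumptions.
- name: Run a rootless Podman container
  hosts: all
  become: true  # play-level default, overridden per task below
  tasks:
    - name: Ensure the per-user containers config directory exists
      ansible.builtin.file:
        path: ~/.config/containers
        state: directory
        mode: "0755"
      become: false  # ~ must expand to the connecting user's home

    - name: Render registries.conf for the connecting user
      ansible.builtin.template:
        src: registries.conf.j2  # hypothetical Jinja2 template
        dest: ~/.config/containers/registries.conf
        mode: "0644"
      become: false

    - name: Start the container as the connecting user
      containers.podman.podman_container:
        name: portainer
        # Fully qualified image name avoids the registry prompt.
        image: docker.io/portainer/portainer-ce:latest
        state: started
        ports:
          - "9000:9000"
      become: false
```
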
  • Ansible Best Practices:

    • Idempotency is paramount: Always strive for idempotent tasks that describe the desired state (e.g., ansible.posix.firewalld, ansible.builtin.service) rather than imperative shell commands.
    • Ensure that the required Python libraries (e.g., python3-firewall) are installed and the required system services (e.g., firewalld) are installed and running on target hosts before calling modules that depend on them.
    • Explicitly set become: false on tasks that should run as the connecting user, especially when the play has become: true by default.
    • The ansible.builtin.template module must be used for Jinja2 templates; ansible.builtin.copy does not process templates.
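
As a sketch of the ordering and idempotency points above, the play below installs the dependencies first and then declares the desired firewall state. The port number is a placeholder; the package names follow the ones mentioned in this list.

```yaml
# Sketch only: the port is an assumption.
- name: Open an application port with firewalld
  hosts: all
  become: true
  tasks:
    - name: Install firewalld and its Python bindings
      ansible.builtin.package:
        name:
          - firewalld
          - python3-firewall
        state: present

    - name: Ensure firewalld is running and enabled at boot
      ansible.builtin.service:
        name: firewalld
        state: started
        enabled: true

    - name: Allow the application port
      ansible.posix.firewalld:
        port: 9000/tcp    # hypothetical port
        permanent: true   # persist across reloads and reboots
        immediate: true   # also apply to the running configuration
        state: enabled
```

Running the play a second time should report no changes, which is a quick check that the tasks describe state rather than actions.
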
  • Networking & Cloud Considerations:

    • Host firewall (firewalld) rules are separate from cloud provider security rules (e.g., Oracle Cloud Network Security Groups/Security Lists). Both layers must be correctly configured.
    • Ansible playbooks typically cannot manage cloud provider firewalls without specific cloud collections (e.g., oracle.oci).
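
Because the two firewall layers are independent, a probe from outside the host can help tell them apart. The following sketch assumes the control node reaches the target over the same public address that clients use, which may not hold behind a bastion or VPN; the port is a placeholder.

```yaml
# Sketch only: the port and the reachability assumption are illustrative.
- name: Check external reachability of a published port
  hosts: all
  gather_facts: false
  tasks:
    - name: Probe the port from the control node
      ansible.builtin.wait_for:
        host: "{{ ansible_host | default(inventory_hostname) }}"
        port: 9000  # hypothetical port
        timeout: 10
      delegate_to: localhost
      become: false
```

If this probe times out while the port is open in firewalld and the service is listening on the host, the cloud provider's security rules are the next layer to inspect.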