Lessons Learned

  • The network role in this repository sets up a complete network stack: Nginx Proxy Manager for reverse proxying and wireguard-easy for a WireGuard web UI.
  • The gitea and postgres roles use Docker Compose to deploy their respective services.
  • Properly managing variables, especially secrets like passwords and API keys, is crucial. Using group_vars and a .gitignored secrets directory is a good practice.
  • It's important to have a clear plan and get user feedback before making any changes. The "planning mode" and "acting mode" paradigm is a good way to structure the workflow.
  • The docker role proved problematic on Ubuntu 24.04 (noble) due to repository issues.
  • Podman is a viable and simpler alternative to Docker for container management.
  • Ansible modules designed for Docker (e.g., community.docker.docker_compose_v2, docker_container) are not directly compatible with Podman.
  • podman-compose can be used with ansible.builtin.shell for managing docker-compose.yml files with Podman.
  • containers.podman.podman_container is the direct replacement for docker_container for managing individual Podman containers.
  • Ansible Vault is crucial for securely managing sensitive data like passwords in version control.
  • Trusting User's Direct Experience: Acknowledge and prioritize the user's direct experience and knowledge of their environment, especially when it contradicts internal assumptions. The user's assertion of capabilities (e.g., running ssh via run_shell_command) proved correct, despite initial internal models suggesting otherwise. This highlights the importance of humility and adaptability.
  • Verifying Tool Capabilities: Do not assume limitations of tools (e.g., run_shell_command) without direct, empirical testing in the specific execution environment. My previous understanding of the sandbox's network and file system access was incomplete or incorrect for this user's setup.
  • "Try Before Stating Inability": Never state an inability to perform a task without first attempting it, especially when the user insists on its feasibility. A direct attempt, even if it reveals a different kind of failure, provides concrete debugging information and builds trust. This is a fundamental principle for effective assistance.
  • Debugging Persistent Issues: When a problem (like the Can't pull image error) persists despite multiple attempts at resolution, systematically verify each step of the process on the remote host (e.g., file existence, content, permissions, service status) using direct commands.
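  • Mixing tasks and roles in a Play: When a play contains both roles and tasks, Ansible runs every role listed under roles before the tasks block, regardless of the order in which the two sections appear in the file (the full order is pre_tasks, roles, tasks, post_tasks). This can lead to unexpected behavior if a role depends on changes made in tasks. A debug task placed in the tasks block runs only after all roles have completed; to inspect state before or between roles, use pre_tasks or call the role from the tasks block with include_role or import_role.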
  • Successful Network Stack Deployment: The common, podman, and network roles have been successfully deployed on Scully, establishing the core network infrastructure including Nginx Proxy Manager and WireGuard Easy.
  • Persistence of registries.conf Issue: The registries.conf issue was particularly challenging, highlighting the need for meticulous debugging and an understanding of Podman's rootless behavior and configuration file precedence. A rootless Podman user's own ~/.config/containers/registries.conf, when present, is read in preference to the system-wide /etc/containers/registries.conf. The solution was to copy the file to that per-user location (~/.config/containers/registries.conf).
  • Importance of Iterative Debugging: The process of adding debug tasks, running the playbook, analyzing output, and refining the tasks proved essential in resolving complex issues.
  • Dry Run Limitations: Reconfirmed that dry runs (--check) do not make actual changes. Later tasks that depend on packages, files, or services an earlier task would have installed can therefore fail during a dry run even though a real run would succeed.
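
For the lesson on mixing tasks and roles, the play below is a minimal sketch of where debug output lands. Ansible's documented order within a play is pre_tasks, then roles, then tasks, then post_tasks. The role names come from this repository; the host group `homecloud` and the messages are placeholders, not taken from the actual playbooks.

```yaml
# Sketch only: "homecloud" is a hypothetical inventory group.
- name: Show where debug output lands relative to roles
  hosts: homecloud
  pre_tasks:
    - name: Runs before any role in the roles section
      ansible.builtin.debug:
        msg: "pre_tasks: nothing from common or podman has run yet"

  roles:
    - common
    - podman

  tasks:
    - name: Runs after every role listed under roles
      ansible.builtin.debug:
        msg: "tasks: common and podman have finished"

    # Calling a role from the task list places it at an exact position,
    # so debug tasks can be put directly before and after it.
    - name: Run the network role at this point in the task list
      ansible.builtin.include_role:
        name: network

    - name: Runs after the network role
      ansible.builtin.debug:
        msg: "tasks: network has finished"
```

Moving a role out of the roles section and into the task list with include_role is one way to interleave debug tasks with roles; whether that suits the existing playbooks is a judgment call.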
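
For the Podman lessons (the per-user registries.conf, podman_container, and podman-compose), the tasks below are a hedged sketch rather than the roles' actual contents. They assume facts are gathered for the unprivileged user that runs the containers, so `ansible_user_dir` points at that user's home. The source file name, container name, image, port mapping, and the `/opt/gitea` directory are all illustrative.

```yaml
# Sketch of a role's tasks file; all names and paths below are placeholders.
- name: Ensure the per-user containers config directory exists
  ansible.builtin.file:
    path: "{{ ansible_user_dir }}/.config/containers"
    state: directory
    mode: "0755"

- name: Copy registries.conf to where rootless Podman looks first
  ansible.builtin.copy:
    src: registries.conf          # assumed to live in the role's files/ directory
    dest: "{{ ansible_user_dir }}/.config/containers/registries.conf"
    mode: "0644"

- name: Manage a single container with the Podman collection
  containers.podman.podman_container:
    name: example-web             # hypothetical container name
    image: docker.io/library/nginx:latest
    state: started
    publish:
      - "8080:80"

- name: Bring up a compose project with podman-compose
  ansible.builtin.shell: podman-compose -f docker-compose.yml up -d
  args:
    chdir: /opt/gitea             # hypothetical directory holding docker-compose.yml
```

The shell task reports "changed" on every run because Ansible cannot tell whether podman-compose altered anything; adding a `changed_when` condition is one way to tighten that if it matters.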
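
For the dry-run lesson, one common way to avoid misleading failures is to skip tasks that cannot work until an earlier task has really run. The sketch below uses the `ansible_check_mode` magic variable for that. The package and container are illustrative, and the assumption that the container task fails under --check when Podman is missing has not been verified against this repository.

```yaml
# Sketch only: container name and image are placeholders.
- name: Install Podman (under --check this is reported, not performed)
  ansible.builtin.apt:
    name: podman
    state: present
  become: true

- name: Start a container (needs the podman binary from the task above)
  containers.podman.podman_container:
    name: example-web
    image: docker.io/library/nginx:latest
    state: started
  # Skip during a dry run on the assumption that podman may not be installed yet.
  when: not ansible_check_mode
```

The trade-off is that skipped tasks are not exercised at all during --check, so a clean dry run says nothing about them.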
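
For the lessons on secrets and Ansible Vault, a widely used layout keeps readable variable names in a plain file and points them at `vault_`-prefixed values stored in an encrypted file next to it. The paths and variable names below are hypothetical and are not taken from this repository.

```yaml
# group_vars/all/vars.yml (hypothetical path): committed in plain text.
# Roles read these names; the actual values live in the vaulted file.
postgres_password: "{{ vault_postgres_password }}"
gitea_admin_password: "{{ vault_gitea_admin_password }}"

# group_vars/all/vault.yml (hypothetical path): encrypted with
# `ansible-vault encrypt group_vars/all/vault.yml` before committing.
# Its decrypted content would look like this, with real values in place
# of the placeholders:
#
# vault_postgres_password: "CHANGE_ME"
# vault_gitea_admin_password: "CHANGE_ME"
```

With this split, searching the repository still shows where each secret is used, while the values themselves only appear after decryption, for example when the playbook is run with `--ask-vault-pass`.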