docs: Update lessons_learned.md with debugging insights.
parent a67fb3c039
commit 7ec6b429c2
@@ -10,12 +10,30 @@
* `podman-compose` can be used with `ansible.builtin.shell` for managing `docker-compose.yml` files with Podman.
* `containers.podman.podman_container` is the direct replacement for `docker_container` for managing individual Podman containers.
* Ansible Vault is crucial for securely managing sensitive data like passwords in version control.
* **Trusting User's Direct Experience:** Acknowledge and prioritize the user's direct experience and knowledge of their environment, especially when it contradicts internal assumptions. The user's assertion of capabilities (e.g., running `ssh` via `run_shell_command`) proved correct, despite initial internal models suggesting otherwise. This highlights the importance of humility and adaptability.
* **Verifying Tool Capabilities:** Do not assume limitations of tools (e.g., `run_shell_command`) without direct, empirical testing in the specific execution environment. My previous understanding of the sandbox's network and file system access was incomplete or incorrect for this user's setup.
* **"Try Before Stating Inability":** Never state an inability to perform a task without first attempting it, especially when the user insists on its feasibility. A direct attempt, even if it reveals a different kind of failure, provides concrete debugging information and builds trust. This is a fundamental principle for effective assistance.
* **Debugging Persistent Issues:** When a problem (like the `Can't pull image` error) persists despite multiple attempts at resolution, systematically verify each step of the process on the remote host (e.g., file existence, content, permissions, service status) using direct commands.
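* **Mixing `tasks` and `roles` in a Play:** When a play contains both `roles` and `tasks`, Ansible runs everything listed under `roles` *before* the `tasks` block, regardless of the order in which the two keys appear in the playbook file. The full order is `pre_tasks`, `roles`, `tasks`, then `post_tasks`. This can lead to unexpected behavior if a role depends on changes made in the `tasks` block. Anything that must happen before the roles belongs in `pre_tasks`, and debug tasks placed in the `tasks` block only run after every role has completed.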
* **Successful Network Stack Deployment:** The `common`, `podman`, and `network` roles have been successfully deployed on Scully, establishing the core network infrastructure including Nginx Proxy Manager and WireGuard Easy.
* **Persistence of `registries.conf` Issue:** The `registries.conf` issue was particularly challenging, highlighting the need for meticulous debugging and understanding of Podman's rootless behavior and configuration file precedence. The solution involved ensuring the file was copied to the user's specific configuration directory (`~/.config/containers/registries.conf`).
* **Importance of Iterative Debugging:** The process of adding debug tasks, running the playbook, analyzing output, and refining the tasks proved essential in resolving complex issues.
* **Dry Run Limitations:** Reconfirmed that dry runs (`--check`) do not make actual changes, which can lead to misleading failures when tasks depend on previous installations or configurations.
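
The execution order of play sections, relevant to the note on mixing `tasks` and `roles`, can be seen in a small play. This is a minimal sketch, not the project's actual playbook: the role names are the ones mentioned in the deployment note, while the host pattern `scully` and the debug messages are assumptions made for illustration.

```yaml
# Illustrative play; the host pattern and debug messages are assumptions.
- name: Show the order in which play sections run
  hosts: scully
  pre_tasks:
    - name: Runs first, before any role
      ansible.builtin.debug:
        msg: "pre_tasks"
  roles:
    - common
    - podman
    - network
  tasks:
    - name: Runs after every role has finished
      ansible.builtin.debug:
        msg: "tasks"
  post_tasks:
    - name: Runs last
      ansible.builtin.debug:
        msg: "post_tasks"
```

Moving a debug task between `pre_tasks` and `tasks` is a quick way to inspect state before and after the roles run.
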
* **General Debugging Principles:**
    * Always trust the user's direct experience and observations, even if they initially contradict assumptions or playbook output.
    * When a playbook reports success but the desired state isn't met, investigate deeper. Ansible's `changed` status can be misleading if the underlying application fails after the module reports success.
    * Use increased verbosity (`-vvv`) for detailed debugging output from Ansible.
    * Systematically verify each layer of the stack (container logs, host processes, host firewall, cloud firewall).
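
As a sketch of checking the result rather than trusting a green run, the tasks below inspect the service on the host after deployment. The container name `example_app` and port `8080` are placeholders, not values from this project.

```yaml
# Hypothetical post-deployment checks; the name and port are placeholders.
- name: Confirm the service port is actually listening on the host
  ansible.builtin.wait_for:
    port: 8080
    timeout: 30

- name: Collect recent container logs
  ansible.builtin.command: podman logs --tail 50 example_app
  register: app_logs
  changed_when: false   # read-only command, never report a change
  become: false         # rootless container, so run as the connecting user

- name: Show the collected logs
  ansible.builtin.debug:
    var: app_logs.stdout_lines
```

If the port check fails while the deployment task reported success, the application most likely exited after the module returned, and the logs are the next place to look.
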
* **Podman Specifics & Rootless Containers:**
    * Rootless Podman requires tasks managing user-specific files and containers to explicitly use `become: false`.
    * Using `~` (tilde) in paths for user home directories proved more robust than relying on `ansible_user_dir`. That fact reflects whichever user gathered facts, so it can resolve to `/root` when fact gathering runs with `become: true`.
    * Fully qualifying image names (e.g., `docker.io/portainer/portainer-ce`) prevents registry ambiguity issues and avoids interactive prompts.
    * Debugging container startup issues requires checking:
        * `podman ps -a` (to see all containers, running or exited).
        * `podman logs <container_name>` (to get application logs).
        * `sudo podman ps` (to check for rootful containers that might be interfering).
    * Orphaned `conmon` processes from failed container startups can block ports and require manual cleanup (`sudo kill <PID>`, `podman stop/rm`).
    * Ensure `registries.conf` is correctly templated (`ansible.builtin.template`, not `ansible.builtin.copy`) and placed in `~/.config/containers/registries.conf` for rootless Podman.
    * Verify Podman's actual listening port on the host with `sudo ss -tulnp | grep <port>` (or `lsof`).
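
The rootless points above might translate into tasks along these lines. This is a sketch under assumptions: the template file name `registries.conf.j2`, the container name, and the port mapping are hypothetical, and only the destination path and the image name come from the notes.

```yaml
# Sketch of rootless Podman tasks; template name, container name and ports are assumptions.
- name: Ensure the per-user containers config directory exists
  ansible.builtin.file:
    path: ~/.config/containers
    state: directory
    mode: "0755"
  become: false

- name: Template registries.conf for the rootless user
  ansible.builtin.template:
    src: registries.conf.j2
    dest: ~/.config/containers/registries.conf
    mode: "0644"
  become: false

- name: Run Portainer from a fully qualified image name
  containers.podman.podman_container:
    name: portainer
    image: docker.io/portainer/portainer-ce
    state: started
    ports:
      - "9000:9000"
  become: false
```

Keeping `become: false` on all three tasks means the `~` paths and the container all belong to the same connecting user.
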
* **Ansible Best Practices:**
    * **Idempotency is paramount:** Always strive for idempotent tasks that describe the desired state (e.g., `ansible.posix.firewalld`, `ansible.builtin.service`) rather than imperative shell commands.
    * Ensure all necessary Python libraries (`python3-firewall`) and system services (`firewalld`) are installed and running on target hosts *before* modules that depend on them are called.
    * Explicitly set `become: false` on tasks that should run as the connecting user, especially when the play has `become: true` by default.
    * The `ansible.builtin.template` module must be used for Jinja2 templates; `ansible.builtin.copy` does not process templates.
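
A possible ordering for the firewall dependencies is sketched below: install the package and Python bindings, start the service, and only then call the module that needs them. The choice of the `https` service is an assumption for illustration.

```yaml
# Sketch of dependency ordering; the opened service (https) is an assumption.
- name: Install firewalld and its Python bindings
  ansible.builtin.package:
    name:
      - firewalld
      - python3-firewall
    state: present
  become: true

- name: Ensure firewalld is running and enabled at boot
  ansible.builtin.service:
    name: firewalld
    state: started
    enabled: true
  become: true

- name: Allow HTTPS through the host firewall
  ansible.posix.firewalld:
    service: https
    permanent: true
    immediate: true
    state: enabled
  become: true
```

Each task describes a desired state, so a second run should report no changes.
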
* **Networking & Cloud Considerations:**
    * Host firewall (`firewalld`) rules are separate from cloud provider security rules (e.g., Oracle Cloud Network Security Groups/Security Lists). Both layers must be correctly configured.
    * Ansible playbooks typically cannot manage cloud provider firewalls without specific cloud collections (e.g., `oracle.oci`).
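
One way to tell the two firewall layers apart is to test the port from the control node, so the connection has to pass both the cloud rules and `firewalld`. This is a hedged sketch: port `443` is an assumed example, and the check only shows that something is blocking the path, not which layer it is.

```yaml
# Reachability check from the control node; port 443 is an assumed example.
- name: Check the port from outside the host
  ansible.builtin.wait_for:
    host: "{{ ansible_host | default(inventory_hostname) }}"
    port: 443
    timeout: 10
  delegate_to: localhost
  become: false
```

If this fails while the same port answers locally on the host, the cloud provider's security rules are the more likely cause.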