Two issues caused TLS to break on photos.carabosse.cloud over IPv6
(GrapheneOS + Immich app via Orange 5G NAT64):
1. Per-service vhosts listened only on IPv4 ('listen 443 ssl'). On IPv6,
nginx fell back to the implicit default server, i.e. the first vhost
loaded in alphabetical order, and served its certificate, which broke
hostname verification for every other vhost.
2. /etc/letsencrypt/{live,archive} were 0700 root:root after certbot
created them, so the nginx worker (user http on Arch) could not read
the chained intermediates and served the leaf-only chain.
Changes:
- Add a catch-all 00-default.conf as default_server on :80 and :443
(v4+v6) with a self-signed cert and 'return 444'. ACME challenges are
still answered on :80.
- Add IPv6 listeners ([::]:80 and [::]:443 ssl) to immich, gitea, ntfy,
uptime_kuma vhosts and to the temporary ACME provisioning vhost.
- Apply mode 0755 to /etc/letsencrypt/live and /etc/letsencrypt/archive
on every run, not only at initial cert provisioning.
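
For reference, the listener layout described above could look roughly
like the sketch below. This is not the deployed config: the ACME
webroot, the self-signed cert paths and the certbot paths in the
per-service vhost are assumptions for illustration.

```nginx
# 00-default.conf: sorts first in the include glob and is marked
# default_server explicitly for both address families.
server {
    listen 80 default_server;
    listen [::]:80 default_server;
    server_name _;

    # Assumed webroot for HTTP-01 challenges (hypothetical path).
    location /.well-known/acme-challenge/ {
        root /var/lib/letsencrypt;
    }

    # Anything else: close the connection without a response.
    location / {
        return 444;
    }
}

server {
    listen 443 ssl default_server;
    listen [::]:443 ssl default_server;
    server_name _;

    # Self-signed placeholder cert (hypothetical paths).
    ssl_certificate     /etc/nginx/ssl/default.crt;
    ssl_certificate_key /etc/nginx/ssl/default.key;

    return 444;
}

# Per-service vhost: IPv4 and IPv6 listeners side by side.
server {
    listen 443 ssl;
    listen [::]:443 ssl;
    server_name photos.carabosse.cloud;

    # Assumed standard certbot layout.
    ssl_certificate     /etc/letsencrypt/live/photos.carabosse.cloud/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/photos.carabosse.cloud/privkey.pem;

    # Proxy configuration omitted.
}
```

With an explicit default_server on each socket, a request whose
hostname matches no vhost gets the placeholder cert and a closed
connection instead of another service's certificate.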
The previous Type=oneshot + RemainAfterExit=true pattern left pod units
stuck in 'active (exited)' as soon as 'podman play kube' returned, so
crash-looping containers were invisible to
'systemctl --user --failed' and Restart=on-failure never fired.
For every podman-pod role (immich, fdroid, ntfy, gitea, qfieldcloud,
unifi, matrix, uptime_kuma):
- switch units to Type=notify + NotifyAccess=all
- run 'podman kube play --service-container=true' so the unit's main
PID stays alive as long as the pod
- use 'podman kube down' for ExecStop
- add TimeoutStartSec=180 to cover slow first-boot image pulls
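
A minimal sketch of a unit following the points above. The
description, the manifest path and the --replace flag are
illustrative assumptions, not necessarily what the roles template.

```ini
# Hypothetical user unit, e.g. ~/.config/systemd/user/immich-pod.service
[Unit]
Description=Immich pod

[Service]
Type=notify
NotifyAccess=all
# --replace (assumed here) recreates a leftover pod from a previous run.
ExecStart=/usr/bin/podman kube play --replace --service-container=true %h/pods/immich.yaml
ExecStop=/usr/bin/podman kube down %h/pods/immich.yaml
Restart=on-failure
TimeoutStartSec=180

[Install]
WantedBy=default.target
```

NotifyAccess=all is needed because the readiness notification comes
from a process other than the one systemd started directly.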
Pod manifests: flip every container's restartPolicy from Always to
Never. systemd is now the single owner of the restart loop: container
exits -> pod dies -> service container dies -> unit fails ->
Restart=on-failure restarts everything cleanly. With Always, podman
retried internally and hid the failure from systemd.
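
As a sketch of the manifest side, assuming the standard Kubernetes Pod
schema that 'podman kube play' reads. In that schema restartPolicy is
a field of the pod spec, so one line covers every container in the
pod. Pod and container names below are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: immich                # hypothetical pod name
spec:
  restartPolicy: Never        # previously: Always
  containers:
    - name: server            # hypothetical container name
      image: ghcr.io/immich-app/immich-server:release
```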
CLAUDE.md updated to document the new canonical template and the
'restartPolicy: Never' requirement.