Docker Compose Login Fails with 500 After Reboot
This runbook captures the recurring auth failures we see right after bringing the Docker Compose stack up on a freshly rebooted dev host.
What Happens
- Proton generates fresh 2048-bit DH params on startup and binds HTTPS ports only after that work finishes.
- While Proton is still generating, core continuously restarts because the
migrations cannot reach Proton on
:9440. - The web UI and Kong are up, but every
/auth/login(and the JWKS fetch) hits core while it is still restarting, resulting in HTTP 500 responses.
How to Recognize It
docker compose psshowsserviceradar-protonasunhealthyandserviceradar-corestuck inhealth: starting.docker compose logs core | tailcontains lines like:Fatal error: database error: failed to run database migrations:
failed to create migrations table: dial tcp 172.18.0.7:9440:
connect: connection refuseddocker compose logs proton --tail 20is stuck on:[Proton Init] Generating DH parameters (this may take a few minutes for security)...
Mitigation
- The
proton-init.shentrypoint now caches the generated DH params under/var/lib/proton/dhparam.pem, so you only pay the cost once per volume. Make sure your compose stack is up to date:The first start may take 6–10 minutes while the params are created; after that, restarts reuse the cached file and come up immediately.docker compose up -d proton - While waiting, you can watch for the container to flip to
healthy:docker compose ps proton core - Once Proton is healthy, core finishes migrations on the next attempt and the login endpoint starts returning 200s. No manual restarts are required.
UI Port Reminder
By default the compose nginx binds to host port 80. If the port is already
occupied on your host, either stop the conflicting service or set
SERVICERADAR_HTTP_PORT=<alternate> before running docker compose up.
The common culprit is the distro’s own nginx service; disable it once and the
stack will reuse port 80 every boot:
sudo systemctl disable --now nginx
docker compose up -d nginx
Longer-Term Fix Ideas
- Teach the Proton entrypoint to reuse DH params stored in the shared
cert-datavolume so every restart does not regenerate them. - Consider precomputing the DH params as part of the image build to shorten the first-boot delay.
- Update the web UI to surface a clearer “core still starting” banner instead of surfacing raw 500 errors during the warm-up window.