fix(dogfood/coder): tolerate stale gh auth state (#23588)

## Problem

The dogfood startup script uses `gh auth status` to decide whether to
re-authenticate the GitHub CLI. That command exits non-zero when **any**
stored credential is invalid—even if Coder external auth already injects
a working `GITHUB_TOKEN` into the environment and `gh` commands work
fine.

On workspaces with a persistent home volume, `~/.config/gh/hosts.yml`
retains OAuth tokens written by previous `gh auth login --with-token`
calls. These tokens are issued by Coder's external auth integration and
can be rotated or revoked between workspace starts, but the copy in
`hosts.yml` persists on the volume. When the stored token goes stale,
`gh auth status` reports two credential entries for the same account:

```
✓ Logged in to github.com account user (GITHUB_TOKEN)           ← works fine
✗ Failed to log in to github.com account user (hosts.yml)       ← stale token
```

The command exits 1 because of the stale entry, even though `gh` API
calls succeed via `GITHUB_TOKEN`. This makes the auth state
**indeterminate** from `gh auth status` alone: its exit code cannot tell
you whether `gh` actually works.
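
A minimal sketch of how the mismatch could be observed, assuming `gh` is on the `PATH`. The helper name `gh_auth_report` is hypothetical and not part of the startup script; it only runs the two commands discussed above and prints their exit codes side by side.

```shell
# Hypothetical helper (not in the startup script): run both checks and
# print their exit codes. The body is a subshell so no variables leak.
gh_auth_report() (
  status_rc=0
  api_rc=0
  # Exits non-zero if any stored credential is invalid.
  gh auth status >/dev/null 2>&1 || status_rc=$?
  # Exits non-zero only if the API request itself fails.
  gh api user --jq .login >/dev/null 2>&1 || api_rc=$?
  echo "auth-status=${status_rc} api=${api_rc}"
)
```

On an affected workspace, one would expect a non-zero `auth-status` value next to `api=0`.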

When the script enters the login branch:

1. `gh auth login --with-token` **refuses** to accept piped input when
   `GITHUB_TOKEN` is already set in the environment, and exits 1.
2. `set -e` kills the script before it reaches
   `sudo service docker start`.

The result: Docker never starts, devcontainer health checks fail, and
the workspace reports a startup error—all because of a stale GitHub CLI
credential that has no bearing on workspace functionality.
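
The failure chain above can be reproduced in isolation. The following is a sketch under stated assumptions, not the startup script itself: `false` stands in for the failing `gh auth login`, and the `echo "docker started"` line stands in for the later Docker step. The variable names `unguarded` and `guarded` are illustrative only.

```shell
# `false` stands in for the failing `gh auth login`; no real gh call is made.
# Each snippet runs in its own child shell so `set -e` cannot affect the caller.
unguarded='set -e; false; echo "docker started"'
guarded='set -e; if ! false; then echo "login failed"; fi; echo "docker started"'

# Aborts at `false`: prints nothing and exits non-zero.
sh -c "$unguarded" || echo "unguarded exited with $?"

# Commands tested by `if` are exempt from `set -e`, so the script continues.
sh -c "$guarded"
```

The first invocation never reaches its final `echo`, which mirrors Docker never starting. The second reports the failure and carries on.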

## Fix

- Switch the auth guard from `gh auth status` to
  `gh api user --jq .login`, which tests whether GitHub API access
  actually works regardless of which credential provides it.
- Wrap the fallback `gh auth login` so a failure logs the indeterminate
  state but does not abort the script.
Ethan
2026-03-26 17:25:42 +11:00
committed by GitHub
parent 61e31ec5cc
commit 411714cd73
+9 -4
@@ -579,12 +579,17 @@ resource "coder_agent" "dev" {
 trap cleanup EXIT
 coder exp sync start agent-startup
-# Authenticate GitHub CLI
-if ! gh auth status >/dev/null 2>&1; then
+# Authenticate GitHub CLI. `gh api user` is used instead of `gh auth
+# status` because the latter exits non-zero when a stale token exists
+# in ~/.config/gh/hosts.yml, even when a valid GITHUB_TOKEN is already
+# present in the environment and gh commands work fine.
+if ! gh api user --jq .login >/dev/null 2>&1; then
   echo "Logging into GitHub CLI…"
-  coder external-auth access-token github | gh auth login --hostname github.com --with-token
+  if ! coder external-auth access-token github | gh auth login --hostname github.com --with-token; then
+    echo "GitHub CLI authentication failed; gh commands may not work."
+  fi
 else
-  echo "Already logged into GitHub CLI."
+  echo "GitHub CLI already has working credentials."
 fi
 # Configure Mux GitHub owner login for browser access (skip if
 # already set). See: https://mux.coder.com/config/server-access