mirror of
https://github.com/coder/coder.git
synced 2026-06-03 04:58:23 +00:00
584c61acb5
## Problem

Workspaces showed as "Healthy" immediately after creation while the agent was still downloading, starting, or connecting. If the agent never connected, the workspace stayed "Healthy" for the entire connection timeout (~120s), then abruptly flipped to "Unhealthy".

## Root cause

In `db2sdk.WorkspaceAgent`, the health switch had no case for `WorkspaceAgentConnecting`. Agents in `connecting` status with a non-`off` lifecycle (e.g. `created` after a fresh build) fell through to the `default` case and were marked `Healthy = true`.

## Fix

Add an explicit case for `WorkspaceAgentConnecting` that sets `Healthy = false` with reason `"agent has not yet connected"`. The case is placed after the existing `!connected + off` case (which correctly catches stopped agents as "not running") and before the `timeout`/`disconnected` cases.

```
Status + Lifecycle              → Health reason
──────────────────────────────────────────────────────
any !connected + off            → "agent is not running"
connecting + created/starting   → "agent has not yet connected"  ← NEW
timeout + any                   → "agent is taking too long to connect"
disconnected + any              → "agent has lost connection"
connected + start_error         → "agent startup script exited with an error"
connected + shutting_down       → "agent is shutting down"
connected + ready/starting      → healthy
```

The frontend already handles this case: `getAgentHealthIssue()` returns "Workspace agent is still connecting" with `severity: "info"` for unhealthy workspaces with connecting agents.

## Test changes

- **Healthy test**: now actually connects the agent via `agenttest.New` before asserting health (previously passed due to the bug).
- **New Connecting test**: verifies a never-connected agent is correctly marked unhealthy.
- **Mixed health test**: connects a1 and waits for the mixed state (`a1.Healthy && !workspace.Healthy`) to avoid a race where both agents are initially connecting.
- **Sub-agent excluded test**: connects the parent agent and waits for it to be healthy before creating the sub-agent.
- **TestWorkspaceAgent/Connect**: flipped assertion to `Health.Healthy == false` for a `dbfake` agent that never connects.

<details>
<summary>Review notes</summary>

### Known follow-up

The `healthy:false` workspace search filter maps to `[disconnected, timeout]` and does not include `connecting`. This is a pre-existing gap that is now more consequential: a workspace unhealthy solely due to a connecting agent won't appear in `healthy:false` results. Worth a follow-up issue.

### Deep review findings addressed

| Finding | Severity | Status |
|---------|----------|--------|
| Mixed health test race (all 3 reviewers) | P2 | Fixed: tightened `Eventually` condition |
| `TestWorkspaceAgent/Connect` assertion break | P1 | Fixed: flipped assertion |
| CLI renders red for connecting agents | Obs | Acknowledged: design trade-off, accurate but visually strong for transient state |
| Switch case ordering overlap | Obs | Documented with inline comment |

</details>

> 🤖 This PR was created with the help of Coder Agents, and needs a human review. 🧑‍💻
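
For reviewers who want to check the case ordering described under "Fix", here is a minimal, self-contained sketch of the decision table. It is not the actual `db2sdk.WorkspaceAgent` code: the type names, constants, and the `agentHealth` helper are hypothetical stand-ins, and only the status/lifecycle combinations and reason strings are taken from the description above.

```go
package main

import "fmt"

// AgentStatus and AgentLifecycle are simplified, hypothetical stand-ins
// for the real SDK types; they exist only to make this sketch runnable.
type AgentStatus string
type AgentLifecycle string

const (
	StatusConnecting   AgentStatus = "connecting"
	StatusConnected    AgentStatus = "connected"
	StatusTimeout      AgentStatus = "timeout"
	StatusDisconnected AgentStatus = "disconnected"
)

const (
	LifecycleCreated      AgentLifecycle = "created"
	LifecycleStarting     AgentLifecycle = "starting"
	LifecycleReady        AgentLifecycle = "ready"
	LifecycleStartError   AgentLifecycle = "start_error"
	LifecycleShuttingDown AgentLifecycle = "shutting_down"
	LifecycleOff          AgentLifecycle = "off"
)

// agentHealth (hypothetical helper) evaluates the cases in the same
// order as the table: the first matching case wins.
func agentHealth(status AgentStatus, lifecycle AgentLifecycle) (healthy bool, reason string) {
	switch {
	case status != StatusConnected && lifecycle == LifecycleOff:
		// Stopped agents are caught first, whatever their status.
		return false, "agent is not running"
	case status == StatusConnecting:
		// The new case: before the fix this fell through to default.
		return false, "agent has not yet connected"
	case status == StatusTimeout:
		return false, "agent is taking too long to connect"
	case status == StatusDisconnected:
		return false, "agent has lost connection"
	case lifecycle == LifecycleStartError:
		return false, "agent startup script exited with an error"
	case lifecycle == LifecycleShuttingDown:
		return false, "agent is shutting down"
	default:
		return true, ""
	}
}

func main() {
	// A freshly built workspace whose agent has not connected yet.
	healthy, reason := agentHealth(StatusConnecting, LifecycleCreated)
	fmt.Printf("%t %q\n", healthy, reason) // false "agent has not yet connected"

	// A stopped workspace: the off case takes precedence over connecting.
	healthy, reason = agentHealth(StatusConnecting, LifecycleOff)
	fmt.Printf("%t %q\n", healthy, reason) // false "agent is not running"
}
```

The ordering matters because the `off` and `connecting` conditions overlap: with the `off` check first, a stopped agent that still reports `connecting` is labelled "agent is not running" rather than "agent has not yet connected".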