coder/codersdk/disconnect_test.go
Steven Masley 1afc6d4fd0 feat: structured disconnect attribution for agent logs (#25191)
Implements
[PLAT-60](https://linear.app/codercom/issue/PLAT-60/enhance-disconnect-logs-with-structured-reason-attribution):
adds structured disconnect attribution to disconnect logs throughout the
agent and tailnet packages.

Every disconnect log site now carries structured slog fields. No
existing log line is removed and no message text changes; the new
fields are added alongside the existing message.

New fields on disconnect log lines:

- `connect_type` — which layer disconnected: `server_to_agent`,
`agent_to_client`, or `client_to_server`
- `disconnect_reason` — categorical reason: `graceful`, `network_error`,
`server_shutdown`, etc.
- `disconnect_expected` — whether the disconnect is normal operation
(`true`) or should be investigated (`false`)
- `disconnect_initiator` — who started it: `client`, `agent`, `server`,
or `network` (control-plane sites only)
- `disconnect_detail` — free-form supplemental info (where useful)

## What's covered

**Control plane (`server_to_agent`):** coordination RPC, DERP map
subscriber, agent runLoop, agent Close, `BasicCoordination.Close`,
`Controller.run`.

**Data plane (`agent_to_client`):** SSH sessions, reconnecting PTY,
JetBrains port-forwarding.

<details>
<summary>Control-plane sites</summary>

| Site | Reason | Initiator |
|---|---|---|
| `agent/agent.go` `runLoop` EOF | `network_error` | `network` |
| `agent/agent.go` `runCoordinator` deferred exit | `server_shutdown` / `graceful` / `network_error` | `agent` / `server` / `network` |
| `agent/agent.go` `runDERPMapSubscriber` deferred exit | same (shared `classifyCoordinatorRPCExit`) | same |
| `agent/agent.go` `Close` shutdown timeout | `server_shutdown` + detail | `agent` |
| `agent/agent.go` `Close` clean coord disconnect | `server_shutdown` | `agent` |
| `tailnet/controllers.go` `BasicCoordination.Close` | `graceful` or `network_error` | `c.initiator` |
| `tailnet/controllers.go` `Controller.run` `net.ErrClosed` | `network_error` | `network` |

</details>

<details>
<summary>Data-plane sites</summary>

| Site | Reason | Notes |
|---|---|---|
| `agent/agentssh/agentssh.go` SSH session closed | free-form (`graceful`, `process exited with error status: N`, etc.) | Also sets `closeCause("normal exit")` for clean exits so coderd's `connection_log.DisconnectReason` is no longer empty |
| `agent/reconnectingpty/server.go` PTY closed | `server_shutdown`, error string, or `graceful` | |
| `agent/agentssh/jetbrainstrack.go` channel closed | `normal close` or error string | Previously passed empty reason |

</details>

<details>
<summary>Bug fix</summary>

The deferred `disconnected from coordination RPC` log no longer fires
when the initial `Coordinate()` RPC call fails before any connection is
established.

</details>

Refs PLAT-60.

---

_This PR was prepared by Coder Agents on behalf of @Emyrk._
**Manually QA'd a lot of common disconnects**

---------

Co-authored-by: Coder Agents <noreply@coder.com>
2026-05-19 09:47:03 -05:00


package codersdk_test

import (
	"testing"

	"github.com/stretchr/testify/require"

	"github.com/coder/coder/v2/codersdk"
)

func TestDisconnectReason_Valid(t *testing.T) {
	t.Parallel()

	cases := []struct {
		reason codersdk.DisconnectReason
		valid  bool
	}{
		{codersdk.DisconnectReasonUnknown, true},
		{codersdk.DisconnectReasonGraceful, true},
		{codersdk.DisconnectReasonClientClosed, true},
		{codersdk.DisconnectReasonServerShutdown, true},
		{codersdk.DisconnectReasonNetworkError, true},
		{codersdk.DisconnectReasonProtocolError, true},
		{codersdk.DisconnectReasonWorkspaceStopped, true},
		{codersdk.DisconnectReasonControlPlaneLost, true},
		{codersdk.DisconnectReason("not_a_real_reason"), false},
	}
	for _, c := range cases {
		require.Equal(t, c.valid, c.reason.Valid(), "reason=%q", c.reason)
	}
}

func TestDisconnectReason_Expected(t *testing.T) {
	t.Parallel()

	expected := map[codersdk.DisconnectReason]bool{
		codersdk.DisconnectReasonGraceful:         true,
		codersdk.DisconnectReasonClientClosed:     true,
		codersdk.DisconnectReasonServerShutdown:   true,
		codersdk.DisconnectReasonWorkspaceStopped: true,
		codersdk.DisconnectReasonUnknown:          false,
		codersdk.DisconnectReasonNetworkError:     false,
		codersdk.DisconnectReasonProtocolError:    false,
		codersdk.DisconnectReasonControlPlaneLost: false,
	}
	for reason, want := range expected {
		require.Equal(t, want, reason.Expected(), "reason=%q", reason)
	}

	// Unknown values default to not-expected so that uncategorized
	// emit sites surface in the "investigate" bucket.
	require.False(t, codersdk.DisconnectReason("not_a_real_reason").Expected())
}

func TestDisconnectInitiator_Valid(t *testing.T) {
	t.Parallel()

	cases := []struct {
		initiator codersdk.DisconnectInitiator
		valid     bool
	}{
		{codersdk.DisconnectInitiatorUnknown, true},
		{codersdk.DisconnectInitiatorClient, true},
		{codersdk.DisconnectInitiatorAgent, true},
		{codersdk.DisconnectInitiatorServer, true},
		{codersdk.DisconnectInitiatorNetwork, true},
		{codersdk.DisconnectInitiator("nobody"), false},
	}
	for _, c := range cases {
		require.Equal(t, c.valid, c.initiator.Valid(), "initiator=%q", c.initiator)
	}
}

func TestConnectionMethod_Valid(t *testing.T) {
	t.Parallel()

	cases := []struct {
		method codersdk.ConnectionMethod
		valid  bool
	}{
		{codersdk.ConnectionMethodUnknown, true},
		{codersdk.ConnectionMethodDirect, true},
		{codersdk.ConnectionMethodDERP, true},
		{codersdk.ConnectionMethod("magic"), false},
	}
	for _, c := range cases {
		require.Equal(t, c.valid, c.reason(), "method=%q", c.method)
	}
}
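
For readers without the `codersdk` source at hand, the following is a hypothetical reconstruction of a `DisconnectReason` type that would satisfy the two reason tests above. Only `graceful`, `network_error`, and `server_shutdown` appear as string values in the commit message; every other string value, and the switch-based implementation itself, is an assumption rather than the package's actual code.

```go
package main

import "fmt"

// DisconnectReason is a hypothetical reconstruction of the type under test.
type DisconnectReason string

const (
	DisconnectReasonUnknown          DisconnectReason = "unknown" // assumed value
	DisconnectReasonGraceful         DisconnectReason = "graceful"
	DisconnectReasonClientClosed     DisconnectReason = "client_closed" // assumed value
	DisconnectReasonServerShutdown   DisconnectReason = "server_shutdown"
	DisconnectReasonNetworkError     DisconnectReason = "network_error"
	DisconnectReasonProtocolError    DisconnectReason = "protocol_error"     // assumed value
	DisconnectReasonWorkspaceStopped DisconnectReason = "workspace_stopped"  // assumed value
	DisconnectReasonControlPlaneLost DisconnectReason = "control_plane_lost" // assumed value
)

// Valid reports whether r is one of the known reasons.
func (r DisconnectReason) Valid() bool {
	switch r {
	case DisconnectReasonUnknown, DisconnectReasonGraceful,
		DisconnectReasonClientClosed, DisconnectReasonServerShutdown,
		DisconnectReasonNetworkError, DisconnectReasonProtocolError,
		DisconnectReasonWorkspaceStopped, DisconnectReasonControlPlaneLost:
		return true
	}
	return false
}

// Expected reports whether r is part of normal operation. Anything not
// listed, including unrecognized values, falls through to false so it
// lands in the "investigate" bucket.
func (r DisconnectReason) Expected() bool {
	switch r {
	case DisconnectReasonGraceful, DisconnectReasonClientClosed,
		DisconnectReasonServerShutdown, DisconnectReasonWorkspaceStopped:
		return true
	}
	return false
}

func main() {
	reasons := []DisconnectReason{
		DisconnectReasonGraceful,
		DisconnectReasonNetworkError,
		"not_a_real_reason",
	}
	for _, r := range reasons {
		fmt.Printf("%s valid=%t expected=%t\n", r, r.Valid(), r.Expected())
	}
}
```

Defaulting to `false` in `Expected` is the design choice the test comment calls out: a reason nobody has categorized yet is treated as worth investigating rather than silently considered normal.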