Files
coder/coderd
Kyle Carberry b3d5b8d13c fix: stabilize flaky chatd subscribe/promote queued tests (#23816)
## Summary

Fixes three flaky chatd tests that intermittently fail due to timing
races with the background run loop.

Closes coder/internal#1428

## Root Cause

`CreateChat` and `PromoteQueued` call `signalWake()` which writes to
`wakeCh`, triggering `processOnce` immediately. Even though
`newTestServer` sets `PendingChatAcquireInterval: testutil.WaitLong` to
prevent ticker-based polling, the wake channel bypasses this. This
causes `processOnce` to acquire and process the chat concurrently with
the test's manual DB updates and assertions.

### Failing tests

| Test | Failure | Cause |
|------|---------|-------|
| `TestPromoteQueuedAllowsAlreadyQueuedMessageWhenUsageLimitReached` |
`expected: "pending", actual: "running"` | Wake from `CreateChat` races
with manual `UpdateChatStatus`; wake from `PromoteQueued` acquires the
chat before the status assertion |
| `TestSendMessageInterruptBehaviorQueuesAndInterruptsWhenBusy` |
`should have 1 item(s), but has 2` | Wake from `CreateChat` triggers
`processChat` which auto-promotes a queued message, adding an extra row
to `chat_messages` |
| `TestSubscribeNoPubsubNoDuplicateMessageParts` | `Condition satisfied`
(duplicate events) | Pre-existing `WaitGroup.Add/Wait` race in the
`Eventually` + `WaitUntilIdleForTest` pattern |

## Fix

Introduces a `waitForChatProcessed` helper that:
1. Polls until the chat reaches a **terminal state** (not pending AND
not running)
2. Then calls `WaitUntilIdleForTest` to wait for the inflight
`WaitGroup`

Waiting for a terminal state (not just "not pending") avoids a
`sync.WaitGroup` `Add/Wait` race: `AcquireChats` updates the DB status
to `running` **before** `processOnce` calls `inflight.Add(1)`. Checking
only `status != pending` could return while `Add(1)` hasn't happened
yet, causing `Wait()` to return prematurely.

### Per-test changes

- **`TestSendMessageInterruptBehaviorQueuesAndInterruptsWhenBusy`**:
Call `waitForChatProcessed` after `CreateChat` before manually setting
running status
-
**`TestPromoteQueuedAllowsAlreadyQueuedMessageWhenUsageLimitReached`**:
Call `waitForChatProcessed` after `CreateChat`; remove the inherently
racy `status == pending` assertion after `PromoteQueued` (the wake
immediately acquires the chat). Key assertions on promoted message,
queue state, and message count remain.
- **`TestSubscribeNoPubsubNoDuplicateMessageParts`**: Replace inline
`Eventually` with the safer `waitForChatProcessed` helper

## Verification

All three tests pass 150 consecutive executions with `-race -count=10`
across 15 runs (0 failures).
2026-03-30 18:23:47 +00:00
..