fix: resolve idle timeout recording test flake on macOS (#24240)

Fixes https://github.com/coder/internal/issues/1461

Two synchronization issues caused
`TestPortableDesktop_IdleTimeout_StopsRecordings` (and the
`MultipleRecordings` variant) to flake on macOS CI:

1. **`clk.Advance(idleTimeout)` was not awaited.** In
`MultipleRecordings`, both idle timers fire simultaneously but their
`fire()` goroutines race to remove themselves from the mock clock's
event list. Without `MustWait`, the second timer may still be in `m.all`
when the next `Advance` is called, causing `"cannot advance ... beyond
next timer/ticker event in 0s"`.

2. **The test depended on SIGINT being handled promptly.** After the
`stop_timeout` timer was released, the test relied entirely on the shell
process handling SIGINT (via `rec.done`). On macOS, `/bin/sh` may not
interrupt `wait` reliably, leaving `lockedStopRecordingProcess` blocked
in its `select` while holding `p.mu` — deadlocking the
`require.Eventually` callback.

### Fix

Wait for each `Advance` to complete and advance past the 15s stop
timeout so the process is forcibly killed via the timer path,
independent of signal handling.

Verified with 1000 iterations (500 per test) with zero failures.

> Generated with [Coder Agents](https://coder.com/agents)
This commit is contained in:
Kyle Carberry
2026-04-10 14:25:12 -04:00
committed by GitHub
parent a62ead8588
commit 8dff1cbc57
@@ -812,12 +812,18 @@ func TestPortableDesktop_IdleTimeout_StopsRecordings(t *testing.T) {
stopTrap := clk.Trap().NewTimer("agentdesktop", "stop_timeout")
// Advance past idle timeout to trigger the stop-all.
clk.Advance(idleTimeout)
clk.Advance(idleTimeout).MustWait(ctx)
// Wait for the stop timer to be created, then release it.
stopTrap.MustWait(ctx).MustRelease(ctx)
stopTrap.Close()
// Advance past the 15s stop timeout so the process is
// forcibly killed. Without this the test depends on the real
// shell handling SIGINT promptly, which is unreliable on
// macOS CI runners (the flake in #1461).
clk.Advance(15 * time.Second).MustWait(ctx)
// The recording process should now be stopped.
require.Eventually(t, func() bool {
pd.mu.Lock()
@@ -939,11 +945,17 @@ func TestPortableDesktop_IdleTimeout_MultipleRecordings(t *testing.T) {
stopTrap := clk.Trap().NewTimer("agentdesktop", "stop_timeout")
// Advance past idle timeout.
clk.Advance(idleTimeout)
clk.Advance(idleTimeout).MustWait(ctx)
// Wait for both stop timers.
// Each idle monitor goroutine serializes on p.mu, so the
// second stop timer is only created after the first stop
// completes. Advance past the 15s stop timeout after each
// release so the process is forcibly killed instead of
// depending on SIGINT (unreliable on macOS — see #1461).
stopTrap.MustWait(ctx).MustRelease(ctx)
clk.Advance(15 * time.Second).MustWait(ctx)
stopTrap.MustWait(ctx).MustRelease(ctx)
clk.Advance(15 * time.Second).MustWait(ctx)
stopTrap.Close()
// Both recordings should be stopped.