8cfb294291
## Flake Fix

Resolves https://github.com/coder/internal/issues/1301

`TestAIBridgeListInterceptions/Pagination/offset` flakes with a 500 caused by `runtime error: integer divide by zero` in `pq.ParseTimestamp` (encode.go:430) during `GetAPIKeyByID` in the auth middleware.

### Root Cause

**PostgreSQL historical timezone formatting + fragile pq parser:**

1. **Year-0001 timestamps trigger unusual PostgreSQL formatting.** New API keys were initialized with `LastUsed: time.Time{}` (year 0001-01-01). When the PostgreSQL server timezone is non-UTC, it applies historical Local Mean Time (LMT) offsets for pre-1900 dates. For year 0001, this can produce timestamps with seconds in the timezone offset like `0001-12-31 19:03:58-04:56:02`, a format the pq parser was never designed to handle.

2. **The pq parser panics on unexpected formats.** The fractional-seconds parser at encode.go:430 computes `fracOff` via `strings.IndexAny`. When the timestamp has an unusual LMT format, index arithmetic can produce `fracOff ≤ 0`, causing `int(math.Pow(10, float64(negative))) = 0` → divide-by-zero panic.

3. **Why it is intermittent:** CI Postgres instances may have varying timezone configs across runs. The pagination test makes 80+ API calls, each reading `last_used` via `GetAPIKeyByID`, increasing the probability of hitting the edge case.

4. **Ruled out pq race condition.** The decode path copies bytes to a Go string via `string(s)` before `ParseTimestamp`, so buffer reuse cannot corrupt the input.

### Fix

Initialize `LastUsed` to `time.Unix(0, 0).UTC()` (Unix epoch, 1970-01-01) instead of `time.Time{}` (year 0001). This avoids the entire class of historical timestamp formatting edge cases.

**Why not `dbtime.Now()`?** The auth middleware debounces `LastUsed` updates: it only writes when `now.Sub(key.LastUsed) > time.Hour`. Using `dbtime.Now()` makes the key appear freshly used so the debounce never triggers, breaking `TestPostUsers/LastSeenAt` and `TestUsersFilter/LastSeenBeforeNow`. The Unix epoch is always >1 hour in the past, so debounce works correctly.

### Follow-up

A defensive fix should also be added to the `coder/pq` fork (guard `fracOff ≤ 0` before the division in `ParseTimestamp`). Other year-0001 sentinel values exist across the codebase (`workspace_builds.deadline`, `users.last_seen_at`, `workspaces.last_used_at`, etc.) and remain theoretically vulnerable until the pq fork is hardened.
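
The defensive guard mentioned in the Follow-up could look roughly like the sketch below. This is not the `coder/pq` code: `nanosFromFraction`, its parameters, and the terminator set `"+-Z "` are assumptions made for illustration, and an integer loop replaces `math.Pow` to keep the scale computation exact. The point is only that a non-positive digit count is rejected with an error before any division happens.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// nanosFromFraction is a hypothetical helper. It parses the fractional-seconds
// digits that start at fracStart in str and returns them as nanoseconds.
func nanosFromFraction(str string, fracStart int) (int, error) {
	if fracStart < 0 || fracStart > len(str) {
		return 0, fmt.Errorf("fraction start %d out of range", fracStart)
	}
	rest := str[fracStart:]

	// Find where the digits end (assumed terminators: offset sign, Z, space).
	fracOff := strings.IndexAny(rest, "+-Z ")
	if fracOff < 0 {
		fracOff = len(rest)
	}

	// Guard: zero digits would make the scale division meaningless, and more
	// than nine digits exceeds nanosecond precision.
	if fracOff <= 0 || fracOff > 9 {
		return 0, fmt.Errorf("invalid fractional seconds in %q", str)
	}

	fracSec, err := strconv.Atoi(rest[:fracOff])
	if err != nil {
		return 0, fmt.Errorf("invalid fractional seconds in %q: %w", str, err)
	}

	// scale = 10^fracOff, computed with integers so it can never be zero.
	scale := 1
	for i := 0; i < fracOff; i++ {
		scale *= 10
	}
	return fracSec * (1_000_000_000 / scale), nil
}

func main() {
	// The fraction starts at index 20, right after the '.'.
	fmt.Println(nanosFromFraction("2024-01-02 03:04:05.123+00", 20))

	// No digits before the offset sign: returns an error instead of panicking.
	_, err := nanosFromFraction("2024-01-02 03:04:05.-04:56:02", 20)
	fmt.Println(err != nil)
}
```

Returning an error keeps a malformed timestamp from taking down the request with a panic; the caller can then surface it as an ordinary scan failure.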