mirror of
https://github.com/coder/coder.git
synced 2026-06-02 20:48:20 +00:00
15f2fa55c6
## Summary

Adds a process-wide cache for three hot database queries in `chatd` that were hitting Postgres on **every chat turn** despite returning rarely-changing configuration data:

| Query | Before (50k turns) | After | Reduction |
|---|---|---|---|
| `GetEnabledChatProviders` | ~98.6k calls | ~500-1000 | ~99% |
| `GetChatModelConfigByID` | ~49.2k calls | ~500-1000 | ~98% |
| `GetUserChatCustomPrompt` | ~46.7k calls | ~1000-2000 | ~97% |

These were identified via `coder exp scaletest chat` (5000 concurrent chats × 10 turns) as the dominant source of Postgres load during chat processing.

## Design

Follows the established **webpush subscription cache pattern** (`coderd/webpush/webpush.go`):

- `sync.RWMutex` + `tailscale.com/util/singleflight` (generic) + generation-based stale prevention + TTL
- 10s TTL for provider/model config, 5s TTL for user prompts
- Negative caching for `sql.ErrNoRows` on user prompts (the common case: most users don't set custom prompts)
- Deep-clones `ChatModelConfig.Options` (`json.RawMessage` = `[]byte`) on both store and read paths

### Invalidation

Single pubsub channel (`chat:config_change`) with kind discriminator for cross-replica cache invalidation. Seven publish points in `coderd/chats.go` cover all admin mutation endpoints (create/update/delete for providers and model configs, put for user prompts).

_This PR was generated with mux and was reviewed by a human_