perf(coderd): reduce duplicated reads in push and webpush paths (#23115)

## Background

A 5000-chat scaletest (~50k turns, ~2m45s wall time) completed successfully,
but the main bottleneck was **DB pool starvation from repeated reads**, not
individually expensive SQL. The push/webpush path showed a few especially
noisy reads:

- `GetLastChatMessageByRole` for push body generation
- `GetEnabledChatProviders` + `GetChatModelConfigByID` for push summary model
  resolution
- `GetWebpushSubscriptionsByUserID` for every webpush dispatch

This PR keeps the optimizations that remove those duplicate reads while
leaving stream behavior unchanged.

## What changes in this PR

### 1. Reuse resolved chat state for push notifications

`maybeSendPushNotification` used to re-read the last assistant message and
re-resolve the chat model/provider after `runChat` had already done that work.

Now `runChat` returns the final assistant text plus the already-resolved model
and provider keys, and the push goroutine uses that state directly.

That removes the extra push-path reads for:

- `GetLastChatMessageByRole`
- the second `resolveChatModel` path
- the provider/model lookups that came with that second resolution
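
A minimal sketch of the idea, not the code in this PR: the chat run hands back what it already resolved, and the push path builds its body from that value instead of querying again. The names `runResult`, `pushBody`, and `fallbackPushBody`, the fallback text, and the stubbed `runChat` body are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// runResult is a hypothetical container for the state the chat run has
// already resolved by the time it finishes.
type runResult struct {
	FinalAssistantText string
	ProviderKey        string
	ModelKey           string
}

// fallbackPushBody is an assumed placeholder, not the real fallback text.
const fallbackPushBody = "Your chat has finished."

// runChat stands in for the real chat loop. Here it only echoes the prompt
// and reports fixed, made-up provider and model keys.
func runChat(prompt string) runResult {
	return runResult{
		FinalAssistantText: "Echo: " + prompt,
		ProviderKey:        "example-provider",
		ModelKey:           "example-model",
	}
}

// pushBody derives the notification body from the run result alone, so the
// push path needs no extra database read. Empty assistant text falls back
// to the default body.
func pushBody(res runResult) string {
	text := strings.TrimSpace(res.FinalAssistantText)
	if text == "" {
		return fallbackPushBody
	}
	return text
}

func main() {
	res := runChat("hello")
	fmt.Println(pushBody(res))         // body built from the run result
	fmt.Println(pushBody(runResult{})) // no assistant text: fallback body
}
```

Passing the result by value keeps the push goroutine independent of any later writes to the chat.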

### 2. Cache webpush subscriptions during dispatch

`Dispatch()` previously hit `GetWebpushSubscriptionsByUserID` on every push. A
small per-user in-memory cache now avoids those repeated reads.

The follow-up fix keeps that optimization correct: `InvalidateUser()` bumps a
per-user generation so an older in-flight fetch cannot repopulate the cache
with pre-mutation data after subscribe/unsubscribe.

That preserves the cache win without letting local subscription changes be
silently overwritten by stale fetch results.
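
The generation guard can be sketched roughly as follows. This is an illustration under stated assumptions, not the dispatcher's actual code: the `subCache` type, its fields, and the `fetch` callback are hypothetical, subscriptions are reduced to endpoint strings, and the TTL is left out to keep the example short.

```go
package main

import (
	"fmt"
	"sync"
)

// subCache is a hypothetical per-user cache of subscription endpoints.
type subCache struct {
	mu      sync.Mutex
	entries map[string][]string // userID -> cached endpoints
	gens    map[string]uint64   // userID -> invalidation generation
}

func newSubCache() *subCache {
	return &subCache{
		entries: make(map[string][]string),
		gens:    make(map[string]uint64),
	}
}

// InvalidateUser drops the cached entry and bumps the user's generation so
// that any fetch started before this call cannot store its result.
func (c *subCache) InvalidateUser(userID string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.entries, userID)
	c.gens[userID]++
}

// Get returns the cached endpoints or calls fetch. The fetched value is
// cached only if no invalidation happened while fetch was running.
func (c *subCache) Get(userID string, fetch func() []string) []string {
	c.mu.Lock()
	if subs, ok := c.entries[userID]; ok {
		c.mu.Unlock()
		return subs
	}
	gen := c.gens[userID] // remember the generation before fetching
	c.mu.Unlock()

	subs := fetch() // slow read happens outside the lock

	c.mu.Lock()
	defer c.mu.Unlock()
	if c.gens[userID] == gen {
		c.entries[userID] = subs
	}
	return subs
}

func main() {
	c := newSubCache()
	// Simulate a subscribe that lands while a fetch is still in flight.
	stale := c.Get("alice", func() []string {
		c.InvalidateUser("alice")
		return nil // pre-mutation view: no subscriptions
	})
	// The stale result was not cached, so this call fetches again.
	fresh := c.Get("alice", func() []string {
		return []string{"https://push.example/endpoint"}
	})
	fmt.Println(len(stale), len(fresh))
}
```

The caller that raced with the invalidation still gets the value it fetched, but that value never reaches the shared cache, so the next dispatch reads fresh data.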

## Why this is safe

- The push change only reuses data already produced during the same chat run.
  It does not change notification semantics; if there is no assistant text to
  summarize, the existing fallback body still applies.
- The webpush change keeps the existing TTL and `410 Gone` cleanup behavior.
  The generation guard only prevents stale in-flight fetches from poisoning
  the shared cache after invalidation.
- The final PR does **not** change stream setup, pubsub/relay behavior, or
  chat status snapshot timing.

## Deliberately not included

- No stream-path optimization in `Subscribe`.
- No inline pubsub message payloads.
- No distributed cross-replica webpush cache invalidation.
Ethan
2026-03-17 13:50:47 +11:00
committed by GitHub
parent 7cca2b6176
commit 04fca84872
6 changed files with 465 additions and 49 deletions
+18 -7
@@ -35,31 +35,42 @@ func TestWebpushSubscribeUnsubscribe(t *testing.T) {
 	memberClient, _ := coderdtest.CreateAnotherUser(t, client, owner.OrganizationID)
 	_, anotherMember := coderdtest.CreateAnotherUser(t, client, owner.OrganizationID)
-	handlerCalled := make(chan bool, 1)
+	var handlerCalls atomic.Int32
 	server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
 		w.WriteHeader(http.StatusCreated)
-		handlerCalled <- true
+		handlerCalls.Add(1)
 	}))
 	defer server.Close()
-	err := memberClient.PostWebpushSubscription(ctx, "me", codersdk.WebpushSubscription{
+	// Seed the dispatcher cache with an empty subscription set. Creating the
+	// subscription should invalidate that entry so the next dispatch sees the new
+	// subscription immediately.
+	err := memberClient.PostTestWebpushMessage(ctx)
+	require.NoError(t, err, "test webpush message without a subscription")
+	require.Zero(t, handlerCalls.Load(), "a user without subscriptions should not receive a push")
+	err = memberClient.PostWebpushSubscription(ctx, "me", codersdk.WebpushSubscription{
 		Endpoint:  server.URL,
 		AuthKey:   validEndpointAuthKey,
 		P256DHKey: validEndpointP256dhKey,
 	})
 	require.NoError(t, err, "create webpush subscription")
-	require.True(t, <-handlerCalled, "handler should have been called")
+	require.Equal(t, int32(1), handlerCalls.Load(), "subscription validation should hit the endpoint once")
 	err = memberClient.PostTestWebpushMessage(ctx)
-	require.NoError(t, err, "test webpush message")
-	require.True(t, <-handlerCalled, "handler should have been called again")
+	require.NoError(t, err, "test webpush message after subscribing")
+	require.Equal(t, int32(2), handlerCalls.Load(), "the dispatcher should invalidate empty cache entries after subscribing")
 	err = memberClient.DeleteWebpushSubscription(ctx, "me", codersdk.DeleteWebpushSubscription{
 		Endpoint: server.URL,
 	})
 	require.NoError(t, err, "delete webpush subscription")
-	// Deleting the subscription for a non-existent endpoint should return a 404
+	err = memberClient.PostTestWebpushMessage(ctx)
+	require.NoError(t, err, "test webpush message after unsubscribing")
+	require.Equal(t, int32(2), handlerCalls.Load(), "the dispatcher should invalidate cached subscriptions after unsubscribing")
+	// Deleting the subscription for a non-existent endpoint should return a 404.
 	err = memberClient.DeleteWebpushSubscription(ctx, "me", codersdk.DeleteWebpushSubscription{
 		Endpoint: server.URL,
 	})