perf: make ListAIBridgeSessions 10x faster (#23774)

_Disclaimer: produced using Claude Opus 4.6, reviewed by me, and
validated against Dogfood dataset._

The `ListAIBridgeSessions` query materialized and aggregated all
matching interceptions before paginating, then ran expensive
token/prompt lookups across the full dataset. For a page of 25 sessions
against ~200k interceptions (our dogfood dataset), this meant:
- Three CTEs scanning all rows (filtered_interceptions, session_tokens,
session_root)
  - ARRAY_AGG(fi.id) collecting every interception ID per session
- Lateral prompt lookup via ANY(array_of_all_ids) running for every
session, not just the page
  - ~90MB of disk sorts and JIT compilation kicking in

The improvement is to restructure to paginate first and enrich after: a
single CTE groups interceptions into sessions with only cheap aggregates
(MIN, MAX, COUNT), applies cursor pagination and LIMIT, then lateral
joins fetch metadata, tokens, and prompts for just the ~25-row page.

  Measured against 220k interceptions / 160k sessions:

  | Metric             | Before | After |
  |--------------------|--------|-------|
  | Execution time     | 1800ms | 185ms |
  | Shared buffer hits | 737k   | 2.6k  |
  | Disk sort spill    | 86MB   | 16MB  |
  | Lateral loops      | 160k   | 25    |

https://grafana.dev.coder.com/goto/fbODPGtvR?orgId=1 the results are
identical, just _much_ faster.

--- 

Also includes some additional tests which I added prior to refactoring
the query to ensure no regressions on edge-cases.

---------

Signed-off-by: Danny Kopping <danny@coder.com>
This commit is contained in:
Danny Kopping
2026-03-31 14:42:23 +02:00
committed by GitHub
parent acd2ff63a7
commit 9fa103929a
5 changed files with 463 additions and 165 deletions
+4
View File
@@ -788,6 +788,10 @@ type sqlcQuerier interface {
// Returns paginated sessions with aggregated metadata, token counts, and
// the most recent user prompt. A "session" is a logical grouping of
// interceptions that share the same session_id (set by the client).
//
// Pagination-first strategy: identify the page of sessions cheaply via a
// single GROUP BY scan, then do expensive lateral joins (tokens, prompts,
// first-interception metadata) only for the ~page-size result set.
ListAIBridgeSessions(ctx context.Context, arg ListAIBridgeSessionsParams) ([]ListAIBridgeSessionsRow, error)
ListAIBridgeTokenUsagesByInterceptionIDs(ctx context.Context, interceptionIds []uuid.UUID) ([]AIBridgeTokenUsage, error)
ListAIBridgeToolUsagesByInterceptionIDs(ctx context.Context, interceptionIds []uuid.UUID) ([]AIBridgeToolUsage, error)