mirror of
https://github.com/coder/coder.git
synced 2026-06-02 20:48:20 +00:00
perf: make ListAIBridgeSessions 10x faster (#23774)
_Disclaimer: produced using Claude Opus 4.6, reviewed by me, and validated against Dogfood dataset._

The `ListAIBridgeSessions` query materialized and aggregated all matching interceptions before paginating, then ran expensive token/prompt lookups across the full dataset.

For a page of 25 sessions against ~200k interceptions (our dogfood dataset), this meant:

- Three CTEs scanning all rows (filtered_interceptions, session_tokens, session_root)
- ARRAY_AGG(fi.id) collecting every interception ID per session
- Lateral prompt lookup via ANY(array_of_all_ids) running for every session, not just the page
- ~90MB of disk sorts and JIT compilation kicking in

The improvement is to restructure to paginate first and enrich after: a single CTE groups interceptions into sessions with only cheap aggregates (MIN, MAX, COUNT), applies cursor pagination and LIMIT, then lateral joins fetch metadata, tokens, and prompts for just the ~25-row page.

Measured against 220k interceptions / 160k sessions:

| Metric             | Before | After |
|--------------------|--------|-------|
| Execution time     | 1800ms | 185ms |
| Shared buffer hits | 737k   | 2.6k  |
| Disk sort spill    | 86MB   | 16MB  |
| Lateral loops      | 160k   | 25    |

https://grafana.dev.coder.com/goto/fbODPGtvR?orgId=1

the results are identical, just _much_ faster.

---

Also includes some additional tests which I added prior to refactoring the query to ensure no regressions on edge-cases.

---------

Signed-off-by: Danny Kopping <danny@coder.com>
@@ -788,6 +788,10 @@ type sqlcQuerier interface {
// Returns paginated sessions with aggregated metadata, token counts, and
// the most recent user prompt. A "session" is a logical grouping of
// interceptions that share the same session_id (set by the client).
//
// Pagination-first strategy: identify the page of sessions cheaply via a
// single GROUP BY scan, then do expensive lateral joins (tokens, prompts,
// first-interception metadata) only for the ~page-size result set.
ListAIBridgeSessions(ctx context.Context, arg ListAIBridgeSessionsParams) ([]ListAIBridgeSessionsRow, error)
ListAIBridgeTokenUsagesByInterceptionIDs(ctx context.Context, interceptionIds []uuid.UUID) ([]AIBridgeTokenUsage, error)
ListAIBridgeToolUsagesByInterceptionIDs(ctx context.Context, interceptionIds []uuid.UUID) ([]AIBridgeToolUsage, error)
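The pagination-first strategy described in the doc comment can be illustrated with a small in-memory sketch. This is not the SQL from this commit: the types (`Interception`, `SessionSummary`, `SessionRow`), the helper names, the sort order (newest first-interception time first, session ID as tie-break), and the "resume after this session ID" cursor are all hypothetical, chosen only to show the shape of "group cheaply, cut the page, then enrich the page".

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// Interception is a hypothetical, simplified stand-in for one stored row.
type Interception struct {
	ID        int
	SessionID string
	StartedAt time.Time
	Tokens    int
	Prompt    string // empty when the interception carried no user prompt
}

// SessionSummary holds only the cheap aggregates (MIN, MAX, COUNT).
type SessionSummary struct {
	SessionID string
	FirstAt   time.Time
	LastAt    time.Time
	Count     int
}

// SessionRow is a summary plus the expensive, per-session enrichment.
type SessionRow struct {
	SessionSummary
	TotalTokens int
	LastPrompt  string
}

// summarize groups rows into sessions in a single pass (the "GROUP BY scan")
// and orders them newest-first. The ordering is an assumption of this sketch.
func summarize(rows []Interception) []SessionSummary {
	bySession := map[string]*SessionSummary{}
	for _, r := range rows {
		s, ok := bySession[r.SessionID]
		if !ok {
			bySession[r.SessionID] = &SessionSummary{
				SessionID: r.SessionID, FirstAt: r.StartedAt, LastAt: r.StartedAt, Count: 1,
			}
			continue
		}
		if r.StartedAt.Before(s.FirstAt) {
			s.FirstAt = r.StartedAt
		}
		if r.StartedAt.After(s.LastAt) {
			s.LastAt = r.StartedAt
		}
		s.Count++
	}
	out := make([]SessionSummary, 0, len(bySession))
	for _, s := range bySession {
		out = append(out, *s)
	}
	sort.Slice(out, func(i, j int) bool {
		if !out[i].FirstAt.Equal(out[j].FirstAt) {
			return out[i].FirstAt.After(out[j].FirstAt)
		}
		return out[i].SessionID < out[j].SessionID
	})
	return out
}

// listSessions cuts the page first, then enriches only the sessions on it.
// It also returns how many sessions were enriched, to make the cost visible.
func listSessions(rows []Interception, after string, limit int) ([]SessionRow, int) {
	summaries := summarize(rows)
	start := 0
	if after != "" {
		for i, s := range summaries {
			if s.SessionID == after {
				start = i + 1
				break
			}
		}
	}
	end := start + limit
	if end > len(summaries) {
		end = len(summaries)
	}
	enriched := 0
	out := make([]SessionRow, 0, end-start)
	for _, s := range summaries[start:end] {
		row := SessionRow{SessionSummary: s}
		var latest time.Time
		for _, r := range rows { // stands in for an indexed per-session lookup
			if r.SessionID != s.SessionID {
				continue
			}
			row.TotalTokens += r.Tokens
			if r.Prompt != "" && !r.StartedAt.Before(latest) {
				latest, row.LastPrompt = r.StartedAt, r.Prompt
			}
		}
		enriched++
		out = append(out, row)
	}
	return out, enriched
}

// sampleData returns a tiny made-up dataset with three sessions.
func sampleData() []Interception {
	base := time.Date(2026, 1, 1, 0, 0, 0, 0, time.UTC)
	return []Interception{
		{1, "a", base, 10, "hello"},
		{2, "a", base.Add(1 * time.Minute), 5, "follow up"},
		{3, "b", base.Add(2 * time.Minute), 7, "second"},
		{4, "c", base.Add(3 * time.Minute), 3, "third"},
		{5, "b", base.Add(4 * time.Minute), 2, ""},
	}
}

func main() {
	page, enriched := listSessions(sampleData(), "", 2)
	fmt.Println("sessions enriched:", enriched)
	for _, s := range page {
		fmt.Println(s.SessionID, s.Count, s.TotalTokens, s.LastPrompt)
	}
}
```

The point of the sketch is that the enrichment loop runs once per session on the page rather than once per session in the dataset, which mirrors the drop in lateral loops from 160k to 25 reported in the commit message.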