coder

mirror of https://github.com/coder/coder.git synced 2026-06-02 20:48:20 +00:00

Author	SHA1	Message	Date
Danny Kopping	282ab7de34	refactor: load AI providers from the database at startup (#25672 ) Replace the env-based `BuildProviders` with a DB-backed loader. The database is now the single source of truth for runtime provider configuration; env config arrives via `SeedAIProvidersFromEnv` (run at boot) and `BuildProviders` reads it back as `aibridge.Provider` instances. `cli/server.go` and `enterprise/cli/server.go` both call the same path, so aibridged and aibridgeproxyd see the same provider set. Per-provider `DumpDir` is replaced by a top-level `CODER_AI_GATEWAY_DUMP_DIR` base; each provider's effective dump path is `<base>/<provider name>`.	2026-05-26 15:57:01 +02:00
Danny Kopping	4ddda3a9db	feat: filter interceptions and sessions by provider name (#25640 ) Allows filtering sessions & interceptions by provider name, and adds a test to validate that provider name is immutable (at least until #25606 lands).	2026-05-25 16:31:48 +02:00
Danny Kopping	0d9718e217	feat: add 'copilot' to ai_provider_type (#25616 )	2026-05-22 16:10:37 +02:00
Michael Suchacz	bdf2698fcd	fix: parse skill frontmatter as YAML (#25610 )	2026-05-22 15:09:30 +02:00
Mathias Fredriksson	0ba702c43f	fix: normalize command paths to base names in shellparse (#25599 ) Normalize program names in shellparse.Parse to their basename. Does not rely on filepath.Base because the server may run on either Linux or Windows where the behavior would differ. Closes CODAGT-470	2026-05-22 13:36:53 +03:00
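The OS-independent normalization described in 0ba702c43f can be sketched as below. This is a minimal illustration, not the actual `shellparse` code: `baseName` is a hypothetical helper, and treating both `/` and `\` as separators on every host is an assumption about the intended behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// baseName is a hypothetical helper (not the real shellparse code).
// It returns the last path element of a program name and treats both
// '/' and '\' as separators, so the result does not depend on the OS
// the server runs on (unlike filepath.Base).
func baseName(program string) string {
	// Drop trailing separators so "/usr/bin/git/" still yields "git".
	trimmed := strings.TrimRight(program, `/\`)
	if trimmed == "" {
		// Empty input or separators only: return it unchanged.
		return program
	}
	if i := strings.LastIndexAny(trimmed, `/\`); i >= 0 {
		return trimmed[i+1:]
	}
	return trimmed
}

func main() {
	for _, p := range []string{"/usr/bin/git", `C:\Tools\git.exe`, "git"} {
		fmt.Printf("%q -> %q\n", p, baseName(p))
	}
}
```

Scanning for either separator by hand avoids `filepath.Base`, whose separator set changes with the build target.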
Ethan	c650aabbef	chore: standardize on _internal_test.go for white-box tests (#25601 ) My agent added `//nolint:testpackage` to a test file on one of my PRs. Again. This PR cleans it up across the entire repo and updates the in-repo conventions so future agents stop doing it. The repo already has a precedent for white-box tests that need to touch unexported symbols: `_internal_test.go` (145+ existing files). The `testpackage` linter's default `skip-regexp` exempts that filename suffix, so the `//nolint:testpackage` directive is unnecessary in every case where someone reached for it. This PR renames 51 such files to `_internal_test.go` via `git mv` so blame and history follow, and strips the dead directive from 2 files that were already correctly named (`coderd/oauth2provider/authorize_internal_test.go`, `coderd/x/chatd/advisor_internal_test.go`). `.claude/docs/TESTING.md` now documents the rule explicitly under Test Package Naming, which is imported into the root `AGENTS.md` via `@.claude/docs/TESTING.md`. The rule: prefer `package foo_test`; if you need internal access, rename the file to `_internal_test.go` rather than adding a nolint directive.	2026-05-22 20:24:38 +10:00
Danny Kopping	c50b0e84b9	feat!: default `CODER_AI_GATEWAY_ENABLED` to true (#25575 ) `CODER_AI_GATEWAY_ENABLED` / `CODER_AIBRIDGE_ENABLED` now defaults to `true`, since it will be used by Coder Agents. If you previously disabled this value explicitly, that setting persists.	2026-05-22 08:57:36 +02:00
Danny Kopping	9341efec9f	feat!: seed ai_providers from env on server startup (#24895 ) _Disclaimer: implemented by a Coder Agent using Claude Opus 4.7_ Part of the implementation of [RFC: Common AI Provider Configs](https://www.notion.so/coderhq/RFC-Common-AI-Provider-Configs-34bd579be59280ed958feffb82024797) (AIGOV-201). ## Note This change can cause a previously working installation to fail to start should a conflict exist between the providers configured in the environment & those now migrated to the database. I'll raise a PR upstack to document this process and workarounds should a startup fail. ## What this PR does Reconciles environment-derived AI provider configuration with the `ai_providers` table at server startup. The seed runs before the aibridged daemon is initialized, so the runtime always reads providers from the database; the legacy `CODER_AIBRIDGE_` environment variables become a one-shot migration source. ### Behavior - Concurrent server starts are serialized through a Postgres advisory lock (`LockIDAIProvidersEnvSeed`). - Missing rows are inserted with an audit entry attributed to the system actor. - Existing rows whose canonical hash matches the env-derived hash are left alone (the common no-op restart path). - Existing rows whose canonical hash does not match cause server startup to fail with a descriptive error so the operator can explicitly resolve the conflict in either env or DB. - Soft-deleted rows are NOT resurrected from env; an explicit operator deletion is sticky across restarts. - Indexed providers whose name conflicts with a legacy env var fail startup with a clear remediation message. - Unknown provider types (e.g. `copilot`, until the DB enum is widened) are skipped with a log entry rather than failing startup. 
### Canonical hashing The `canonicalAIProvider` shape captures exactly the fields that determine runtime behavior — `type`, `base_url`, and the Bedrock subset of settings (access key, access key secret, region, model, small fast model) — and is hashed with SHA-256. The hash is computed on demand from the row + env, never persisted, so the database does not need a new column for it. API keys live in the separate `ai_provider_keys` table and are intentionally excluded from the hash so operators can rotate keys via the API without forcing a server restart. <details> <summary>Decision log</summary> - The hash is intentionally not persisted in the database. The RFC discussed this trade-off; computing on demand keeps the schema minimal and lets the canonical shape evolve without a migration. - The lock uses an `iota` slot in `coderd/database/lock.go` rather than `GenLockID` so it's stable, easy to audit, and matches the convention used for every other startup lock. - A bearer-token Anthropic provider whose env vars also set Bedrock metadata but no AWS credentials does NOT store the Bedrock fields. Without credentials the discriminated settings would misrepresent the row as Bedrock auth. - We deliberately do NOT publish to the `ai_providers_changed` pubsub channel from the seed because the seed completes before any subscriber is started; the follow-up PR introduces that channel. </details>	2026-05-22 08:37:27 +02:00
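The on-demand canonical hash described in 9341efec9f could look roughly like the sketch below. It is an illustration under stated assumptions: `canonicalProvider` is a hypothetical, reduced stand-in for `canonicalAIProvider` (only some of the listed fields are shown), and JSON is assumed as the serialization; the source only states that the shape is hashed with SHA-256 and that API keys are excluded.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// canonicalProvider is a hypothetical, reduced stand-in for the
// canonicalAIProvider shape. It deliberately has no API key field,
// mirroring the rule that keys are excluded from the hash.
type canonicalProvider struct {
	Type          string `json:"type"`
	BaseURL       string `json:"base_url"`
	BedrockRegion string `json:"bedrock_region,omitempty"`
	BedrockModel  string `json:"bedrock_model,omitempty"`
}

// canonicalHash serializes the struct (fields marshal in declaration
// order, so the encoding is deterministic) and returns the hex-encoded
// SHA-256 digest. JSON is an assumed encoding for this sketch.
func canonicalHash(p canonicalProvider) (string, error) {
	b, err := json.Marshal(p)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:]), nil
}

func main() {
	row := canonicalProvider{Type: "anthropic", BaseURL: "https://api.anthropic.com"}
	env := row

	rowHash, _ := canonicalHash(row)
	envHash, _ := canonicalHash(env)
	fmt.Println("hashes match (no-op restart):", rowHash == envHash)

	// A differing base URL changes the hash, which is the conflict case.
	env.BaseURL = "https://proxy.example.com"
	envHash, _ = canonicalHash(env)
	fmt.Println("hashes differ (startup conflict):", rowHash != envHash)
}
```

Because the digest is recomputed from the row and the environment each time, nothing needs to be stored, which matches the decision not to add a hash column.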
Michael Suchacz	06526a5822	feat: use AI provider chat APIs (#25415 )	2026-05-22 07:53:23 +02:00
Kayla はな	10efde3e6c	fix(codersdk): fix stale comment reference (#25552 )	2026-05-21 21:11:11 -06:00
Michael Suchacz	40878eeba4	feat: add AI provider schema expansion (#25412 )	2026-05-22 02:16:01 +02:00
Michael Suchacz	356bccddc2	feat: add personal skills settings UI and docs (#25066 ) > Mux updated this PR on behalf of Mike. ## Summary - Add experimental personal skills API helpers and an Agents settings UI for listing, creating, editing, deleting, and importing SKILL.md content. - Add docs, Storybook coverage, and unit tests for backend-compatible SKILL.md parsing. - Address review feedback by simplifying frontmatter scalar parsing, clarifying the UI parser scope, defaulting personal skill queries to `me`, and patching React Query caches after create, update, and delete. - Merge latest `main` and resolve the Agents sidebar refactor conflicts. ## Validation - pre-commit hook - `go test ./codersdk/workspacesdk -run TestParseSkillFrontmatter -count=1` - `go test ./coderd/x/chatd/chattool -run 'Test' -count=1` - `cd site && pnpm test -- src/pages/AgentsPage/utils/personalSkills.test.ts src/api/queries/userSkills.test.ts src/utils/fileSize.test.ts --runInBand` - `cd site && pnpm lint:types` - `cd site && pnpm lint:check`	2026-05-22 00:20:10 +02:00
Spike Curtis	9998c7499c	test: fix TestTunneler_Integration line endings on Windows (#25584 ) fixes https://github.com/coder/internal/issues/1542 Drop line endings before test assertion to make it more cross-platform.	2026-05-21 12:26:54 -04:00
Zach	ddc0e99c69	chore: remove coder_secret Terraform integration (#25512 ) Removes the coder_secret Terraform integration: the data.coder_secret consumption path through provisionerdserver → provisioner.proto → provisioner/terraform, the dynamic-parameter secret-requirement validation, and the workspace-update / resolve-autostart surfaces that depended on it. This is being done due to a product/feature direction change (see PLAT-243). User-secret CRUD (DB, REST, CLI, UI, telemetry, audit) and the agent-manifest secret-injection path are untouched. The provisionerd API is bumped from v1.17 to v1.18 rather than rolled back: v1.17 shipped in v2.33.x, so user_secrets field numbers are reserved and the changelog documents both versions. Generated with assistance from Coder Agents.	2026-05-21 09:19:29 -06:00
Paweł Banaszewski	46e93e6325	chore: add ai_gateway options that alias aibridge options (#25061 ) Adds options matching the new AI Gateway naming. The new options are aliases for the old options, which still work but now carry a deprecation message. No conflict detection was added. The documentation now mentions only the new options, with a note that the old options still work. > Various AI tools were used to create this PR	2026-05-21 11:14:11 +02:00
Mathias Fredriksson	f1b772928d	feat: parse execute tool commands and render them in the chat UI (#25478 ) When the execute tool runs a chained shell command, the UI previously rendered the raw string. Long chains like "cd /repo && git pull && git add . && git commit -m fix" were hard to scan. A new ChatMessagePart.ParsedCommands [][]string field on tool-call parts carries one entry per simple command, parsed in chatd from args via mvdan.cc/sh/v3/syntax. The frontend renders the joined list ("cd, git pull, git add, git commit") in place of the raw command, and falls back to the raw command when the field is absent. Closes CODAGT-446	2026-05-21 08:12:34 +00:00
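The display rule from f1b772928d (show the joined list, fall back to the raw command when the field is absent) can be sketched as follows. The real rendering lives in the frontend; this Go version is only an illustration, `renderCommand` is a hypothetical name, and the exact contents of each `ParsedCommands` entry are an assumption inferred from the example in the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// renderCommand is a hypothetical helper illustrating the display rule:
// join each parsed simple command with spaces, join the commands with
// ", ", and fall back to the raw string when nothing was parsed.
func renderCommand(raw string, parsed [][]string) string {
	if len(parsed) == 0 {
		return raw
	}
	parts := make([]string, 0, len(parsed))
	for _, cmd := range parsed {
		parts = append(parts, strings.Join(cmd, " "))
	}
	return strings.Join(parts, ", ")
}

func main() {
	raw := "cd /repo && git pull && git add . && git commit -m fix"
	// Assumed entry shape: program name plus subcommand, one per simple command.
	parsed := [][]string{{"cd"}, {"git", "pull"}, {"git", "add"}, {"git", "commit"}}

	fmt.Println(renderCommand(raw, parsed))
	fmt.Println(renderCommand(raw, nil)) // field absent: raw command is shown
}
```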
Steven Masley	9b6eadab77	fix: drop N+1 db query on template ACL available (#25465 ) Fixes [PLAT-149](https://linear.app/codercom/issue/PLAT-149/template-permissions-search-is-extremely-slow-with-many-groups). `/acl/available` ran a db query per group. A deployment with >5,000 groups made this route extremely slow.	2026-05-20 22:40:50 +00:00
Spike Curtis	8dc4d76890	chore: add agent-connection-watch for workspaces (#24507 ) Relates to GRU-18. Adds a basic implementation of Workspace Agent Connection Watch, with tests. Handling of logs is still missing.	2026-05-20 13:09:11 -04:00
Danielle Maywood	96e3c49670	feat: add chat sharing API (#24968 )	2026-05-20 10:46:35 +01:00
Danny Kopping	dd3223451b	feat: add AI providers HTTP CRUD handlers (#24894 )	2026-05-20 10:21:36 +02:00
Mathias Fredriksson	1ddc89caa2	test(codersdk/toolsdk): scope err in SendTaskInput and GetTaskLogs subtests (#25434 ) Fixes coder/internal#1475 Fixes CODAGT-364	2026-05-20 11:02:46 +03:00
Michael Suchacz	5a8d0016a5	feat: add personal skill storage, API, and SDK (#25363 ) > Mux updated this PR on behalf of Mike. ## Stack Context This PR is the storage, permissions, API, and SDK layer for experimental personal skills. #25362 has landed on `main`, so this branch is restacked directly on `main`. Stack order: 1. #25363 storage, permissions, API, and SDK 2. #25365 API test coverage 3. #25366 chattool and chatd integration 4. #25066 settings UI and docs 5. #25386 personal skills slash menu ## What? Adds the `user_skills` database table, generated queries, RBAC resources and scopes, audit resource handling, experimental user-scoped CRUD endpoints, SDK types, and generated API/site types. Follow-up review and restack fixes: - Enforce a bounded personal skill description in parser and database constraints. - Return `403 Forbidden` for unauthorized create and update attempts. - Return explicit conflict responses when soft-deleted users are targeted. - Keep user admins out of personal skills, while site owners can read and delete but not create or update. - Document trigger-raised constraint names and keep schema constants covered by tests. - Reuse `UserSkillMetadata` in the full `UserSkill` SDK response type. - Generate user skill IDs in Go instead of relying on a database default. - Rebase on latest `main` and renumber the user skills migration to `000502_user_skills`. ## Why? Personal skills need durable user-owned storage with owner authorization, limited site-owner moderation, and a hidden API surface before chatd can consume them. ## Validation - `make gen` - `go test ./coderd/database -run '^TestUserSkillSchemaConstants$' -count=1` - `go test ./coderd/database/dbauthz -run '^TestMethodTestSuite/TestUserSkills$' -count=1` - `go test ./coderd -run '^TestPatchUserSkill$' -count=1` - `go test ./codersdk ./coderd/database/db2sdk` - `make lint` - pre-commit hook on `97fd58108d`	2026-05-20 00:09:09 +02:00
dylanhuff-at-coder	441854daa8	feat: add user secrets client utilities (#25370 ) Add frontend API methods, mocks, and form helpers for user secrets CRUD. The new client methods cover list, get, create, update, and delete requests, including URL encoding for secret names used in route paths. Add user secret form utilities for create and update payload construction, required create field checks, and structured API validation error mapping back to form fields. User secret name validation now lives in codersdk with tests, and coderd returns field-level validation errors for create, update, and uniqueness conflicts so the frontend can show backend-owned validation results consistently.	2026-05-19 09:30:31 -07:00
Cian Johnston	ce7f41f56d	fix: bump MaxChatFileIDs from 20 to 50 (#25492 ) Fixes CODAGT-456	2026-05-19 16:53:30 +01:00
Steven Masley	51b531f5b3	chore: 'go generate' mockgen to use `go tool` wrapper (#25490 ) Calling `mockgen` directly relies on whichever executable is found in `$PATH`. Using `go tool` runs the version defined in `go.mod`.	2026-05-19 14:53:13 +00:00
Steven Masley	1afc6d4fd0	feat: structured disconnect attribution for agent logs (#25191 ) Implements [PLAT-60](https://linear.app/codercom/issue/PLAT-60/enhance-disconnect-logs-with-structured-reason-attribution): adds structured disconnect attribution to disconnect logs throughout the agent and tailnet packages. Every disconnect log site now carries structured slog fields. All existing logs remain; existing messages are preserved with the fields added alongside. New fields on disconnect log lines: - `connect_type` — which layer disconnected: `server_to_agent`, `agent_to_client`, or `client_to_server` - `disconnect_reason` — categorical reason: `graceful`, `network_error`, `server_shutdown`, etc. - `disconnect_expected` — whether the disconnect is normal operation (`true`) or should be investigated (`false`) - `disconnect_initiator` — who started it: `client`, `agent`, `server`, or `network` (control-plane sites only) - `disconnect_detail` — free-form supplemental info (where useful) ## What's covered Control plane (`server_to_agent`): coordination RPC, DERP map subscriber, agent runLoop, agent Close, `BasicCoordination.Close`, `Controller.run`. Data plane (`agent_to_client`): SSH sessions, reconnecting PTY, JetBrains port-forwarding. 
<details> <summary>Control-plane sites</summary>

| Site | Reason | Initiator |
|---|---|---|
| `agent/agent.go` `runLoop` EOF | `network_error` | `network` |
| `agent/agent.go` `runCoordinator` deferred exit | `server_shutdown` / `graceful` / `network_error` | `agent` / `server` / `network` |
| `agent/agent.go` `runDERPMapSubscriber` deferred exit | same (shared `classifyCoordinatorRPCExit`) | same |
| `agent/agent.go` `Close` shutdown timeout | `server_shutdown` + detail | `agent` |
| `agent/agent.go` `Close` clean coord disconnect | `server_shutdown` | `agent` |
| `tailnet/controllers.go` `BasicCoordination.Close` | `graceful` or `network_error` | `c.initiator` |
| `tailnet/controllers.go` `Controller.run` `net.ErrClosed` | `network_error` | `network` |

</details>

<details> <summary>Data-plane sites</summary>

| Site | Reason | Notes |
|---|---|---|
| `agent/agentssh/agentssh.go` SSH session closed | free-form (`graceful`, `process exited with error status: N`, etc.) | Also sets `closeCause("normal exit")` for clean exits so coderd's `connection_log.DisconnectReason` is no longer empty |
| `agent/reconnectingpty/server.go` PTY closed | `server_shutdown`, error string, or `graceful` | |
| `agent/agentssh/jetbrainstrack.go` channel closed | `normal close` or error string | Previously passed empty reason |

</details>

<details> <summary>Bug fix</summary> The deferred `disconnected from coordination RPC` log no longer fires when the initial `Coordinate()` RPC call fails before any connection is established. </details> Refs PLAT-60. --- _This PR was prepared by Coder Agents on behalf of @Emyrk._ Manually QA'd a lot of common disconnects --------- Co-authored-by: Coder Agents <noreply@coder.com>	2026-05-19 09:47:03 -05:00
Danielle Maywood	170a6e1fe9	feat: add chat sharing foundation (#25041 )	2026-05-18 22:32:05 +01:00
Yevhenii Shcherbina	2732378da2	feat: audit group AI budget mutations (#25374 ) Relates to https://linear.app/codercom/issue/AIGOV-284/add-group-budgets-table-and-crud-api Adds audit-log support for `group_ai_budget` mutations. Without it, an admin could silently lower a spend limit from `$500` to `$50` or delete a budget entirely, with no record of who performed the action. Both write (`create-or-update`) and delete actions now produce audit log entries, including before/after diffs for `spend_limit_micros`. Depends on #25203. ## Old Version <img width="1340" height="456" alt="image" src="https://github.com/user-attachments/assets/e9ff52fb-a905-4aef-a4ee-7cdc58e68b75" /> ## New Version (see https://github.com/coder/coder/pull/25374/changes/9d22833de87cc106c24142c1d471a3f71872bf67) <img width="1347" height="496" alt="image" src="https://github.com/user-attachments/assets/1b9bbfa1-f86d-48e3-a0b1-266eb76f851f" />	2026-05-18 15:17:20 -04:00
Kyle Carberry	385146000b	feat: record created_at/completed_at on reasoning ChatMessageParts (#24789 ) Records reasoning start and end times on persisted reasoning `ChatMessagePart`s so reasoning duration can be computed for stored chats. Backend-only: no SSE changes and no frontend rendering ship in this PR. The `created_at` field on `ChatMessagePart` is extended to also be present on `reasoning` parts (it previously appeared only on `tool-call` and `tool-result`), and a new `completed_at` field is added for `reasoning` parts. ### How timestamps are recorded - `StreamPartTypeReasoningStart`: stamp `startedAt = dbtime.Now()` on the active reasoning state. - `StreamPartTypeReasoningEnd`: stamp `completedAt = dbtime.Now()` and append both into parallel `[]time.Time` slices on `stepResult`. - Persistence reads the slices in occurrence order (reasoning has no provider-side ID) and applies them to the matching `ChatMessagePart` via `buildAssistantPartsForPersist`. The first reasoning block's stamps go onto the first reasoning part, and so on. - `flushActiveState` flushes partial reasoning interrupted before `StreamPartTypeReasoningEnd` with `startedAt` from the active state and `completedAt = dbtime.Now()` at the interruption. ### Why two fields, not one? Tool calls and results are point events. The frontend computes their duration by subtracting the call's `created_at` from the result's `created_at`. Reasoning is one assistant part that brackets a span, so we record both endpoints on the part itself. ### Why not stamp in `PartFromContent`? Same rationale as #24101: `PartFromContent` is called during both SSE publishing and persistence. Stamping there would yield incorrect persistence-time timestamps for reasoning blocks that finished much earlier in the step. Instead we capture in the chatloop and apply during persistence. 
<details><summary>Implementation plan</summary> - `codersdk/chats.go`: extend `CreatedAt`'s `variants` to include `reasoning?`; add `CompletedAt *time.Time` with `variants:"reasoning?"`. - `coderd/x/chatd/chatloop/chatloop.go`: extend `reasoningState` with `startedAt`; extend `stepResult` and `PersistedStep` with parallel `[]time.Time` reasoning slices; stamp on `ReasoningStart`/`ReasoningEnd`; thread the slices through all `PersistStep` call sites including the interrupt-safe path; record partial reasoning in `flushActiveState`. - `coderd/x/chatd/attachments.go`: walk reasoning parts in occurrence order and apply `step.ReasoningStartedAt[i]` to `part.CreatedAt` and `step.ReasoningCompletedAt[i]` to `part.CompletedAt`. ### Tests - `codersdk/chats_test.go` round-trips `created_at` + `completed_at` on reasoning parts and verifies omission when absent and partial interrupted parts. - `coderd/x/chatd/chatprompt/chatprompt_test.go` asserts `PartFromContent(ReasoningContent{})` does NOT stamp timestamps. - `coderd/x/chatd/chatloop/chatloop_test.go` `TestRun_ReasoningTimestamps` drives a stream with two reasoning blocks and verifies parallel slices, monotonicity, ordering, non-zero values, and content-block ordering. `TestRun_InterruptedReasoningFlushesTimestamps` cancels mid-reasoning and verifies `flushActiveState` records a non-zero pair. - `coderd/x/chatd/attachments_test.go` covers `buildAssistantPartsForPersist` for normal interleaved reasoning, partial (zero `completed_at`), and missing slices. </details> > Generated by Coder Agents. Co-authored-by: Coder Agent <agent@coder.com>	2026-05-18 12:30:30 -04:00
Danny Kopping	c69dd9c5dc	feat: widen `ai_provider_type` enum for chatd providers (#25394 )	2026-05-18 15:06:30 +02:00
Garrett Delfosse	78d4cf9e47	fix: soft-delete stale workspace agents on new build (#25207 )	2026-05-18 08:33:29 -04:00
Danny Kopping	0770428a5c	feat: add AIProvider types and client methods (#24893 )	2026-05-18 11:10:30 +02:00
Michael Suchacz	792f0b4902	feat: add personal skill resolver (#25362 ) > Mux updated this PR on behalf of Mike. ## Stack Context This stack splits experimental personal skills into smaller reviewable PRs. Personal skills are user-owned `SKILL.md` files stored by Coder and injected into chatd alongside workspace skills. Stack order: 1. #25362 personal skill resolver 2. #25363 storage, permissions, API, and SDK 3. #25365 API test coverage 4. #25366 chattool and chatd integration 5. #25066 settings UI and docs 6. #25386 personal skills slash menu ## What? Adds the shared personal skill parser and resolver package, plus reusable skill-name validation exported from `workspacesdk`. The parser enforces the full personal skill contract: max raw size, kebab-case name, max name length, and non-empty body. ## Why? The rest of the stack needs one source-aware resolver for personal and workspace skills, including collision handling and qualified aliases. Keeping personal skill constraints in the parser prevents callers from accidentally parsing invalid personal skills. ## Validation - `go test ./coderd/x/skills ./codersdk/workspacesdk` - pre-commit hooks on this branch	2026-05-16 15:33:43 +00:00
Yevhenii Shcherbina	238968cfa0	feat: add per-group AI budget table and endpoints (#25203 ) Closes https://linear.app/codercom/issue/AIGOV-284/add-group-budgets-table-and-crud-api ## Summary Adds the `group_ai_budgets` table and the following endpoints: - `GET /api/v2/groups/{group}/ai/budget` - `PUT /api/v2/groups/{group}/ai/budget` - `DELETE /api/v2/groups/{group}/ai/budget` Each group may have at most one budget row. If no row exists, no budget is enforced. ### Feature gate Added `RequireFeatureMW(FeatureAIBridge)` on the `/ai/budget` sub-route. ## RBAC Authorization reuses `rbac.ResourceGroup` with the existing `.InOrganization(...).WithID(...)` scoping model. The `dbauthz` wrappers load the parent `groups` row and authorize against it. No new resource type is introduced. As a result, anyone with `group:update` permissions (Owner, OrgAdmin, or UserAdmin within the organization) can manage AI budgets for that group. ## Read access for group members `database.Group.RBACObject()` grants `policy.ActionRead` to all members of the group through the group ACL: ```go func (g Group) RBACObject() rbac.Object { return rbac.ResourceGroup.WithID(g.ID). InOrg(g.OrganizationID). // Group members can read the group. WithGroupACL(map[string][]policy.Action{ g.ID.String(): { policy.ActionRead, }, }) } ``` Because the `GET` endpoint authorizes against the same loaded `Group` object, any group member can call: ```text GET /api/v2/groups/{group}/ai/budget ``` `PUT` and `DELETE` remain admin-only. The group ACL grants only `ActionRead`, so write operations continue to require role-based `group:update` permissions. ## Alternative considered A dedicated `rbac.ResourceGroupAiBudget` resource would allow budget management to be separated from general group administration. We decided not to add that complexity for now.	2026-05-14 15:54:37 -04:00
Danny Kopping	841b777ccd	feat: add ai_providers table, queries, dbauthz, audit, RBAC (#24892 )	2026-05-14 16:10:46 +02:00
Danielle Maywood	25a803221e	feat: add shell tool display mode preference (#25029 )	2026-05-14 14:25:07 +01:00
Michael Suchacz	cb37047dce	feat: dedicated /prompts endpoint for chat history cycle (#25083 ) Follow-up to #25004. The merged change cycles only through messages already loaded in the in-memory chat store (page size 50). Long chats and chats whose oldest turns have rolled out of the page lose access to their earlier prompts in the composer's up/down arrow cycle. This PR adds a dedicated server endpoint that returns the full prompt history, newest first, and rewires the composer to use it. ## What changed ### Endpoint `GET /api/experimental/chats/{chat}/prompts?limit=N` ```go type ChatPrompt struct { ID int64; Text string } type ChatPromptsResponse struct { Prompts []ChatPrompt } ``` - `limit`: `0..2000`. `0` (the default) is treated as the server-side default of 500; out-of-range values return `400`. Negative values are rejected by the SDK's `PositiveInt32` parser before reaching the handler. - Auth: parent-chat read in `dbauthz`, mirroring `GetChatMessagesByChatID`. - The SQL filters `role='user'`, `deleted=false`, `visibility IN ('user','both')`, guards the lateral with `jsonb_typeof(content) = 'array'` so legacy V0 scalar-string rows are silently skipped, then unrolls `content` JSONB with `WITH ORDINALITY` and concatenates only `type='text'` parts in original order via `string_agg(... ORDER BY ordinality)`. Messages whose joined text is whitespace-only are dropped via `HAVING ... ~ '\S'` so cycling never lands on a blank entry. ### Partial index (migration `000494`) ```sql CREATE INDEX idx_chat_messages_user_prompts ON chat_messages (chat_id, id DESC) WHERE deleted = false AND role = 'user' AND visibility IN ('user', 'both'); ``` The partial WHERE matches the query's filter exactly and the key order matches `ORDER BY id DESC`, so the planner gets both the filter and the ordering from the index without a sort step. 
`EXPLAIN ANALYZE` on a synthetic 51-chat × 5,000-message dataset (≈260k rows, 10k user prompts in the target chat, `random_page_cost=1.1`):

| | Plan | Buffers hit | Time |
|---|---|---|---|
| Without index | `Index Scan Backward using chat_messages_pkey`, 250,848 rows removed by filter | 6,683 | 32.4 ms |
| With index | `Index Scan using idx_chat_messages_user_prompts`, no filter | 38 | 1.3 ms |

≈25× faster, 175× fewer buffer hits. ### Frontend - `chatPromptsKey` / `chatPromptsQuery` factories in `site/src/api/queries/chats.ts` (`staleTime: 30s`, `enabled: chatId !== ""`, asks the server for 500 prompts). - `ChatPageContent.tsx` replaces the in-memory derivation with `useQuery(chatPromptsQuery(chatId ?? ""))`. The composer's existing `cycleHistorySnapshotRef` anchors the in-flight cycle so a refetch arriving mid-cycle cannot shift the indexed prompt out from under the user. - `getEditableUserMessagePayload` now concatenates user-message text parts verbatim, mirroring the server's `string_agg(part->>'text', '' ORDER BY ordinality)`, instead of routing through the streaming-oriented `parseMessageContent` / `appendText` pipeline (which drops whitespace-only chunks: correct for assistant streams, wrong for a user's persisted message). This keeps the cycle and the edit path in agreement on the same message. File blocks are still pulled separately via `parseMessageContent(...).blocks.filter(isEditableUserMessageFileBlock)`. - Cache invalidation in `createChatMessage.onSuccess`, `editChatMessage.onSettled`, and `useChatStore.upsertCacheMessages` (only when an upserted message has `role === "user"`). - Page-level stories pre-seed `chatPromptsKey(CHAT_ID)` from the same `messagesData` to keep them offline. 
## Tests - New `TestGetChatUserPrompts` in `coderd/exp_chats_test.go` with five subtests: - `NewestFirstFiltering` — multi-part concatenation, non-text parts skipped, whitespace-only filtered, soft-deleted excluded, `model`-only visibility excluded, assistant-role excluded by `cm.role = 'user'`, legacy V0 scalar row silently excluded by the `jsonb_typeof` guard, ordering newest first. - `LimitClampsResults` — explicit `limit=2` returns the two newest prompts. - `InvalidLimitRejected` — `limit=5000` is `400 Bad Request`. - `NotFoundForOtherUsers` — a separate user in the same org gets `404`, not the prompts. - `EmptyResultIsJSONArray` — zero-message chat and assistant-only chat both return `Prompts: []` (non-nil, empty). - New unit test in `messageParsing.test.ts` asserting that `getEditableUserMessagePayload(["hello", " ", "world"])` returns `"hello world"`, locking in the agreement with the SQL `string_agg`. - `dbauthz_test.go` adds the `MethodTestSuite.TestChats/GetChatUserPromptsByChatID` entry, asserting parent-chat `policy.ActionRead`. - `pnpm test src/pages/AgentsPage` — 1159 passed, 2 skipped. - `make gen` produces no diff. ## Manual verification Seeded a dev chat with Claude Sonnet 4.6 via the aibridge Anthropic provider and posted 20 user prompts end-to-end. Verified that the `/prompts` endpoint returns 20 rows newest-first, that `limit=10` clamps correctly, that `limit=0` uses the server default of 500, and that the up/down keyboard cycle in the composer walks the same sequence (and reverses correctly back to the empty draft). ## Out of scope - Cross-chat history. - Per-user opt-out for the cycle. - File-reference / attachment cycling — the cycle continues to reproduce plain text only, by design. 
<details> <summary>Implementation plan</summary> # CODAGT-319 Follow-up — Dedicated `/prompts` endpoint ## Context The merged feature ([#25004](https://github.com/coder/coder/pull/25004) / [`d32842f`](https://github.com/coder/coder/commit/d32842f)) cycles only through messages already loaded in the in-memory chat store, which is capped at the first 50 messages of the current page. Long chats and chats whose oldest turns have rolled out of the page can no longer recall their full prompt history. This follow-up exposes a dedicated server endpoint that returns the user-authored prompts in a chat, newest first, and rewires the composer to use it. ## Design ### Endpoint `GET /api/experimental/chats/{chat}/prompts?limit=N` Returns: ```go type ChatPrompt struct { ID int64 Text string } type ChatPromptsResponse struct { Prompts []ChatPrompt } ``` - `limit`: `0..2000`. `0` (the default) → server-side default of 500. The wire-level default is encoded in SQL as `COALESCE(NULLIF($limit, 0), 500)`. Negatives are rejected upstream by `PositiveInt32`; the handler only caps the upper bound. - Auth: parent-chat read in `dbauthz`, mirroring `GetChatMessagesByChatID`. - Listed under the experimental router so we can iterate without API guarantees. ### SQL The query lives in `coderd/database/queries/chats.sql` as `GetChatUserPromptsByChatID`: - Filters `role='user'`, `deleted=false`, `visibility IN ('user','both')` to mirror the composer's "what the user actually typed and can re-send" contract. - Guards the lateral with `jsonb_typeof(content) = 'array'` so legacy V0 rows whose content is a scalar JSON string (predates migration `000434`) are silently excluded instead of raising `"cannot extract elements from a scalar"`. - Unrolls `content` JSONB with `jsonb_array_elements WITH ORDINALITY` and concatenates only `type='text'` parts, preserving original order via `string_agg(... ORDER BY ordinality)`. - Casts the result to `text` so sqlc emits a `string` field instead of `[]byte`. 
- Drops whitespace-only prompts via `HAVING string_agg(...) ~ '\S'` so cycling never lands on a blank entry. - Orders by `cm.id DESC` (`id` is a sequence, so this is "newest first" without relying on `created_at`). ### Index New partial index added in migration `000494`: ```sql CREATE INDEX idx_chat_messages_user_prompts ON chat_messages (chat_id, id DESC) WHERE deleted = false AND role = 'user' AND visibility IN ('user', 'both'); ``` The partial WHERE clause matches the query's filter exactly, so the planner can use the index for both filtering and ordering without a sort step. ### Frontend - `chatPromptsKey(chatId)` and `chatPromptsQuery(chatId)` factories in `site/src/api/queries/chats.ts`. `staleTime: 30s`, `enabled: chatId !== ""`. Asks the server for 500 prompts (well below the 2000 max, plenty for the cycle). - `ChatPageContent.tsx` replaces the in-memory derivation with `useQuery(chatPromptsQuery(chatId ?? ""))`. The composer's `cycleHistorySnapshotRef` already takes a stable snapshot at cycle entry, so a refetch arriving mid-cycle cannot shift the indexed prompt out from under the user. - `getEditableUserMessagePayload` extracts the edit-path text from raw user-message parts (filter `type === "text"`, join verbatim) instead of going through `parseMessageContent` / `appendText`, which is built for assistant streams and intentionally drops whitespace-only chunks. Without this, cycling and clicking Edit on the same message could produce different draft text for messages with whitespace-only interleaved text parts. - Cache invalidation: `createChatMessage.onSuccess`, `editChatMessage.onSettled`, and `useChatStore.upsertCacheMessages` (when at least one upserted message has `role === "user"`) all invalidate `chatPromptsKey(chatId)`. 
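As a rough illustration of the `limit` rules and the text-concatenation contract described in the Design section, here is a minimal Go sketch. `effectiveLimit`, `promptText`, and the `part` type are hypothetical names used only for illustration, not identifiers from the PR. The whitespace check uses `unicode.IsSpace`, which may not agree with the Postgres `\S` class for every code point.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"unicode"
)

const (
	defaultPromptLimit = 500  // server-side default when limit is 0
	maxPromptLimit     = 2000 // upper bound enforced by the handler
)

var errLimitOutOfRange = errors.New("limit must be between 0 and 2000")

// effectiveLimit mirrors COALESCE(NULLIF($limit, 0), 500) plus the upper-bound
// check. Negative values are assumed to be rejected before this point.
func effectiveLimit(limit int32) (int32, error) {
	if limit > maxPromptLimit {
		return 0, errLimitOutOfRange
	}
	if limit == 0 {
		return defaultPromptLimit, nil
	}
	return limit, nil
}

// part is a hypothetical stand-in for one element of a message's content array.
type part struct {
	Type string
	Text string
}

// promptText joins text parts verbatim in their original order and reports
// whether the result contains any non-whitespace character.
func promptText(parts []part) (string, bool) {
	var sb strings.Builder
	for _, p := range parts {
		if p.Type == "text" {
			sb.WriteString(p.Text)
		}
	}
	text := sb.String()
	hasContent := strings.IndexFunc(text, func(r rune) bool {
		return !unicode.IsSpace(r)
	}) >= 0
	return text, hasContent
}

func main() {
	limit, _ := effectiveLimit(0)
	text, ok := promptText([]part{
		{Type: "text", Text: "hello"},
		{Type: "file", Text: "ignored"},
		{Type: "text", Text: " "},
		{Type: "text", Text: "world"},
	})
	fmt.Println(limit, text, ok)
}
```

Keeping the whitespace-only part in the joined text, rather than dropping it, is what lets the result agree with the SQL `string_agg` output for the `["hello", " ", "world"]` case.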
### Tests - `TestGetChatUserPrompts` (`coderd/exp_chats_test.go`) covers: - `NewestFirstFiltering` — multi-part concatenation, non-text parts skipped, whitespace-only filtered, soft-deleted excluded, `model`-only visibility excluded, assistant-role excluded by `cm.role = 'user'`, legacy V0 scalar row silently excluded by the `jsonb_typeof` guard, ordering newest first. - `LimitClampsResults` — explicit `limit=2` returns the two newest prompts. - `InvalidLimitRejected` — `limit=5000` is `400 Bad Request`. - `NotFoundForOtherUsers` — a separate user in the same org gets `404`, not the prompts. - `EmptyResultIsJSONArray` — zero-message chat and assistant-only chat both return `Prompts: []` (non-nil, empty). - `messageParsing.test.ts` adds a unit test asserting that `getEditableUserMessagePayload(["hello", " ", "world"])` returns `"hello world"`, locking in the agreement with the SQL `string_agg`. - `dbauthz_test.go` adds the `MethodTestSuite.TestChats/GetChatUserPromptsByChatID` entry, asserting the parent-chat `policy.ActionRead`. ## Out of scope - Cross-chat history. - Per-user opt-out for the cycle. - File-reference / attachment cycling — the cycle still reproduces plain text only, by design. </details> <details> <summary>coder-agents-review history</summary> Four review rounds, eight unique findings, all addressed in this PR (approved twice). Rebased onto `main` twice after R4: first to pick up new migrations `000491` / `000492`, then again for `000493_idx_chat_diff_statuses_url_lower`. The prompts-index migration was renumbered `000491 → 000493 → 000494` via `coderd/database/migrations/fix_migration_numbers.sh`; no other diff changes. 
| Round | Head | Outcome |
|---|---|---|
| R1 | `725422ab` | `COMMENTED`: 7 findings (DEREM-1..7) |
| R2 | `ab2a8936` | `COMMENTED`: 1 new (DEREM-10) + 1 reraised (DEREM-5) |
| R3 | `648c5d1f` | `APPROVED`: 7 fixed, DEREM-5 deferred via #25125 |
| R4 | `93b6f450` | `APPROVED`: DEREM-5 also fixed in-PR, #25125 closed |

| ID | Where | Resolution |
|---|---|---|
| DEREM-1 | `chats.sql` | Added `jsonb_typeof(content) = 'array'` guard against V0 scalar rows |
| DEREM-2 | `exp_chats.go` | Removed dead `limit < 0` branch (SDK rejects upstream) |
| DEREM-3 | `useChatStore.ts` | Rewrote misleading invalidation comment |
| DEREM-4 | `exp_chats_test.go` | `NewestFirstFiltering` now inserts an assistant-role message so the `role='user'` filter is exercised end-to-end |
| DEREM-5 | `messageParsing.ts` | Rewrote `getEditableUserMessagePayload` to concatenate text parts verbatim, mirroring the SQL `string_agg` |
| DEREM-6 | `exp_chats.go` | Tightened swagger doc + error message to spell out the 0–2000 range |
| DEREM-7 | `exp_chats_test.go` | Added `EmptyResultIsJSONArray` subtest |
| DEREM-10 | `exp_chats_test.go` | `NewestFirstFiltering` now inserts a raw V0 scalar-content row; verified locally that removing the guard makes the test fail |

</details>

---

This PR was created on behalf of @ibetitsmike by Coder Agents.	2026-05-14 12:43:12 +02:00
Jaayden Halko	024132e8a4	feat: add theme_mode, theme_light, theme_dark to UserAppearanceSettings (#25076 ) Part 1: Backend portion of a change broken into 2 PRs. Part 2: #25077 Adds three new UserAppearanceSettings fields (theme_mode, theme_light, theme_dark) on top of the existing theme_preference and terminal_font. Replaces GetUserThemePreference and GetUserTerminalFont with a single GetUserAppearanceSettings aggregate query. The PUT handler is wrapped in db.InTx so sync-mode's mode + slot writes can never half-apply.	2026-05-14 05:44:05 +01:00
Zach	e0be9bf213	feat: surface missing coder_secret requirements on resolve-autostart (#25081) Adds `dynamicparameters.EvaluateSecretMismatch` as a shared helper on top of the existing renderer, then wires it into the resolve-autostart handler so the UI can surface unsatisfied `coder_secret` requirements in a template alongside parameter mismatch for autostart. The lifecycle executor changes, which depend on this helper, will land in a follow-up. The UI changes that consume the new `secret_mismatch` field are also a follow-up. Generated with assistance from Coder Agents.	2026-05-13 14:20:02 -06:00
Yevhenii Shcherbina	b5e1ea33d8	feat: add AI budget policy and period deployment config (#25122 ) Closes https://linear.app/codercom/issue/AIGOV-283/add-deployment-config-for-ai-budget-policy-and-period Adds `CODER_AI_BUDGET_POLICY` and `CODER_AI_BUDGET_PERIOD` deployment options for AI Governance cost controls.	2026-05-12 10:48:36 -04:00
Kyle Carberry	b0b07536fc	feat: add opt-in Coder identity headers for MCP servers (#25153 )	2026-05-12 08:54:53 -04:00
Michael Suchacz	f1d160c7f4	fix: allow changing model when editing earlier chat message (#25084 ) Editing a previous user message and selecting a different model in the picker silently kept using the original model: the selection was dropped on the frontend, in the SDK, and in the backend, so both the replacement user message and the assistant turn that followed ran against the old model. Plumb the selected model through all three layers (`AgentChatPage`, `codersdk.EditChatMessageRequest`, `chatd.EditMessageOptions` / `Server.EditMessage`), defaulting to the original message's model when the client does not specify one. The existing `InsertChatMessages` CTE already advances `chats.last_model_config_id` when the inserted message's model differs, so the assistant turn picks up the new selection without further changes. The new model is validated inside the transaction, so an unknown ID rolls the edit back and returns a 400 `Invalid model config ID.`, mirroring the `SendMessage` path. Refs: CODAGT-345 This change was generated by a Coder agent. <details> <summary>Implementation plan</summary> # CODAGT-345: Editing an earlier message cannot change model ## Problem When editing a previous user message in a chat, the user can change the model in the model picker, but the backend keeps using the original message's model. The model selection is dropped at three layers: 1. Frontend: `AgentChatPage.tsx`'s edit branch builds an `EditChatMessageRequest` that omits `model_config_id`. The new-message branch (a few lines below) does include it. 2. SDK: `codersdk.EditChatMessageRequest` has no `ModelConfigID` field at all. 3. Backend: `chatd.EditMessageOptions` has no model field, and `Server.EditMessage` always copies the original message's `ModelConfigID` into the replacement message. 
Once the replacement user message is inserted with the original model, the `InsertChatMessages` CTE leaves `chats.last_model_config_id` unchanged, so the assistant turn that follows runs against the old model. ## Fix Plumb the selected model through all three layers, defaulting to the original message's model when the client doesn't override it. This mirrors the `SendMessage` path, which already accepts a `model_config_id` and validates it via `resolveSendMessageModelConfigID`. ### Backend - `codersdk/chats.go`: add `ModelConfigID *uuid.UUID` to `EditChatMessageRequest`. - `coderd/x/chatd/chatd.go`: - Add `ModelConfigID uuid.UUID` to `EditMessageOptions`. - In `EditMessage`, after fetching the edited message, resolve the model: if `opts.ModelConfigID != uuid.Nil`, validate it exists with `tx.GetChatModelConfigByID` (using `chatdModelConfigLookupContext`), otherwise keep `editedMsg.ModelConfigID.UUID`. Pass the resolved ID into `newChatMessage(...)`. - Reuse the existing `ErrInvalidModelConfigID` sentinel. - `coderd/exp_chats.go` (`patchChatMessage`): - Read `req.ModelConfigID` (nil-safe), pass into `chatd.EditMessageOptions`. - Add a `case xerrors.Is(editErr, chatd.ErrInvalidModelConfigID)` arm returning 400 `Invalid model config ID.`, matching the `postChatMessages` handler. ### Frontend - `site/src/pages/AgentsPage/AgentChatPage.tsx`: - In the edit branch, set `model_config_id: effectiveSelectedModel \|\| undefined` on the `EditChatMessageRequest`. - On success, persist the chosen model to `lastModelConfigIDStorageKey` so the next chat from this browser keeps the same default. Mirrors the new-message branch. ### Generated - `make site/src/api/typesGenerated.ts` and `make coderd/apidoc/swagger.json` produce the updated `EditChatMessageRequest` schema in `typesGenerated.ts`, `coderd/apidoc/{docs.go,swagger.json}`, and `docs/reference/api/{chats.md,schemas.md}`. 
## Tests - `coderd/x/chatd/chatd_test.go`: - `TestEditMessageWithModelConfigOverride`: edit with a different model -> replacement message and `chats.LastModelConfigID` use the new model. - `TestEditMessagePreservesModelConfigByDefault`: edit without `ModelConfigID` -> original model preserved. - `TestEditMessageRejectsUnknownModelConfig`: passes a random UUID -> `ErrInvalidModelConfigID`, original message still present, `LastModelConfigID` unchanged (rollback). - `coderd/exp_chats_test.go` (under `TestPatchChatMessage`): - `ChangesModel`: end-to-end via SDK; `edited.Message.ModelConfigID` and `chat.LastModelConfigID` both match the new model. - `InvalidModelConfigID`: random UUID -> 400 `Invalid model config ID.`. </details>	2026-05-12 14:51:55 +02:00
Thomas Kosiewski	5c3b59151e	feat: add Cmd/Ctrl+Enter send setting (#25062 ) Adds an Agents General setting to require Cmd/Ctrl+Enter before sending chat messages. When enabled, plain Enter inserts a newline in agent chat inputs while the send button remains available. The preference is now persisted server-side through `/api/v2/users/{user}/preferences`, alongside the existing user preference settings, and is applied to both the create-agent input and existing chat composer. Storybook and API coverage verify the setting, keyboard behavior, validation, and persistence. <details> <summary>Coder Agents notes</summary> Generated by Coder Agents from a Slack request. Dogfooded with agent-browser against the Storybook settings and chat input stories. </details>	2026-05-12 10:09:34 +02:00
Thomas Kosiewski	e56381eb61	feat: stream advisor tool output (#25032 ) Stream advisor output into the advisor tool card while the nested advisor call is still running. This keeps the advisor implementation intentionally advisor-specific: the parent model still receives the same final structured tool result, while the frontend receives transient `tool-result.result_delta` parts to render partial advisor text in the expanded card. The final persisted chat history remains unchanged. Refs CODAGT-322. Generated by Coder Agents. <details> <summary>Implementation plan</summary> - Publish advisor text deltas from the nested `chatloop.Run` via `RunAdvisorOptions.OnAdviceDelta`. - Forward those deltas through `chatadvisor.Tool` with the parent advisor tool call ID. - Emit transient `ChatMessagePartTypeToolResult` websocket parts with `ResultDelta` from `chatd`. - Add `result_delta` to the generated tool-result TypeScript variant. - Accumulate tool result deltas in frontend stream state and keep the tool running until the final result arrives. - Render streamed advisor advice in the existing advisor card using streaming markdown mode, while retaining the updated advisor UI. </details>	2026-05-11 20:18:49 +02:00
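The delta-accumulation step in the plan above (collect transient `result_delta` parts per tool call and keep the tool running until the final result arrives) might look roughly like the sketch below. The real accumulation lives in the frontend stream state; this Go version only illustrates the state machine. `toolState`, `streamState`, `applyDelta`, and `applyFinal` are hypothetical names, and letting the final result replace the accumulated text is an assumption.

```go
package main

import "fmt"

// toolState tracks one tool call's partial output.
type toolState struct {
	Text string // accumulated partial text, or the final result once Done
	Done bool   // false while the tool is still considered running
}

// streamState maps a tool call ID to its state.
type streamState map[string]*toolState

// applyDelta appends a transient delta; the tool stays in the running state.
func (s streamState) applyDelta(callID, delta string) {
	st, ok := s[callID]
	if !ok {
		st = &toolState{}
		s[callID] = st
	}
	if st.Done {
		return // ignore late deltas after the final result (assumption)
	}
	st.Text += delta
}

// applyFinal records the final result and marks the tool as finished.
// Replacing the accumulated text with the final result is an assumption.
func (s streamState) applyFinal(callID, result string) {
	s[callID] = &toolState{Text: result, Done: true}
}

func main() {
	s := streamState{}
	s.applyDelta("call-1", "Consider ")
	s.applyDelta("call-1", "adding tests.")
	fmt.Printf("%q running=%v\n", s["call-1"].Text, !s["call-1"].Done)

	s.applyFinal("call-1", "Consider adding tests.")
	fmt.Printf("%q running=%v\n", s["call-1"].Text, !s["call-1"].Done)
}
```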
Steven Masley	19573e8aee	feat!: patchTemplateMeta to use optional fields (#24984 ) Closes https://github.com/coder/coder/issues/13112 Breaking Change: Removed status code `StatusNotModified` when no diffs occur in a patch. Now the patch is always applied and a template is always returned.	2026-05-11 12:43:52 -05:00
Jeremy Ruppel	a1dbd758bc	feat: add template builder deployment config and telemetry types (#25082 )	2026-05-11 09:48:55 -04:00
Marcin Tojek	febabfb8b2	feat: add request/response dump support to aibridgeproxyd (#24837 ) Closes https://github.com/coder/coder/issues/24335	2026-05-11 10:59:26 +02:00
Kyle Carberry	aaa0dacdb3	fix: infer workspace claim time from build history for /agents delete dialog (#25057 ) Closes [CODAGT-317](https://linear.app/codercom/issue/CODAGT-317/pr-workspaces-sometimes-require-name-confirmation-to-delete). ## Problem The `/agents` archive-and-delete molly-guard (typing the workspace name) was firing for chats that had clearly created their own workspace. The heuristic in `resolveArchiveAndDeleteAction` decides whether confirmation is needed by comparing the workspace's `created_at` against the chat's `created_at`: ```ts return new Date(workspaceCreatedAt) >= new Date(chatCreatedAt); ``` That assumption breaks for prebuilt workspaces. `ClaimPrebuiltWorkspace` rewrites `owner_id`, `name`, `updated_at`, `last_used_at`, etc., but never touches `created_at`, which still reflects when the prebuild was provisioned by the reconciler, often hours before the chat exists. Result: every prebuild-claimed workspace looks pre-existing, so the molly-guard fires. Concrete example from a real chat: \| Field \| Value \| \|---\|---\| \| `chat.created_at` \| `2026-05-07T15:12:23Z` \| \| `workspace.created_at` (provision) \| `2026-05-07T14:22:24Z` \| \| `latest_build.created_at` (claim) \| `2026-05-07T15:19:09Z` \| `14:22:24 < 15:12:23` so `isWorkspaceAutoCreated` returned false even though the chat issued the claim. ## Fix (frontend-only) Derive the moment a workspace was acquired from existing build history rather than relying on `workspace.created_at`: - Build #1 initiator = prebuilds system user → workspace was a prebuild → use `build_2.created_at` (the claim build) as the acquisition time. - Build #1 initiator = real user → workspace was created from scratch → use `workspace.created_at` (unchanged behavior). - Unclaimed prebuild or no build history → return `null` (force confirmation; safe degradation for a destructive flow). The resolver fetches the build list via the existing `getWorkspaceBuilds` endpoint when the dialog might fire. 
No new column, no migration, no schema change. Works retroactively for all existing claimed prebuilds; no backfill needed. The prebuilds system user UUID is exposed via `codersdk.PrebuildsSystemUserID` and typegen'd to `typesGenerated.ts`. `coderd/database.PrebuildsSystemUserID` parses that constant via `uuid.MustParse` so the two cannot drift; if the codersdk literal ever changes, package init fails fast. ## History The first draft of this PR added a `workspaces.claimed_at` column populated by `ClaimPrebuiltWorkspace`. After review feedback from @johnstcn pointing out that the same fact is already implicit in build history, I pivoted to the frontend-only approach. Subsequent review notes consolidated the prebuilds system user UUID into a single typegen'd constant. ## Why not the other open PRs - #25055 (`chatKey` cache fallback) only fixes a different cache-miss path; it explicitly notes it does not address `created_at < chat.created_at`. - #25053 (`chats.workspace_auto_created` boolean) puts the truth on the wrong side of the schema: "this workspace was claimed at time T" is a property of the workspace, not the chat. The MCP plumbing it adds is also unnecessary now that the same answer is available from build history. ## Test plan - `pnpm vitest run --project=unit src/pages/AgentsPage/utils/agentWorkspaceUtils.test.ts` — 40/40 pass; new cases cover prebuild claim before/after chat, unclaimed prebuild, missing-build-history fallback, and the fetch-skip when the chat is not in cache. - `pnpm lint:types`, `pnpm check`, `make pre-commit`. <details> <summary>Disclosure</summary> Opened on behalf of @kylecarbs by [Coder Agents](https://coder.com/coder-agents). </details>	2026-05-10 11:04:55 -04:00
Yevhenii Shcherbina	4124d1137d	feat: add ai_model_prices table (#24932 ) # Summary Implements https://linear.app/codercom/issue/AIGOV-282/add-ai-model-price-table-and-seed-generator This PR lays the groundwork for AI Bridge cost controls (per the AI Governance RFC). It adds the foundation needed for future cost tracking: a place to store per-model token prices, a way to keep those prices in sync with upstream pricing data, and a startup mechanism that ensures every deployment has prices loaded before AI Bridge starts processing requests. The price data comes from [models.dev](https://models.dev/), a community-maintained catalogue of AI provider pricing. A generator script fetches the latest prices, filters to Anthropic and OpenAI for now, and produces a seed file checked into the repository. On every server startup the seed is applied to the database, so new releases automatically pick up any price corrections that landed since the previous one. Existing rows are overwritten with the latest prices; rows for models no longer in the seed are left untouched. # Batching the AI model price seed: three approaches Context: at server startup we seed the `ai_model_prices` table from an embedded JSON price book (~70 rows today, will grow as we add providers, potentially 4000+). Each row is: ```text (provider, model, input_price, output_price, cache_read_price, cache_write_price) ``` Any of the four price columns can be: - `NULL` → “price unknown for this dimension” - explicit `0` → “free” The batch must be an UPSERT so re-running is idempotent and existing rows pick up new prices. We considered three implementations. --- ## Approach 1 — Per-row UPSERT in a Go loop ```go for _, row := range rows { if err := db.UpsertAIModelPrice(ctx, database.UpsertAIModelPriceParams{ Provider: row.Provider, Model: row.Model, InputPrice: nullInt64(row.InputPrice), // ... }); err != nil { return err } } ``` ### Pros - Trivial. - NULL handling falls out naturally from `sql.NullInt64`. 
### Cons

- `N` round-trips per seed.
- With ~70 rows that means ~70 statement executions on every startup, even inside a transaction.
- Doesn't scale gracefully as the price book grows, potentially to 4000+ rows.

---

## Approach 2 — `UNNEST` with parallel arrays

Pass each column as a separate Go slice. Postgres unnests them in parallel into a virtual table, then `INSERT ... SELECT`.

```sql
INSERT INTO ai_model_prices (
    provider, model, input_price, output_price, cache_read_price, cache_write_price
)
SELECT
    UNNEST(@providers::text[]),
    UNNEST(@models::text[]),
    NULLIF(UNNEST(@input_prices::bigint[]), -1),
    NULLIF(UNNEST(@output_prices::bigint[]), -1),
    NULLIF(UNNEST(@cache_read_prices::bigint[]), -1),
    NULLIF(UNNEST(@cache_write_prices::bigint[]), -1)
ON CONFLICT (provider, model) DO UPDATE SET
    input_price = EXCLUDED.input_price,
    output_price = EXCLUDED.output_price,
    cache_read_price = EXCLUDED.cache_read_price,
    cache_write_price = EXCLUDED.cache_write_price,
    updated_at = NOW();
```

Go side: flatten rows into six parallel slices. Use a sentinel (`-1`) for “missing”, since `lib/pq` can't encode `NULL` into a `bigint[]` element.

```go
providers := make([]string, len(rows))
models := make([]string, len(rows))
inputs := make([]int64, len(rows))
outputs := make([]int64, len(rows))
cacheR := make([]int64, len(rows))
cacheW := make([]int64, len(rows))

for i, r := range rows {
	providers[i] = r.Provider
	models[i] = r.Model

	// -1 is the "missing" sentinel; NULLIF in the query maps it back to NULL.
	// The price fields are pointers, so they must be dereferenced.
	inputs[i] = -1
	if r.InputPrice != nil {
		inputs[i] = *r.InputPrice
	}
	outputs[i] = -1
	if r.OutputPrice != nil {
		outputs[i] = *r.OutputPrice
	}
	cacheR[i] = -1
	if r.CacheReadPrice != nil {
		cacheR[i] = *r.CacheReadPrice
	}
	cacheW[i] = -1
	if r.CacheWritePrice != nil {
		cacheW[i] = *r.CacheWritePrice
	}
}

return db.UpsertAIModelPrices(ctx, database.UpsertAIModelPricesParams{
	Providers:        providers,
	Models:           models,
	InputPrices:      inputs,
	OutputPrices:     outputs,
	CacheReadPrices:  cacheR,
	CacheWritePrices: cacheW,
})
```

### Pros

- Single round-trip.
### Cons - The generated `sqlc` params become plain `[]int64`, which can't represent `NULL`. --- ## Approach 3 — `jsonb_array_elements` over a single `@seed::jsonb` (chosen) Pass the raw seed JSON as one parameter; let Postgres expand and parse it. ```sql INSERT INTO ai_model_prices ( provider, model, input_price, output_price, cache_read_price, cache_write_price ) SELECT elem->>'provider', elem->>'model', (elem->>'input_price')::bigint, (elem->>'output_price')::bigint, (elem->>'cache_read_price')::bigint, (elem->>'cache_write_price')::bigint FROM jsonb_array_elements(@seed::jsonb) AS elem ON CONFLICT (provider, model) DO UPDATE SET input_price = EXCLUDED.input_price, output_price = EXCLUDED.output_price, cache_read_price = EXCLUDED.cache_read_price, cache_write_price = EXCLUDED.cache_write_price, updated_at = NOW(); ``` Go side reduces to: ```go return db.UpsertAIModelPrices(ctx, seedJSON) ``` ### Pros - Single round-trip. - NULLs fall out naturally: - `(elem->>'cache_write_price')::bigint` becomes `NULL` - no sentinels - The seed is already JSON: - Existing precedent: - `jsonb_array_elements` is already used elsewhere in the codebase ### Cons - Less type-safe at the SQL boundary than `UNNEST` - Slightly less standard than `UNNEST` - Readers need familiarity with: - `jsonb_array_elements` - `->>` extraction syntax - Postgres pays JSON parse cost - negligible at our scale --- --- # Decision We picked Approach 3. It collapses the round-trips like `UNNEST` does, but without: - nullable-array workarounds - sentinel values	2026-05-08 16:45:14 -04:00
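To make the "unknown" versus "free" distinction from the write-up above concrete, here is a small Go sketch showing how a JSON seed with `null`, missing, and explicit `0` prices decodes when the fields are pointers. The `priceRow` struct, `parseSeed`, and the sample provider and model names are hypothetical; only the six column names come from the description above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// priceRow is a hypothetical Go view of one seed entry. Pointer fields keep
// "unknown" (nil) distinct from "free" (pointer to 0).
type priceRow struct {
	Provider        string `json:"provider"`
	Model           string `json:"model"`
	InputPrice      *int64 `json:"input_price"`
	OutputPrice     *int64 `json:"output_price"`
	CacheReadPrice  *int64 `json:"cache_read_price"`
	CacheWritePrice *int64 `json:"cache_write_price"`
}

// parseSeed decodes a JSON array of price rows.
func parseSeed(data []byte) ([]priceRow, error) {
	var rows []priceRow
	if err := json.Unmarshal(data, &rows); err != nil {
		return nil, err
	}
	return rows, nil
}

// describe renders a price pointer for display.
func describe(p *int64) string {
	if p == nil {
		return "unknown"
	}
	if *p == 0 {
		return "free"
	}
	return fmt.Sprint(*p)
}

func main() {
	// Sample data for illustration only; not taken from the real seed file.
	seed := []byte(`[
		{"provider": "example", "model": "example-model",
		 "input_price": 0, "output_price": null, "cache_write_price": 15}
	]`)
	rows, err := parseSeed(seed)
	if err != nil {
		panic(err)
	}
	r := rows[0]
	fmt.Println(describe(r.InputPrice), describe(r.OutputPrice),
		describe(r.CacheReadPrice), describe(r.CacheWritePrice))
}
```

A JSON `null` and a missing key both decode to a nil pointer here, which parallels how `(elem->>'key')::bigint` yields SQL `NULL` in both cases, so no sentinel value is needed on either side.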
Danielle Maywood	e7958713a9	feat: add code diff display mode preference (#25027 )	2026-05-07 20:15:28 +01:00


1573 Commits