coder

mirror of https://github.com/coder/coder.git synced 2026-06-03 13:08:25 +00:00

Author	SHA1	Message	Date
Michael Suchacz	c3b6284955	feat: add chat cost analytics backend (#23036 ) Add cost tracking for LLM chat interactions with microdollar precision. ## Changes - Add `chatcost` package for per-message cost calculation using `shopspring/decimal` for intermediate arithmetic - Ceil rounding policy: fractional micros round UP to next whole micro (applied once after summing all components) - Database migration: `total_cost_micros` BIGINT column with historical backfill and `created_at` index - API endpoints: per-user cost summary and admin rollup under `/api/experimental/chats/cost/` - SDK types: `ChatCostSummary`, `ChatCostModelBreakdown`, `ChatCostUserRollup` - Fix `modeloptionsgen` to handle `decimal.Decimal` as opaque numeric type - Update frontend pricing test fixtures for string decimal types ## Design decisions - `NULL` = unpriced (no matching model config), `0` = free - Reasoning tokens included in output tokens (no double-counting) - Integer microdollars (BIGINT) for storage and API responses - Price config uses `decimal.Decimal` for exact parsing; totals use `int64` Frontend: #23037	2026-03-13 18:30:49 +01:00
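The ceil-once rounding policy described in the commit above can be sketched as follows. This is a minimal illustration, not the `chatcost` package itself: it uses the standard library's `math/big` in place of `shopspring/decimal`, and the function name `costMicros` and the "USD per one million tokens" price unit are assumptions made for the example.

```go
package main

import (
	"fmt"
	"math/big"
)

// costMicros is a hypothetical helper. Prices are decimal strings in USD per
// one million tokens (an assumed unit), so price * tokens is already the cost
// in microdollars. All components are summed exactly, then rounded up once.
func costMicros(inputTokens, outputTokens int64, inputPrice, outputPrice string) (int64, error) {
	in, ok := new(big.Rat).SetString(inputPrice)
	if !ok {
		return 0, fmt.Errorf("invalid input price %q", inputPrice)
	}
	out, ok := new(big.Rat).SetString(outputPrice)
	if !ok {
		return 0, fmt.Errorf("invalid output price %q", outputPrice)
	}

	// Exact intermediate arithmetic: no rounding per component.
	total := new(big.Rat).Mul(in, new(big.Rat).SetInt64(inputTokens))
	total.Add(total, new(big.Rat).Mul(out, new(big.Rat).SetInt64(outputTokens)))

	// Ceil: truncate, then bump by one if a positive fraction remains.
	q, r := new(big.Int).QuoRem(total.Num(), total.Denom(), new(big.Int))
	if r.Sign() > 0 {
		q.Add(q, big.NewInt(1))
	}
	if !q.IsInt64() {
		return 0, fmt.Errorf("cost does not fit in int64")
	}
	return q.Int64(), nil
}

func main() {
	// 3 input and 3 output tokens at 0.5 USD per million each: 1.5 + 1.5 micros.
	micros, err := costMicros(3, 3, "0.5", "0.5")
	if err != nil {
		panic(err)
	}
	fmt.Println("total micros:", micros)
}
```

Rounding once after summing matters: rounding each 1.5-micro component up separately would give 4, while summing first gives exactly 3.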
Cian Johnston	e9025f91e8	chore(db): remove 23 unused database methods (#22999 ) Removes 22 database query methods with no callers outside generated code and the dbauthz wrapper layer (~1,600 lines). Security keys (6) — superseded by `cryptokeys` package: `GetAppSecurityKey`, `UpsertAppSecurityKey`, `GetOAuthSigningKey`, `UpsertOAuthSigningKey`, `GetCoordinatorResumeTokenSigningKey`, `UpsertCoordinatorResumeTokenSigningKey` Superseded queries (4): - `GetProvisionerJobsByIDs` → `GetProvisionerJobsByIDsWithQueuePosition` - `GetDeploymentDAUs` / `GetTemplateDAUs` → `GetTemplateInsightsByInterval` - `GetWorkspaceBuildParametersByBuildIDs` + its `GetAuthorized...` variant → unused OAuth2 (2): `GetOAuth2ProviderAppByRegistrationToken`, `UpdateOAuth2ProviderAppSecretByID` Chat (4) — pre-wired with no callers: `GetChatModelConfigByProviderAndModel`, `DeleteChatMessagesByChatID`, `ListChatsByRootID`, `ListChildChatsByParentID` Other (6): `DeleteGitSSHKey`, `UpdateUserLinkedID`, `GetFileIDByTemplateVersionID`, `GetTemplateVersionHasAITask`, `InsertUserGroupsByName`, `RemoveUserFromAllGroups`	2026-03-12 21:32:57 +00:00
Kyle Carberry	58f295059c	fix: grant chatd ActionReadPersonal on User and parallelize runChat DB calls (#22970 ) ## Problem 1. Personal behavior prompt not applied: The chatd background worker was missing `ActionReadPersonal` on `ResourceUser` in its RBAC subject. When `resolveUserPrompt` calls `GetUserChatCustomPrompt`, the dbauthz layer checks `ActionReadPersonal` on the user — which the chatd role didn't have. The error was silently swallowed (returns `""`), so the user's custom prompt was never injected into the system messages. 2. Sequential DB calls on chat startup: Several independent database queries in `runChat` and `resolveChatModel` were running sequentially, adding unnecessary latency before the LLM stream begins. ## Changes ### RBAC fix (`dbauthz.go`) - Add `rbac.ResourceUser.Type: {policy.ActionReadPersonal}` to `subjectChatd` site permissions - This is the minimal permission needed — `ActionRead` on User remains denied ### Parallelization (`chatd.go`) Three parallelization points using `errgroup.Group`: 1. `resolveChatModel`: `resolveModelConfig` and `GetEnabledChatProviders` run concurrently (both needed for `ModelFromConfig`, which stays sequential after the wait) 2. `runChat` startup: `resolveChatModel` and `GetChatMessagesForPromptByChatID` run concurrently (completely independent) 3. `runChat` prompt assembly: `resolveInstructions` and `resolveUserPrompt` run concurrently (both produce strings; `InsertSystem` calls maintain correct order after the wait) Same pattern applied to the `ReloadMessages` callback. ### Test (`dbauthz_test.go`) - Add assertion in `TestAsChatd/AllowedActions` that `ActionReadPersonal` on `ResourceUser` is permitted	2026-03-11 22:07:46 +00:00
Kyle Carberry	1f37df4db3	perf(chatd): fix six scale bottlenecks identified by benchmarking (#22957 ) ## Summary Scale-tested the `chatd` package with mock-based benchmarks to identify performance bottlenecks. This PR fixes 6 of the 8 identified issues, ranked by severity. ## Changes ### 1. Parallel tool execution (HIGH) — `chatloop.go` `executeTools` ran tool calls sequentially. Now dispatches all calls concurrently via goroutines with `sync.WaitGroup`. Results are pre-allocated by index (no mutex needed). `onResult` callbacks fire as each tool completes. ### 2. Pubsub-backed subagent await (HIGH) — `subagent.go` `awaitSubagentCompletion` polled the DB every 200ms. Now subscribes to the child chat's `ChatStreamNotifyChannel` via pubsub for near-instant notifications. Fallback poll reduced to 5s. Falls back to 200ms only when `pubsub == nil` (single-instance / in-memory). ### 3. Per-chat stream locking (MEDIUM) — `chatd.go` Replaced single global `streamMu` + `map[uuid.UUID]*chatStreamState` with `sync.Map` where each `chatStreamState` has its own `sync.Mutex`. Zero cross-chat contention. ### 4. Batch chat acquisition (MEDIUM) — `chatd.go` `processOnce` acquired 1 chat per tick. Now loops up to `maxChatsPerAcquire = 10` per tick, avoiding idle time when many chats are pending. ### 5. Reduced heartbeat frequency (LOW-MEDIUM) — `chatd.go` `chatHeartbeatInterval` changed from 30s to 60s. Safe given the 5-minute `DefaultInFlightChatStaleAfter`. ### 6. O(depth) descendant check (LOW) — `subagent.go` Replaced top-down BFS (`O(total_descendants)` queries) with bottom-up parent-chain walk (`O(depth)` queries). Includes cycle protection. ## Not addressed (intentionally) - Message serialization overhead - Buffer eviction (`buffer[1:]` pattern)	2026-03-11 14:00:08 -04:00
Cian Johnston	bc27274aba	feat(coderd): refactors github pr sync functionality (#22715 ) - Adds `_API_BASE_URL` to `CODER_EXTERNAL_AUTH_CONFIG_` - Extracts and refactors existing GitHub PR sync logic to new packages `coderd/gitsync` and `coderd/externalauth/gitprovider` - Associated wiring and tests Created using Opus 4.6	2026-03-10 18:46:01 +00:00
Kyle Carberry	b6d1a11c58	feat(chatd): add user-level custom prompt for agent chats (#22896 ) Adds a user-level custom prompt to the database. I'll be doing a follow-up for the UI, as we currently do not have user-level settings (it's just admin). I'll also make it very obvious for chats where there is a user-level prompt, but I don't know how yet.	2026-03-10 11:17:52 -04:00
Danielle Maywood	6489d6f714	feat(chatd): use last assistant message as push notification summary (#22671 ) Instead of the static 'Agent has finished running.' text, extract a summary from the last assistant message to give users meaningful context about what the agent accomplished. Falls back to the static text if no suitable message is found. Co-authored-by: Kyle Carberry <kyle@carberry.com>	2026-03-10 15:14:15 +00:00
Cian Johnston	c933ddcffd	fix(agents): persist system prompt server-side instead of localStorage (#22857 ) ## Problem The Admin → Agents → System Prompt textarea saved only to the browser's `localStorage`. The value was never sent to the backend, never stored in the database, and never injected into chats. Entering text, clicking Save, and refreshing the page showed no changes — the prompt was effectively a no-op. ## Root Cause Three disconnected layers: 1. Frontend wrote to `localStorage`, never called an API. 2. `handleCreateChat` never read `savedSystemPrompt`. 3. Backend hardcoded `chatd.DefaultSystemPrompt` on every chat creation — no field in `CreateChatRequest` accepted a custom prompt. ## Changes ### Database - Added `GetChatSystemPrompt` / `UpsertChatSystemPrompt` queries on the existing `site_configs` table (no migration needed). ### API - `GET /api/experimental/chats/system-prompt` — returns the configured prompt (any authenticated user). - `PUT /api/experimental/chats/system-prompt` — sets the prompt (admin-only, `rbac: deployment_config update`). - Input validation: max 32 KiB prompt length. ### Backend - `resolvedChatSystemPrompt(ctx)` checks for a custom prompt in the DB, falls back to `chatd.DefaultSystemPrompt` when empty/unset. - Logs a warning on DB errors instead of silently swallowing them. - Replaced the hardcoded `defaultChatSystemPrompt()` call in chat creation. ### Frontend - Replaced `localStorage` read/write with React Query `useQuery`/`useMutation` backed by the new endpoints. - Fixed `useEffect` draft sync to avoid clobbering in-progress user edits on refetch. - Added `try/catch` error handling on save (draft stays dirty for retry). - Save button disabled during mutation (`isSavingSystemPrompt`). - Query key follows kebab-case convention (`chat-system-prompt`). ### UX - Added hint: "When empty, the built-in default prompt is used." ### Tests - `TestChatSystemPrompt`: GET returns empty when unset, admin can set, non-admin gets 403. 
- dbauthz `TestMethodTestSuite` coverage for both new querier methods.	2026-03-10 11:46:53 +00:00
Mathias Fredriksson	a104d608a3	feat: add file/image attachment support to chat input (#22604 ) This change adds support for image attachments to chat via add button and clipboard paste. Files are stored in a new `chat_files` table and referenced by ID in message content. File data is resolved from storage at LLM dispatch time, keeping the message content column small. Upload validates MIME types via content type or content sniffing against an allowlist (png, jpeg, gif, webp). The retrieval endpoint serves files with immutable caching headers. On the frontend, uploads start eagerly on attach with a background fetch to pre-warm the browser HTTP cache so the timeline renders instantly after send.	2026-03-06 21:05:26 +02:00
Kayla はな	56bdea73b8	feat: add workspace acls to task rbac objects (#22311 ) To allow tasks to be shareable, we need to share both the `task` resource and the `workspace` resource, and their sharing state needs to be kept in sync. We've already implemented all of the necessary ACL functionality for workspaces, so we can just sort of proxy those ACLs back to the task as well.	2026-03-05 13:40:53 -07:00
Danielle Maywood	d2d956edb1	fix: add archived query parameter to chat list endpoint (#22562 ) Despite the SDK type having an `Archived` field for chats, this data was never fetched from the database — the `GetChatsByOwnerID` query hardcoded `AND archived = false`, and the `convertChat` function never mapped the field. This PR adds an optional `archived` query parameter to `GET /api/experimental/chats`: \| Value \| Behavior \| \|-------\|----------\| \| (not provided) \| Returns all chats (active and archived) \| \| `archived=false` \| Returns only non-archived chats \| \| `archived=true` \| Returns only archived chats \| This follows the same pattern used by template versions (`sqlc.narg('archived')` nullable boolean). Also fixes `convertChat` to populate the `Archived` field in API responses, which was never being set despite existing on the SDK type.	2026-03-03 20:39:19 +00:00
Danny Kopping	1b08bc76a6	feat: store tool call IDs to determine interception lineage (#22246 ) Adds database columns and server-side logic to track interception lineage via tool call IDs. When an interception ends, the server resolves the correlating tool call ID to find the parent interception and links them via `parent_id`. New `provider_tool_call_id` column on `aibridge_tool_usages` and `parent_id` column on `aibridge_interceptions`, with indexes for lookup. `findParentInterceptionID` queries by tool call ID and filters out the current interception to find the parent. Adapted from the [coder/coder `dk/prompt_provenance_poc`](https://github.com/coder/coder/compare/main...dk/prompt_provenance_poc) branch. Depends on [coder/aibridge#188](https://github.com/coder/aibridge/pull/188). Closes https://github.com/coder/internal/issues/1334	2026-03-03 21:04:41 +02:00
Kyle Carberry	5eebd3829f	fix: use cursor-based query for chat stream notifications (#22510 ) ## Problem The pubsub notification handler in `chatd` re-fetched all messages from the DB on every new message notification, then filtered in Go with `msg.ID > lastMessageID`. This grows linearly with conversation length — every new message triggers a full table scan of that chat's history. The `AfterMessageID` field in the pubsub notification payload was clearly designed for cursor-based fetching, but no matching query existed. ## Fix - Add `GetChatMessagesByChatIDAfter` SQL query with `WHERE id > @after_id`, so the database does the filtering instead of Go. - Use it in the pubsub notification handler in `chatd.go`, passing `lastMessageID` as the cursor. - Implement the dbauthz wrapper (was a `panic("not implemented")` stub from codegen) with the same read-check-on-parent-chat pattern as adjacent methods. - Add dbauthz test coverage for the new method. Not changed: The initial snapshot in `Subscribe()` still loads all messages — that's correct, since a newly-connecting client needs the full conversation state. The waste was only in the ongoing notification path.	2026-03-02 16:31:04 -05:00
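The cursor idea above can be illustrated with an in-memory stand-in for the `WHERE id > @after_id` query. Everything here is hypothetical (`message`, `messagesAfter`, a slice sorted by ID standing in for an indexed table); the binary search plays the role of the index seek so that work is proportional to the new messages, not the whole history.

```go
package main

import (
	"fmt"
	"sort"
)

// message is a hypothetical stand-in for a chat message row.
type message struct {
	ID   int64
	Text string
}

// messagesAfter returns the messages with ID > afterID. It assumes msgs is
// sorted by ascending ID, mirroring an index on the id column.
func messagesAfter(msgs []message, afterID int64) []message {
	i := sort.Search(len(msgs), func(i int) bool { return msgs[i].ID > afterID })
	return msgs[i:]
}

func main() {
	history := []message{{1, "hi"}, {2, "hello"}, {3, "build it"}, {4, "done"}}

	// The subscriber keeps a cursor and advances it after each notification.
	var lastMessageID int64 = 2
	for _, m := range messagesAfter(history, lastMessageID) {
		fmt.Println(m.ID, m.Text)
		lastMessageID = m.ID
	}
	fmt.Println("cursor now at", lastMessageID)
}
```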
Cian Johnston	a62f2fbfc4	feat(rbac): add AsChatd subject to replace AsSystemRestricted in chatd (#22487 ) Add a new SubjectTypeChatd RBAC subject with minimal permissions: - Chat: CRUD - Workspace: Read - DeploymentConfig: Read Replace all 10 AsSystemRestricted calls in coderd/chatd/chatd.go: - Line 890: Use AsChatd instead of AsSystemRestricted for the background processor context. - Subscribe() path (5 calls): Remove system escalation entirely; these run under the authenticated user's context from the HTTP handler. - processChat path (4 calls): Remove redundant per-call wraps; the context already carries AsChatd from the processor start. Add TestAsChatd verifying allowed and denied actions. Created using Mux (Opus 4.6)	2026-03-02 15:57:04 +00:00
Kyle Carberry	12083441e0	feat(chats): archive chats instead of hard-deleting them (#22406 ) ## Summary The UI has always labeled the action as "Archive agent" but the backend was performing a hard `DELETE`, permanently destroying chats and all their messages. This change replaces the hard delete with a soft archive, consistent with the pattern used by template versions. ## Changes ### Database - Migration 000423: Add `archived boolean DEFAULT false NOT NULL` column to `chats` table - Replace `DeleteChatByID` query with `ArchiveChatByID` (`UPDATE SET archived = true`) - Add `UnarchiveChatByID` query (`UPDATE SET archived = false`) - Filter archived chats from `GetChatsByOwnerID` (`WHERE archived = false`) ### API - Remove `DELETE /api/experimental/chats/{chat}` - Add `POST /api/experimental/chats/{chat}/archive` — archives a chat and all its descendants - Add `POST /api/experimental/chats/{chat}/unarchive` — unarchives a single chat (API only, no UI yet) ### Backend - `archiveChatTree()` recursively archives child chats (replaces `deleteChatTree()` which hard-deleted) - Chat daemon's `ArchiveChat()` archives the full chat tree in a transaction - Authorization uses `ActionUpdate` instead of `ActionDelete` ### SDK - Replace `DeleteChat()` with `ArchiveChat()` and `UnarchiveChat()` - Add `Archived` field to `Chat` struct ### Frontend - `archiveChat` API call uses `POST .../archive` instead of `DELETE` - No UI changes — the "Archive agent" button now actually archives instead of deleting ## Design Decision This follows the template version archive pattern (Pattern B in the codebase): - `archived boolean` column (not `deleted boolean`) - Dedicated `POST .../archive` and `POST .../unarchive` routes (not repurposing `DELETE`) - Reversible — users can unarchive via the API (UI for this will come later)	2026-02-27 16:46:19 -05:00
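Archiving a chat together with all of its descendants, as described above, amounts to a tree walk from the root. The following is a minimal in-memory sketch under assumed types; the real `archiveChatTree()` runs against the database inside a transaction, and the `chat` struct, map-based store, and iterative traversal here are illustrative choices only.

```go
package main

import "fmt"

// chat is a hypothetical stand-in; ParentID is empty for root chats.
type chat struct {
	ID       string
	ParentID string
	Archived bool
}

// archiveChatTree marks rootID and every descendant as archived and returns
// how many chats it touched. The visited set guards against parent cycles.
func archiveChatTree(chats map[string]*chat, rootID string) int {
	children := make(map[string][]string)
	for id, c := range chats {
		children[c.ParentID] = append(children[c.ParentID], id)
	}

	visited := make(map[string]bool)
	stack := []string{rootID}
	count := 0
	for len(stack) > 0 {
		id := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		c, ok := chats[id]
		if !ok || visited[id] {
			continue
		}
		visited[id] = true
		c.Archived = true
		count++
		stack = append(stack, children[id]...)
	}
	return count
}

func main() {
	chats := map[string]*chat{
		"root":  {ID: "root"},
		"child": {ID: "child", ParentID: "root"},
		"leaf":  {ID: "leaf", ParentID: "child"},
		"other": {ID: "other"},
	}
	fmt.Println("archived:", archiveChatTree(chats, "root"))
	fmt.Println("other archived:", chats["other"].Archived)
}
```

Unarchiving, by contrast, is described above as affecting a single chat, so it would not need the traversal.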
Kyle Carberry	edee917d88	feat: add experimental agents support (#22290 ) feat: add AI chat system with agent tools and chat UI Introduce the chatd subsystem and Agents UI for AI-powered chat within Coder workspaces. - Add chatd package with chat loop, message compaction, prompt management, and LLM provider integration (OpenAI, Anthropic) - Add agent tools: create workspace, list/read templates, read/write/ edit files, execute commands - Add chat API endpoints with streaming, message editing, and durable reconnection - Add database schema and migrations for chats, chat messages, chat providers, and chat model configs - Add RBAC policies and dbauthz enforcement for chat resources - Add Agents UI pages with conversation timeline, queued messages list, diff viewer, and model configuration panel - Add comprehensive test coverage including coderd integration tests, chatd unit tests, and Storybook stories - Gate feature behind experiments flag --------- Co-authored-by: Cian Johnston <cian@coder.com> Co-authored-by: Danielle Maywood <danielle@themaywoods.com> Co-authored-by: Jeremy Ruppel <jeremy@coder.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-27 16:50:56 +00:00
Jake Howell	d2787df442	feat: add AI Bridge request logs model filter (#22230) This pull request implements simple filtering so that request logs can be narrowed to the model the user actually used when logs were sent to AI Bridge. - Add `GET /aibridge/models` API endpoint that returns distinct model names from AI Bridge interceptions, with pagination and search support - New `ListAIBridgeModels` SQL query using case-sensitive prefix matching (`LIKE model || '%'`) to allow B-tree index usage - Hand-written `ListAuthorizedAIBridgeModels` in `modelqueries.go` for RBAC authorization filter injection - `AIBridgeModels` search query parser in `searchquery/search.go` (defaults bare terms to the `model` field) - dbauthz wrappers, dbmetrics, and dbmock implementations for the new query	2026-02-26 02:40:45 +11:00
Cian Johnston	6336fee3a7	feat: add telemetry for task lifecycle events (#21922 ) Relates to https://github.com/coder/internal/issues/1259 Adds new database queries and telemetry collection functions to gather task lifecycle events (pause/resume cycles, idle time) for analytics. Task events track pause/resume activity, idle duration before pausing, paused duration, and time from resume to first app status, filtered to recent activity based on the telemetry snapshot interval. 🤖 Created with Mux (Opus 4.6).	2026-02-24 17:04:42 +00:00
Kacper Sawicki	1e274063d4	feat(coderd): filter expired API tokens server-side (#22263 ) ## Summary Moves expired token filtering from client-side to server-side by adding an `include_expired` parameter to the `GetAPIKeysByLoginType` and `GetAPIKeysByUserID` database queries. This is more efficient for large deployments with many expired/short-lived tokens. ## Changes - Add `include_expired` parameter to SQL queries using `OR` short-circuit - Add `include_expired` query parameter to `GET /users/{user}/keys/tokens` - Add `IncludeExpired` field to `codersdk.TokensFilter` - Remove client-side filtering from CLI `tokens list` command - Add `TestTokensFilterExpired` test Fixes coder/internal#1357	2026-02-24 15:27:03 +00:00
Jon Ayers	0a7a3da178	fix: exclude provisioner_state from workspace_build_with_user view (#22159 ) The provisioner state for a workspace build was being loaded for every long-lived agent rpc connection. Since this state can be anywhere from kilobytes to megabytes this can gradually cause the `coderd` memory footprint to grow over time. It's also a lot of unnecessary allocations for every query that fetches a workspace build since only a few callers ever actually reference the provisioner state. This PR removes it from the returned workspace build and adds a query to fetch the provisioner state explicitly.	2026-02-23 22:46:17 -06:00
Jon Ayers	6035e45cb8	feat: add e2e workspace build duration metric (#21739 ) Adds coderd_template_workspace_build_duration_seconds histogram that tracks the full duration from workspace build creation to agent ready. This captures the complete user-perceived build time including provisioning and agent startup. The metric is emitted when the agent reports ready/error/timeout via the lifecycle API, ensuring each build is counted exactly once per replica.	2026-02-06 16:26:02 -06:00
Zach	a31e476623	fix: make boundary usage telemetry collection atomic (#21907 ) Previously, UpsertBoundaryUsageStats (INSERT...ON CONFLICT DO UPDATE) and GetAndResetBoundaryUsageSummary (DELETE...RETURNING) could race during telemetry period cutover. Without serialization, an upsert concurrent with the delete could lose data (deleted right after being written) or commit after the delete (miscounted in the next period). Both operations now acquire LockIDBoundaryUsageStats within a transaction to ensure a clean cutover.	2026-02-06 09:52:17 -07:00
Mathias Fredriksson	c60c373bc9	fix(coderd): clean up task snapshots on task deletion (#21949 ) Task snapshots were orphaned when tasks were soft-deleted. The `task_snapshots` table has an `ON DELETE CASCADE` foreign key, but that only fires on hard deletes. Modified DeleteTask to use a CTE that atomically soft-deletes the task and removes its snapshot in a single transaction. The query now returns just the task UUID instead of the full row. Closes coder/internal#1283	2026-02-06 11:55:33 +02:00
Danielle Maywood	af0e171595	feat(coderd/agentapi): support terraform-defined subagent ids (#21837 ) Update `coderd/agentapi` to handle pre-created sub agents	2026-02-04 15:33:48 +00:00
Zach	7dfa33b410	feat: add boundary usage tracking database schema and tracker skeleton (#21670 ) feat: add boundary usage telemetry database schema and RBAC Adds the foundation for tracking boundary usage telemetry across Coder replicas. This includes: - Database schema: `boundary_usage_stats` table with per-replica stats (unique workspaces, unique users, allowed/denied request counts) - Database queries: upsert stats, get aggregated summary, reset stats, delete by replica ID - RBAC: `boundary_usage` resource type with read/update/delete actions, accessible only via system `BoundaryUsageTracker` subject (not regular user roles) - Tracker skeleton + docs: stub implementation in `coderd/boundaryusage/` The tracker accumulates stats in memory and periodically flushes to the database. Stats are aggregated across replicas for telemetry reporting, then reset when a new reporting period begins. The tracker implementation and plumbing will be done in a subsequent commit/PR. --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-27 13:29:21 -07:00
George K	c352a51b22	fix(coderd): authorize workspace start/stop/delete by transition action (#21691 ) Use transition-specific actions when authorizing workspace build parameter inserts in the database layer so start/stop/delete do not require workspace.update. Related to: https://github.com/coder/internal/issues/1299	2026-01-27 09:08:12 -08:00
Mathias Fredriksson	25d7f27cdb	feat(coderd): add task log snapshot storage endpoint (#21644 ) This change adds a POST /workspaceagents/me/tasks/{task}/log-snapshot endpoint for agents to upload task conversation history during workspace shutdown. This allows users to view task logs even when the workspace is stopped. The endpoint accepts agentapi format payloads (typically last 10 messages, max 64KB), wraps them in a format envelope, and upserts to the task_snapshots table. Uses agent token auth and validates the task belongs to the agent's workspace. Closes coder/internal#1253	2026-01-27 11:09:24 +02:00
Spike Curtis	f47f89d997	chore: remove unused tailnet v1 tables and queries (#21646 ) Removes the legacy tailnet v1 API tables (`tailnet_clients`, `tailnet_agents`, `tailnet_client_subscriptions`) and their associated queries, triggers, and functions. These were superseded by the v2 tables (`tailnet_peers`, `tailnet_tunnels`) in migration 000168, and the v1 API code was removed in commit `d6154c4310`, but the database artifacts were never cleaned up. Changes: - New migration `000410_remove_tailnet_v1_tables` to drop the unused tables - Removed 11 unused queries from `tailnet.sql` - Removed associated manual wrapper methods in `dbauthz` and `dbmetrics` - ~930 lines deleted across 11 files	2026-01-26 14:27:17 +04:00
Callum Styan	e195856c43	perf: reduce pg_notify call volume by batching together agent metadata updates (#21330 ) --------- Signed-off-by: Callum Styan <callumstyan@gmail.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-01-22 22:47:49 -08:00
Mathias Fredriksson	97e8a5b093	fix(coderd): allow agent auth during workspace shutdown (#21538 ) Agents were losing authentication during workspace shutdown, causing shutdown scripts to fail. The auth query required agents to belong to the latest build, but during shutdown a `stop` build becomes latest while the `start` build's agents are still running. Modified the auth query to allow `start` build agents to authenticate temporarily during `stop` execution. The query allows auth when: - Agent's `start` build job succeeded - Latest build is `stop` with `pending`/`running` job status - Builds are adjacent (`stop` is `build_number + 1`) - Template versions match Auth closes once `stop` completes. Renamed `GetWorkspaceAgentAndLatestBuildByAuthToken` to `GetAuthenticatedWorkspaceAgentAndBuildByAuthToken` since it returns the agent's build (not always latest) during shutdown. Closes coder/internal#1249 Fixes #19467	2026-01-21 13:18:43 +00:00
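The four shutdown conditions listed above can be expressed as a single predicate. This is a sketch of the rule as the commit describes it, not the SQL in the auth query; the `build` type, its field names, and `agentMayAuthenticate` are hypothetical.

```go
package main

import "fmt"

// build is a hypothetical, trimmed-down view of a workspace build.
type build struct {
	Number            int
	Transition        string // "start" or "stop"
	JobStatus         string // "pending", "running", "succeeded", ...
	TemplateVersionID string
}

// agentMayAuthenticate reports whether an agent from agentBuild may
// authenticate given the workspace's latest build.
func agentMayAuthenticate(agentBuild, latest build) bool {
	// Normal path: the agent belongs to the latest build.
	if agentBuild.Number == latest.Number {
		return true
	}
	// Shutdown path: a start build's agent stays valid while the
	// immediately following stop build is still pending or running.
	stopInProgress := latest.Transition == "stop" &&
		(latest.JobStatus == "pending" || latest.JobStatus == "running")
	return agentBuild.Transition == "start" &&
		agentBuild.JobStatus == "succeeded" &&
		stopInProgress &&
		latest.Number == agentBuild.Number+1 &&
		latest.TemplateVersionID == agentBuild.TemplateVersionID
}

func main() {
	start := build{Number: 7, Transition: "start", JobStatus: "succeeded", TemplateVersionID: "v1"}
	stopping := build{Number: 8, Transition: "stop", JobStatus: "running", TemplateVersionID: "v1"}
	stopped := stopping
	stopped.JobStatus = "succeeded"

	fmt.Println("during stop:", agentMayAuthenticate(start, stopping))
	fmt.Println("after stop:", agentMayAuthenticate(start, stopped))
}
```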
Cian Johnston	08343a7a9f	perf: reduce number of queries made by /api/v2/workspaceagents/{id} (#21522 ) Relates to https://github.com/coder/internal/issues/1214 The `ExtractWorkspaceAgentParam` middleware ends up making 4 database queries to follow the chain of `WorkspaceAgent` -> `WorkspaceResource` -> `ProvisionerJob` -> `WorkspaceBuild` -- but then dropping all that hard work on the floor. The `api.workspaceAgent` handler that references this middleware then has to do all of that work again, plus one more query to get the related `User` so we can get the username. This pattern is also mirrored in `getDatabaseTerminal` but without the middleware. This PR: * Adds a new query `GetWorkspaceAgentAndWorkspaceByID` to fetch all this information at once to avoid the multiple round-trips, * Updates the existing usage of `GetWorkspaceAgentByID` to this new query instead, * Updates `ExtractWorkspaceAgentParam` to also store the workspace in the request context Dalibo: [0.63ms](https://explain.dalibo.com/plan/40bb597f3539gc6c)	2026-01-19 12:36:33 +00:00
George K	0712faef4f	feat(enterprise): implement organization "disable workspace sharing" option (#21376 ) Adds a per-organization setting to disable workspace sharing. When enabled, all existing workspace ACLs in the organization are cleared and the workspace ACL mutation API endpoints return `403 Forbidden`. This complements the existing site-wide `--disable-workspace-sharing` flag by providing more granular control at the organization level. Closes https://github.com/coder/internal/issues/1073 (part 2) --------- Co-authored-by: Steven Masley <Emyrk@users.noreply.github.com>	2026-01-14 09:47:50 -08:00
George K	cc2efe9e1f	feat(coderd/rbac): make organization-member a per-org system custom role (#21359 ) Migrated the built-in organization-member role to DB storage so it can be customized per org. Closes https://github.com/coder/internal/issues/1073 (part 1)	2026-01-12 18:19:19 -08:00
Spike Curtis	49b34a716a	fix: fix slog to always use array of Fields (#21426) Upgrades to slog v3, which includes a small but backward-incompatible API change to the acceptable call arguments when logging. This change allows us to verify via compile-time type checking that arguments are correct and won't cause a panic, as was possible in slog v1, which this replaces (v2 was tagged but never used in coder/coder). It also updates the dependencies that use slog. I've left the `aibridge` dependency as a commit SHA, under the assumption that the team there (cc @pawbana @dannykopping) will tag and update the dependency soon and on their own schedule. For the other dependencies, I pushed new tags.	2026-01-08 10:29:41 +04:00
Jake Howell	ea00e72063	feat: add rbac specificity for `dbpurge` (#21088) Related to [`internal#1139`](https://github.com/coder/internal/issues/1139) Continuation of #21074 This implements some RBAC role specificity for `dbpurge`, ensuring that we follow the least-privilege model for removing data from the database. It is specified as follows.
```go
Site: rbac.Permissions(map[string][]policy.Action{
	// DeleteOldWorkspaceAgentLogs
	// DeleteOldWorkspaceAgentStats
	// DeleteOldProvisionerDaemons
	// DeleteOldTelemetryLocks
	// DeleteOldAuditLogConnectionEvents
	// DeleteOldConnectionLogs
	rbac.ResourceSystem.Type: {policy.ActionDelete},
	// DeleteOldNotificationMessages
	rbac.ResourceNotificationMessage.Type: {policy.ActionDelete},
	// ExpirePrebuildsAPIKeys
	// DeleteExpiredAPIKeys
	rbac.ResourceApiKey.Type: {policy.ActionDelete},
	// DeleteOldAIBridgeRecords
	rbac.ResourceAibridgeInterception.Type: {policy.ActionDelete},
}),
```
\| Position \| Pull-request \| \| -------- \| ------------ \| \| \| [feat: add prometheus observability metrics for `dbpurge`](https://github.com/coder/coder/pull/21074) \| \| ✅ \| [feat: add rbac specificity for `dbpurge`](https://github.com/coder/coder/pull/21088) \|	2025-12-20 01:02:39 +11:00
Callum Styan	8ed1c1d372	perf: reduce calls to GetWorkspaceByAgentID in GetWorkspaceAgentByID (#21046 ) This PR piggy backs on the agent API cached workspace added in an earlier PR to provide a fast path for avoiding `GetWorkspaceByAgentID` calls in dbauthz's `GetWorkspaceAgentByID`. This query is not the most expensive, but has a significant call volume at ~16 million calls per week. Signed-off-by: Callum Styan <callumstyan@gmail.com>	2025-12-10 14:03:24 -08:00
Callum Styan	27c3ec072e	perf: support fastpath in dbauthz GetLatestWorkspaceBuildByWorkspaceID (#21047 ) This PR piggy backs on the agent API cached workspace added in earlier PRs to provide a fast path for avoiding `GetWorkspaceByID` calls in `GetLatestWorkspaceBuildByWorkspaceID` via injection of the workspaces RBAC object into the context. We can do this from the `agentConnectionMonitor` easily since we already cache the workspace. --------- Signed-off-by: Callum Styan <callumstyan@gmail.com>	2025-12-09 15:53:52 -08:00
Mathias Fredriksson	ad93262d07	fix(coderd/database/dbpurge): allow disabling AI Bridge retention with 0 (#21062 ) Previously setting AI Bridge retention to 0 would cause records to be deleted immediately since we didn't check for the zero value before calculating the deletion threshold. This adds a check for aibridgeRetention > 0 to skip deletion when retention is disabled, matching the pattern used for other retention settings (connection logs, audit logs, etc.). Also fixes the return type of DeleteOldAIBridgeRecords from int32 to int64 since COUNT(*) returns bigint in PostgreSQL. Refs #21055	2025-12-03 09:37:18 +00:00
Mathias Fredriksson	ff46917e62	feat: add retention config for `workspace_agent_logs` (#21039 ) Replace hardcoded 7-day retention for workspace agent logs with configurable retention from deployment settings. Defaults to 7d to preserve existing behavior. Depends on #21038 Updates #20743	2025-12-02 16:01:33 +00:00
Mathias Fredriksson	c85d79bcdb	feat(coderd/database/dbpurge): add retention for audit logs (#21025 ) Add configurable retention policy for audit logs. The DeleteOldAuditLogs query excludes deprecated connection events (connect, disconnect, open, close) which are handled separately by DeleteOldAuditLogConnectionEvents. Disabled (0) by default. Depends on #21021 Updates #20743	2025-12-02 16:50:09 +02:00
Mathias Fredriksson	9ebcca5b0d	feat(coderd/database/dbpurge): add retention for connection logs (#21022 ) Add `DeleteOldConnectionLogs` query and integrate it into the `dbpurge` routine. Retention is controlled by `--retention-connection-logs` flag. Disabled (0) by default. Depends on #21021 Updates #20743	2025-12-02 14:17:52 +00:00
Susana Ferreira	f8d9a8046f	feat: add notification warning alert to Tasks page (#20900 ) ## Problem Users may not realize that task notifications are disabled by default. To improve awareness, we show a warning alert on the Tasks page when all task notifications are disabled. Alert visibility logic: - Shows when all task notification templates (Task Working, Task Idle, Task Completed, Task Failed) are disabled - Can be dismissed by the user, which stores the dismissal in the user preferences API - If the user later enables any task notification in Account Settings, the dismissal state is cleared so the alert will show again if they disable all notifications in the future ## Changes - Added a warning alert to the Tasks page when all task notifications are disabled - Introduced new `/users/{user}/preferences` endpoint to manage user preferences (stored in `user_configs` table) - Alert is dismissible and stores the dismissal state via the new user preferences API endpoint - Enabling any task notification in Account Settings clears the dismissal state via the preferences API - Added comprehensive Storybook stories for both TasksPage and NotificationsPage to test all alert visibility states and interactions Closes: https://github.com/coder/internal/issues/1089	2025-11-28 16:50:59 +00:00
Callum Styan	b0e8384b82	perf: reduce DB calls to `GetWorkspaceByAgentID` via caching workspace info (#20662 ) --------- Signed-off-by: Callum Styan <callumstyan@gmail.com>	2025-11-25 14:45:05 -08:00
Mathias Fredriksson	37fc6646ad	perf(coderd/database): limit `GetLatestWorkspaceAppStatusByAppID` to 1 row (#20917 ) ## Description This PR fixes an issue where `GetLatestWorkspaceAppStatusesByAppID` returned an unbounded number of rows for a given app ID, which could cause performance issues for noisy or long-running AI tasks. ## Impact This change reduces database query overhead for workspace app status updates, particularly for busy AI tasks that update their status frequently. Previously, fetching the latest status would return all historical statuses; now it returns only the most recent one. Fixes #20862 --- 🤖 This change was written by Claude Sonnet 4.5 Thinking using [mux](https://github.com/coder/mux) and reviewed by a human 🏄🏻‍♂️	2025-11-25 16:56:42 +02:00
Danielle Maywood	82f525baf3	feat(coderd): add task prompt modification endpoint (#20811 ) This PR adds the backend implementation for modifying task prompts. Part of https://github.com/coder/internal/issues/1084 ## Changes - New `UpdateTaskPrompt` database query to update task prompts - New PATCH `/api/v2/tasks/{task}/prompt` endpoint ## Notes This is part 1 of a 2-part PR stack. The frontend UI will be added in a follow-up PR based on this branch (https://github.com/coder/coder/pull/20812). --- 🤖 PR was written by Claude Sonnet 4.5 Thinking using [Coder Mux](https://github.com/coder/cmux) and reviewed by a human 👩	2025-11-25 11:13:32 +00:00
Danielle Maywood	c12303f0b2	fix: allow agents to be created on dormant workspaces (#20909 ) Closes https://github.com/coder/coder/issues/20711 We now allow agents to be created on dormant workspaces. I've run the test with and without the change. I've confirmed that - without the fix - it triggers the "rbac: unauthorized" error.	2025-11-25 06:24:33 +00:00
Steven Masley	cefe07d074	feat: purge expired api keys in dbpurge (#20863 ) closes https://github.com/coder/coder/issues/19889 This is in response to a migration in v2.27 that takes very long on deployments with large `api_key` tables.	2025-11-24 10:24:32 -06:00
Atif Ali	636408906f	chore(docs): standardize "AIBridge" to "AI Bridge" in documentation (#20831 )	2025-11-24 18:09:04 +05:00
Danny Kopping	5a7d4f69f6	feat: add configurable retention for aibridge (#20828 ) Closes https://github.com/coder/internal/issues/1134 --------- Signed-off-by: Danny Kopping <danny@coder.com>	2025-11-21 11:35:36 +02:00
Marcin Tojek	d004710a74	feat: add prebuild invalidation via last_invalidated_at timestamp (#20582 ) Updates #17917	2025-11-20 17:12:25 +01:00


353 Commits