coder

mirror of https://github.com/coder/coder.git synced 2026-06-03 13:08:25 +00:00

Author	SHA1	Message	Date
Mathias Fredriksson	96695edfed	fix(coderd/database): correct task pending status logic (#21886 ) Previously, tasks with pending provisioner jobs (not yet picked up) were incorrectly reported as "initializing". Refs #21887	2026-02-05 14:08:03 +02:00
Jon Ayers	22ece10a4a	feat: add healthy filter for workspace queries (#21743 ) Adds support for filtering workspaces by health status using healthy:true or healthy:false in the search query. This is done by changing `has-agent` to accept a list of statuses and aliasing `healthy:true` to `has-agent:connected` and `healthy:false` to `has-agent:timeout,disconnected`. Fixes #21623	2026-02-04 20:48:27 -06:00
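The aliasing described in #21743 can be pictured as a small rewrite step applied to a search term before the `has-agent` filter is evaluated. This is only a sketch: `expandHealthyAlias` is a hypothetical name rather than a function from the coder codebase, and the real query parser may treat case and whitespace differently.

```go
package main

import (
	"fmt"
	"strings"
)

// expandHealthyAlias is a hypothetical helper that rewrites the healthy:*
// aliases into the has-agent status lists named in the commit message.
// Any other term is returned unchanged.
func expandHealthyAlias(term string) string {
	switch strings.ToLower(strings.TrimSpace(term)) {
	case "healthy:true":
		return "has-agent:connected"
	case "healthy:false":
		return "has-agent:timeout,disconnected"
	default:
		return term
	}
}

func main() {
	for _, q := range []string{"healthy:true", "healthy:false", "owner:me"} {
		fmt.Printf("%s -> %s\n", q, expandHealthyAlias(q))
	}
}
```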
Danielle Maywood	af0e171595	feat(coderd/agentapi): support terraform-defined subagent ids (#21837 ) Update `coderd/agentapi` to handle pre-created sub agents	2026-02-04 15:33:48 +00:00
Cian Johnston	91be688e39	chore(coderd/database): remove deprecated db2sdk.List(Lazy)? methods (#21902 ) Removes deprecated methods db2sdk.List and db2sdk.ListLazy.	2026-02-03 17:52:07 +00:00
Cian Johnston	353ebd9664	feat: add link for viewing raw build logs in workspace and template build jobs (#21727 ) * Adds support for parameter `format=text` in the following API routes: * `/api/v2/workspaceagents/:id/logs` * `/api/v2/workspacebuilds/:id/logs` * `/api/v2/templateversions/:id/logs` * `/api/v2/templateversions/:id/dry-run/:id/logs` * Adds links to view raw logs on the following pages: * Workspace build page * Template editor page * Template version page * Refactors existing log formatting in `cli/logs.go` to live in `codersdk`. 🤖 Generated with Claude Opus 4.5, reviewed by me. --------- Co-authored-by: Claude <noreply@anthropic.com>	2026-02-03 09:45:23 +00:00
Mathias Fredriksson	f75cbab6ce	fix(coderd/database): prevent AcquireProvisionerJob from grabbing canceled jobs (#21852 ) The AcquireProvisionerJob query only checked started_at IS NULL, allowing it to acquire jobs that were canceled while pending (which have completed_at set but started_at still NULL). Added completed_at IS NULL check to the query to prevent this. Also fixed JobCompleteBuilder.Do() in dbfake to set started_at when completing jobs to match production behavior. Fixes coder/internal#1323	2026-02-03 10:42:17 +02:00
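The corrected acquisition condition from #21852 can be sketched as a predicate over the two timestamps. The real check lives in the SQL query; the `provisionerJob` type, the `acquirable` function and the `canceledAt` value below are hypothetical and exist only for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// provisionerJob is a hypothetical, trimmed-down stand-in for the real row.
// Only the two timestamps discussed in the commit are modeled.
type provisionerJob struct {
	StartedAt   *time.Time // nil until a provisioner picks the job up
	CompletedAt *time.Time // set when the job finishes or is canceled
}

// canceledAt is an arbitrary example timestamp.
var canceledAt = time.Date(2026, 2, 3, 8, 0, 0, 0, time.UTC)

// acquirable mirrors the fixed condition: a job may be acquired only if it
// has neither started nor completed. Checking StartedAt alone would also
// accept jobs that were canceled while still pending.
func acquirable(j provisionerJob) bool {
	return j.StartedAt == nil && j.CompletedAt == nil
}

func main() {
	pending := provisionerJob{}
	canceledWhilePending := provisionerJob{CompletedAt: &canceledAt}

	fmt.Println(acquirable(pending))              // true
	fmt.Println(acquirable(canceledWhilePending)) // false
}
```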
Zach	90aeea5649	fix: handle boundary usage across snapshots and flush races (#21805 ) Previously there were two issues that could cause incorrect boundary usage telemetry data. 1. Bad handling across snapshot intervals: After telemetry snapshot deleted the DB row, the next flush would INSERT the stale cumulative data (which included already-reported usage). This would then be overwritten by subsequent UPDATE flushes, causing the delta between the last snapshot and the reset to be lost (under-reporting usage). Additionally, if there was no new usage after the reset, the tracker would carry over all usage from the previous period into the next period (over-reporting usage). 2. Missed usage from a race condition: Track() calls between the first mutex unlock and second mutex lock in FlushToDB() were lost. The data wasn't included in the current flush (already snapshotted) and was wiped by the subsequent reset. This is likely low impact to overall usage numbers in the real world. Fix by tracking unique workspace/user deltas separately from cumulative values and always tracking delta allowed/denied requests. Deltas are used for INSERT (fresh start after reset), cumulative for UPDATE (accurate unique counts within a period). All counters reset atomically before the DB operation so Track() calls during the operation are preserved for the next flush.	2026-02-02 09:11:54 -07:00
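The "reset atomically before the DB operation" part of #21805 can be illustrated with a counter that is read and zeroed under a single lock. This is a deliberately reduced sketch: `usageTracker` and `takeDelta` are hypothetical names, and the unique workspace/user tracking and the INSERT versus UPDATE distinction from the commit are left out.

```go
package main

import (
	"fmt"
	"sync"
)

// usageTracker is a hypothetical, heavily simplified tracker that only
// counts allowed and denied requests.
type usageTracker struct {
	mu      sync.Mutex
	allowed int64
	denied  int64
}

// Track records one request.
func (t *usageTracker) Track(allowed bool) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if allowed {
		t.allowed++
	} else {
		t.denied++
	}
}

// takeDelta returns the counts accumulated since the previous call and
// zeroes them while still holding the lock. Because the read and the reset
// happen in one critical section, a Track call can land either before the
// snapshot or after the reset, never in between.
func (t *usageTracker) takeDelta() (allowed, denied int64) {
	t.mu.Lock()
	defer t.mu.Unlock()
	allowed, denied = t.allowed, t.denied
	t.allowed, t.denied = 0, 0
	return allowed, denied
}

func main() {
	t := &usageTracker{}
	t.Track(true)
	t.Track(false)

	a, d := t.takeDelta() // a real tracker would write these to the database now
	t.Track(true)         // arrives during the write and is kept for the next flush
	fmt.Println(a, d)

	a, d = t.takeDelta()
	fmt.Println(a, d)
}
```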
Jake Howell	052bd114a4	fix: resolve missing users in `<UserCombobox />` (#21822 ) Closes #21044 This pull request addresses an issue where we would attempt to filter the `<UserCombobox />` by the user's username or email, not their name (which the rendered options would show). To highlight this I created three different users, each with a username that did not contain their `email` or `name`, and attempted to filter. Searching for `John` wouldn't actually show the user because his username was `x`; in fact, even where a subset of users was returned from the backend for having `john` in the `email`, it would have been filtered out by the frontend for not being in the `name` field. \| Name \| Username \| \| --- \| --- \| \| `Jake` \| `z` \| \| `Jeff` \| `y` \| \| `John` \| `x` \| \| Previously \| Now \| \| --- \| --- \| \| <img width="560" height="547" alt="OLD_USER_COMBOBOX" src="https://github.com/user-attachments/assets/a0567264-0034-42ac-aba0-95b05c4f92dd" /> \| <img width="580" height="548" alt="NEW_USER_COMBOBOX" src="https://github.com/user-attachments/assets/1aa0c942-d340-4b1c-8dde-b97879525bfb" /> \|	2026-02-03 00:13:41 +11:00
Danielle Maywood	37aecda165	feat(coderd/provisionerdserver): insert sub agent resource (#21699 ) Update provisionerdserver to handle the changes introduced to provisionerd in https://github.com/coder/coder/pull/21602 We now create a relationship between `workspace_agent_devcontainers` and `workspace_agents` with the newly created `subagent_id`.	2026-01-30 17:19:19 +00:00
Steven Masley	dfbd541cee	chore: move List util out of db2sdk to avoid circular imports (#21733 )	2026-01-28 13:07:53 -06:00
Spike Curtis	7090a1e205	chore: renumber duplicate migration 000411 (#21720 ) Fixes recent duplicate DB migration in #21607	2026-01-28 08:01:58 +04:00
Spike Curtis	f358a6db11	chore: convert tailnet tables to UNLOGGED for improved write performance (#21607 ) This migration converts all tailnet coordination tables to UNLOGGED: - `tailnet_coordinators` - `tailnet_peers` - `tailnet_tunnels` UNLOGGED tables skip Write-Ahead Log (WAL) writes, significantly improving performance for high-frequency updates like coordinator heartbeats and peer state changes. The trade-off is that UNLOGGED tables are truncated on crash recovery and are not replicated to standby servers. This is acceptable for these tables because the data is ephemeral: 1. Coordinators re-register on startup 2. Peers re-establish connections on reconnect 3. Tunnels are re-created based on current peer state Migration notes: - Child tables must be converted before the parent table because LOGGED child tables cannot reference UNLOGGED parent tables (but the reverse is allowed) - The down migration reverses the order: parent first, then children Fixes https://github.com/coder/coder/issues/21333	2026-01-28 07:12:32 +04:00
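The ordering rule in the migration notes of #21607 can be sketched by generating the statements in both directions. Which tables are children is an assumption here (`tailnet_peers` and `tailnet_tunnels` referencing `tailnet_coordinators`); the commit message lists the three tables but does not spell out the relationship, and the actual migration files may be written differently.

```go
package main

import "fmt"

// Assumption: peers and tunnels hold foreign keys to coordinators.
var (
	parentTable = "tailnet_coordinators"
	childTables = []string{"tailnet_peers", "tailnet_tunnels"}
)

// upStatements converts children first, because a LOGGED child table cannot
// reference an UNLOGGED parent table.
func upStatements() []string {
	stmts := make([]string, 0, len(childTables)+1)
	for _, c := range childTables {
		stmts = append(stmts, fmt.Sprintf("ALTER TABLE %s SET UNLOGGED;", c))
	}
	return append(stmts, fmt.Sprintf("ALTER TABLE %s SET UNLOGGED;", parentTable))
}

// downStatements reverses the order: the parent becomes LOGGED first, then
// the children that reference it.
func downStatements() []string {
	stmts := []string{fmt.Sprintf("ALTER TABLE %s SET LOGGED;", parentTable)}
	for _, c := range childTables {
		stmts = append(stmts, fmt.Sprintf("ALTER TABLE %s SET LOGGED;", c))
	}
	return stmts
}

func main() {
	fmt.Println("-- up")
	for _, s := range upStatements() {
		fmt.Println(s)
	}
	fmt.Println("-- down")
	for _, s := range downStatements() {
		fmt.Println(s)
	}
}
```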
Zach	7dfa33b410	feat: add boundary usage tracking database schema and tracker skeleton (#21670 ) feat: add boundary usage telemetry database schema and RBAC Adds the foundation for tracking boundary usage telemetry across Coder replicas. This includes: - Database schema: `boundary_usage_stats` table with per-replica stats (unique workspaces, unique users, allowed/denied request counts) - Database queries: upsert stats, get aggregated summary, reset stats, delete by replica ID - RBAC: `boundary_usage` resource type with read/update/delete actions, accessible only via system `BoundaryUsageTracker` subject (not regular user roles) - Tracker skeleton + docs: stub implementation in `coderd/boundaryusage/` The tracker accumulates stats in memory and periodically flushes to the database. Stats are aggregated across replicas for telemetry reporting, then reset when a new reporting period begins. The tracker implementation and plumbing will be done in a subsequent commit/PR. --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-27 13:29:21 -07:00
George K	c352a51b22	fix(coderd): authorize workspace start/stop/delete by transition action (#21691 ) Use transition-specific actions when authorizing workspace build parameter inserts in the database layer so start/stop/delete do not require workspace.update. Related to: https://github.com/coder/internal/issues/1299	2026-01-27 09:08:12 -08:00
Mathias Fredriksson	25d7f27cdb	feat(coderd): add task log snapshot storage endpoint (#21644 ) This change adds a POST /workspaceagents/me/tasks/{task}/log-snapshot endpoint for agents to upload task conversation history during workspace shutdown. This allows users to view task logs even when the workspace is stopped. The endpoint accepts agentapi format payloads (typically last 10 messages, max 64KB), wraps them in a format envelope, and upserts to the task_snapshots table. Uses agent token auth and validates the task belongs to the agent's workspace. Closes coder/internal#1253	2026-01-27 11:09:24 +02:00
Spike Curtis	f47f89d997	chore: remove unused tailnet v1 tables and queries (#21646 ) Removes the legacy tailnet v1 API tables (`tailnet_clients`, `tailnet_agents`, `tailnet_client_subscriptions`) and their associated queries, triggers, and functions. These were superseded by the v2 tables (`tailnet_peers`, `tailnet_tunnels`) in migration 000168, and the v1 API code was removed in commit `d6154c4310`, but the database artifacts were never cleaned up. Changes: - New migration `000410_remove_tailnet_v1_tables` to drop the unused tables - Removed 11 unused queries from `tailnet.sql` - Removed associated manual wrapper methods in `dbauthz` and `dbmetrics` - ~930 lines deleted across 11 files	2026-01-26 14:27:17 +04:00
Callum Styan	e195856c43	perf: reduce pg_notify call volume by batching together agent metadata updates (#21330 ) --------- Signed-off-by: Callum Styan <callumstyan@gmail.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-01-22 22:47:49 -08:00
George K	d29a168785	fix(coderd/rbac): reinstate deployment-wide workspace.share permission for owner role (#21620 ) The removal of that permission from the role broke valid use cases (e.g. a site owner user creating a workspace owned by a system account and then trying to share it with another user). The bulk of the PR is made up of the rollbacks of the previously introduced test updates necessitated by the removal. Related to: https://github.com/coder/internal/issues/1285	2026-01-22 08:12:15 -08:00
Mathias Fredriksson	97e8a5b093	fix(coderd): allow agent auth during workspace shutdown (#21538 ) Agents were losing authentication during workspace shutdown, causing shutdown scripts to fail. The auth query required agents to belong to the latest build, but during shutdown a `stop` build becomes latest while the `start` build's agents are still running. Modified the auth query to allow `start` build agents to authenticate temporarily during `stop` execution. The query allows auth when: - Agent's `start` build job succeeded - Latest build is `stop` with `pending`/`running` job status - Builds are adjacent (`stop` is `build_number + 1`) - Template versions match Auth closes once `stop` completes. Renamed `GetWorkspaceAgentAndLatestBuildByAuthToken` to `GetAuthenticatedWorkspaceAgentAndBuildByAuthToken` since it returns the agent's build (not always latest) during shutdown. Closes coder/internal#1249 Fixes #19467	2026-01-21 13:18:43 +00:00
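The four conditions listed in #21538 can be read as a single predicate. The real logic is a SQL query; the `build` type and `agentMayAuthenticate` function below are hypothetical, and "agent belongs to the latest build" is modeled simply as equal build numbers within one workspace.

```go
package main

import "fmt"

// build is a hypothetical, trimmed-down view of a workspace build and its job.
type build struct {
	Number            int32
	Transition        string // "start" or "stop"
	JobStatus         string // "pending", "running", "succeeded", ...
	TemplateVersionID string
}

// agentMayAuthenticate allows the usual case (the agent's build is the latest
// one) plus the shutdown window described in the commit message.
func agentMayAuthenticate(agentBuild, latest build) bool {
	if agentBuild.Number == latest.Number {
		return true
	}
	stopInProgress := latest.Transition == "stop" &&
		(latest.JobStatus == "pending" || latest.JobStatus == "running")
	return agentBuild.Transition == "start" &&
		agentBuild.JobStatus == "succeeded" &&
		stopInProgress &&
		latest.Number == agentBuild.Number+1 &&
		latest.TemplateVersionID == agentBuild.TemplateVersionID
}

func main() {
	start := build{Number: 4, Transition: "start", JobStatus: "succeeded", TemplateVersionID: "v1"}
	stopping := build{Number: 5, Transition: "stop", JobStatus: "running", TemplateVersionID: "v1"}
	stopped := stopping
	stopped.JobStatus = "succeeded"

	fmt.Println(agentMayAuthenticate(start, stopping)) // true: stop build still running
	fmt.Println(agentMayAuthenticate(start, stopped))  // false: auth closes once stop completes
}
```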
Mathias Fredriksson	2132c53f28	feat(coderd/database): add schema for task pause/resume lifecycle (#21557 ) Creates migration 000409 with the database foundation for pausing and resuming task workspaces. The task_snapshots table stores conversation history (AgentAPI messages) so users can view task logs even when the workspace is stopped. Each task gets one snapshot, overwritten on each pause. Three new build_reason values (task_auto_pause, task_manual_pause, task_resume) let us distinguish task lifecycle events in telemetry and audit logs from regular workspace operations. Uses a regular table rather than UNLOGGED for snapshots. While UNLOGGED would be faster, losing snapshots on database crash creates user confusion (logs disappear until next pause). We can switch to UNLOGGED post-GA if write performance becomes a problem. Closes coder/internal#1250	2026-01-21 12:12:12 +02:00
Jake Howell	59b71f296f	feat: implement non-brittle `TestDBPurgeAuthorization` (#21442 ) Closes #21440 The `TestDBPurgeAuthorization` test was overfitting by calling each purge method individually, which reimplemented dbpurge logic in the test and created a maintenance burden. When new purge steps are added, they either need to be reflected in the test or there will be a testing blindspot. This change extracts the `doTick` closure into an exported `PurgeTick` function that returns an error, making the core purge logic testable. The test now calls `PurgeTick` directly to exercise the actual dbpurge behavior rather than reimplementing it. Retention values are configured to ensure all purge operations run, so we test RBAC permissions for all code paths. - Tests actual dbpurge behavior instead of reimplementing it - Automatically covers new purge steps when they're added - Still validates that all operations have proper RBAC permissions The test focuses on authorization (checking for RBAC errors) rather than verifying deletion behavior, which is already covered by other tests like `TestDeleteExpiredAPIKeys` and `TestDeleteOldAuditLogs`.	2026-01-21 11:27:01 +11:00
Cian Johnston	9776dc16bd	fix(coderd/database/dbmetrics): fix incorrect query label in GetWorkspaceAgentAndWorkspaceByID (#21576 ) Fixes an incorrect label.	2026-01-19 16:25:36 +00:00
Cian Johnston	08343a7a9f	perf: reduce number of queries made by /api/v2/workspaceagents/{id} (#21522 ) Relates to https://github.com/coder/internal/issues/1214 The `ExtractWorkspaceAgentParam` middleware ends up making 4 database queries to follow the chain of `WorkspaceAgent` -> `WorkspaceResource` -> `ProvisionerJob` -> `WorkspaceBuild` -- but then dropping all that hard work on the floor. The `api.workspaceAgent` handler that references this middleware then has to do all of that work again, plus one more query to get the related `User` so we can get the username. This pattern is also mirrored in `getDatabaseTerminal` but without the middleware. This PR: * Adds a new query `GetWorkspaceAgentAndWorkspaceByID` to fetch all this information at once to avoid the multiple round-trips, * Updates the existing usage of `GetWorkspaceAgentByID` to this new query instead, * Updates `ExtractWorkspaceAgentParam` to also store the workspace in the request context Dalibo: [0.63ms](https://explain.dalibo.com/plan/40bb597f3539gc6c)	2026-01-19 12:36:33 +00:00
blinkagent[bot]	d5296a4855	chore: add lint/migrations to detect hardcoded public schema (#21496 ) ## Problem Migration 000401 introduced a hardcoded `public.` schema qualifier which broke deployments using non-public schemas (see #21493). We need to prevent this from happening again. ## Solution Adds a new `lint/migrations` Make target that validates database migrations do not hardcode the `public` schema qualifier. Migrations should rely on `search_path` instead to support deployments using non-public schemas. ## Changes - Added `scripts/check_migrations_schema.sh` - a linter script that checks for `public.` references in migration files (excluding test fixtures) - Added `lint/migrations` target to the Makefile - Added `lint/migrations` to the main `lint` target so it runs in CI ## Testing - Verified the linter fails on current `main` (which has the hardcoded `public.` in migration 000401) - Verified the linter passes after applying the fix from #21493 ```bash # On main (fails) $ make lint/migrations ERROR: Migrations must not hardcode the 'public' schema. Use unqualified table names instead. # After fix (passes) $ make lint/migrations Migration schema references OK ``` ## Depends on - #21493 must be merged first (or this PR will fail CI until it is) --------- Signed-off-by: Danny Kopping <danny@coder.com> Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com> Co-authored-by: Danny Kopping <danny@coder.com>	2026-01-15 14:17:16 +02:00
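The check performed by the `lint/migrations` target can be approximated as a line scan for a `public.` qualifier. The actual linter is the shell script `scripts/check_migrations_schema.sh`; the Go function below is a hypothetical re-expression of the idea, not a port of that script, and it would also flag matches inside SQL comments or string literals.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// publicQualifier matches the word "public" followed directly by a dot.
var publicQualifier = regexp.MustCompile(`\bpublic\.`)

// hardcodedSchemaLines returns the 1-based line numbers in sql that contain
// a hardcoded public schema qualifier.
func hardcodedSchemaLines(sql string) []int {
	var hits []int
	for i, line := range strings.Split(sql, "\n") {
		if publicQualifier.MatchString(line) {
			hits = append(hits, i+1)
		}
	}
	return hits
}

func main() {
	migration := "CREATE TABLE public.widgets (id int);\nCREATE TABLE gadgets (id int);"
	if hits := hardcodedSchemaLines(migration); len(hits) > 0 {
		fmt.Println("hardcoded public schema on lines:", hits)
		return
	}
	fmt.Println("Migration schema references OK")
}
```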
Cian Johnston	5073493850	feat(coderd/database/dbmetrics): add query_counts_total metric (#21506 ) Adds a new Prometheus metric `coderd_db_query_counts_total` that tracks the total number of queries by route, method, and query name. This is aimed at helping us track down potential optimization candidates for HTTP handlers that may trigger a number of queries. It is expected to be used alongside `coderd_api_requests_processed_total` for correlation. Depends upon new middleware introduced in https://github.com/coder/coder/pull/21498 Relates to https://github.com/coder/internal/issues/1214	2026-01-15 10:58:56 +00:00
George K	0712faef4f	feat(enterprise): implement organization "disable workspace sharing" option (#21376 ) Adds a per-organization setting to disable workspace sharing. When enabled, all existing workspace ACLs in the organization are cleared and the workspace ACL mutation API endpoints return `403 Forbidden`. This complements the existing site-wide `--disable-workspace-sharing` flag by providing more granular control at the organization level. Closes https://github.com/coder/internal/issues/1073 (part 2) --------- Co-authored-by: Steven Masley <Emyrk@users.noreply.github.com>	2026-01-14 09:47:50 -08:00
blinkagent[bot]	b3a81be1aa	fix(coderd/database): remove hardcoded public schema from migration 000401 (#21493 )	2026-01-14 05:40:30 +02:00
George K	cc2efe9e1f	feat(coderd/rbac): make organization-member a per-org system custom role (#21359 ) Migrated the built-in organization-member role to DB storage so it can be customized per org. Closes https://github.com/coder/internal/issues/1073 (part 1)	2026-01-12 18:19:19 -08:00
Steven Masley	89f4d60e7b	chore: remove experiment "terraform-directory-reuse" (#21397 ) Experiment is no longer required, the new method will be released without an experiment and without a toggle Main PR is: https://github.com/coder/coder/pull/21398	2026-01-09 11:13:16 -06:00
Cian Johnston	b116d22c5f	chore: manage tool versions in go.mod (#21455 ) Go 1.24 adds [tool dependencies](https://go.dev/doc/modules/managing-dependencies#tools). This allows us to track versions of tools in our `go.mod` instead of sprinkling various `go run` commands throughout our codebase. NOTE: there are still various hard-coded `go install` commands in our dogfood Dockerfile. As that list is likely severely outdated, will leave that for a separate PR.	2026-01-08 16:25:28 +00:00
Spike Curtis	bddb808b25	chore: arrange imports in a standard way (#21452 ) Fixes all our Go file imports to match the preferred spec that we've _mostly_ been using. For example: ``` import ( "context" "time" "github.com/prometheus/client_golang/prometheus" "golang.org/x/xerrors" "gopkg.in/natefinch/lumberjack.v2" "cdr.dev/slog/v3" "github.com/coder/coder/v2/codersdk/agentsdk" "github.com/coder/serpent" ) ``` 3 groups: standard library, 3rd party libs, Coder libs. This PR makes the change across the codebase. The PR in the stack above modifies our formatting to maintain this state of affairs, and is a separate PR so it's possible to review that one in detail.	2026-01-08 15:24:11 +04:00
Cian Johnston	0f446f99dd	feat(cli): add logs cmd (#21430 ) This PR adds a command to view the provisioner and agent logs for a given workspace. Note: I did investigate using the existing `cliui` methods to tail the logs but they are tailored to a very specific use-case. Other changes: - Adds `Agents` to `dbfake.WorkspaceResponse` - Adds methods to generate provisioner and agent logs in `dbgen` --------- Co-authored-by: Steven Masley <Emyrk@users.noreply.github.com>	2026-01-08 09:58:10 +00:00
Spike Curtis	49b34a716a	fix: fix slog to always use array of Fields (#21426 ) Upgrades to slog v3 which includes a small, but backward incompatible API change to the acceptable call arguments when logging. This change allows us to verify via compile time type checking that arguments are correct and won't cause a panic, as was possible in slog v1, which this replaces (v2 was tagged but never used in coder/coder). It also updates dependencies that also use slog and were updated. I've left the `aibridge` dependency as a commit SHA, under the assumption that the team there (cc @pawbana @dannykopping ) will tag and update the dependency soon and on their own schedule. Other dependencies, I pushed new tags.	2026-01-08 10:29:41 +04:00
George K	e10fceb23c	fix(coderd/database): allow same custom role name for different orgs (#21312 ) Previously the `idx_custom_roles_name_lower` index prevented that. A check constraint was also added to ensure the `organization_id` column cannot be set to the all-zero UUID.	2026-01-05 07:43:08 -08:00
Jake Howell	ea00e72063	feat: add rbac specificity for `dbpurge` (#21088 ) Related to [`internal#1139`](https://github.com/coder/internal/issues/1139) Continuation of #21074 This implements some RBAC role specificity for `dbpurge`, ensuring that we follow the least-privilege model for removing data from the database. It is specified as follows: ```go Site: rbac.Permissions(map[string][]policy.Action{ // DeleteOldWorkspaceAgentLogs // DeleteOldWorkspaceAgentStats // DeleteOldProvisionerDaemons // DeleteOldTelemetryLocks // DeleteOldAuditLogConnectionEvents // DeleteOldConnectionLogs rbac.ResourceSystem.Type: {policy.ActionDelete}, // DeleteOldNotificationMessages rbac.ResourceNotificationMessage.Type: {policy.ActionDelete}, // ExpirePrebuildsAPIKeys // DeleteExpiredAPIKeys rbac.ResourceApiKey.Type: {policy.ActionDelete}, // DeleteOldAIBridgeRecords rbac.ResourceAibridgeInterception.Type: {policy.ActionDelete}, }), ``` \| Position \| Pull-request \| \| -------- \| ------------ \| \| \| [feat: add prometheus observability metrics for `dbpurge`](https://github.com/coder/coder/pull/21074) \| \| ✅ \| [feat: add rbac specificity for `dbpurge`](https://github.com/coder/coder/pull/21088) \|	2025-12-20 01:02:39 +11:00
Jake Howell	00793cc0b5	feat: add prometheus observability metrics for `dbpurge` (#21074 ) Related to [`internal#1139`](https://github.com/coder/internal/issues/1139) This implements some prometheus metrics for records being removed from the database. Currently we're tracking the following fields being removed from the DB by this. They're viewable in the `/api/v2/debug/metrics` endpoint. * `expired_api_keys` * `aibridge_records` * `connection_logs` * `duration` ``` # HELP coderd_dbpurge_iteration_duration_seconds Duration of each dbpurge iteration in seconds. # TYPE coderd_dbpurge_iteration_duration_seconds histogram coderd_dbpurge_iteration_duration_seconds_bucket{success="true",le="1"} 1 coderd_dbpurge_iteration_duration_seconds_bucket{success="true",le="5"} 1 coderd_dbpurge_iteration_duration_seconds_bucket{success="true",le="10"} 1 coderd_dbpurge_iteration_duration_seconds_bucket{success="true",le="30"} 1 coderd_dbpurge_iteration_duration_seconds_bucket{success="true",le="60"} 1 coderd_dbpurge_iteration_duration_seconds_bucket{success="true",le="300"} 1 coderd_dbpurge_iteration_duration_seconds_bucket{success="true",le="600"} 1 coderd_dbpurge_iteration_duration_seconds_bucket{success="true",le="+Inf"} 1 coderd_dbpurge_iteration_duration_seconds_sum{success="true"} 0.014787814 coderd_dbpurge_iteration_duration_seconds_count{success="true"} 1 # HELP coderd_dbpurge_records_purged_total Total number of records purged by type. 
# TYPE coderd_dbpurge_records_purged_total counter coderd_dbpurge_records_purged_total{record_type="aibridge_records"} 0 coderd_dbpurge_records_purged_total{record_type="audit_logs"} 0 coderd_dbpurge_records_purged_total{record_type="connection_logs"} 0 coderd_dbpurge_records_purged_total{record_type="expired_api_keys"} 0 coderd_dbpurge_records_purged_total{record_type="workspace_agent_logs"} 0 ``` \| Position \| Pull-request \| \| -------- \| ------------ \| \| ✅ \| [feat: add prometheus observability metrics for `dbpurge`](https://github.com/coder/coder/pull/21074) \| \| \| [feat: add rbac specificity for `dbpurge`](https://github.com/coder/coder/pull/21088) \|	2025-12-20 00:20:57 +11:00
Spike Curtis	bd753d9cb9	fix: mark users seen when activating on login (#21305 ) fixes #21303 Update user last_seen_at when we mark them active on login. This prevents a narrow race where they can be re-marked dormant and fail to log in.	2025-12-17 16:49:40 +04:00
Mathias Fredriksson	dac822b7f4	refactor: remove deprecated AITaskPromptParameterName constant (#21023 ) This removes the deprecated AITaskPromptParameterName constant and all backward compatibility code that was added for v2.28. - Remove AITaskPromptParameterName constant from codersdk/aitasks.go - Remove backward compatibility code in coderd/aitasks.go that populated the "AI Prompt" parameter for templates that defined it - Remove the backward compatibility test (OK AIPromptBackCompat) - Update dbfake to no longer set the AI Prompt parameter - Remove AITaskPromptParameterName from frontend TypeScript types - Remove preset prompt read-only feature from TaskPrompt component - Update docs to reflect that pre-2.28 definition is no longer supported Task prompts are now exclusively stored in the tasks.prompt database column, as introduced in the migration that added the tasks table.	2025-12-16 15:14:59 +00:00
George K	103967ed02	feat: add sharing info to /workspaces endpoint (#21049 ) closes: https://github.com/coder/internal/issues/858 Similar to https://github.com/coder/coder/pull/19375, this one uses system permissions for fetching actual user and group data. Modifies the `workspaces_expanded` view to fetch the required data; this way it's made available to all code paths that make use of it. Also fixes a bug in a test helper function that can result in `null` being saved to the DB for `user_acl` or `group_acl` and break tests; a defensive check constraint that prevents this is worth a PR, e.g: `ALTER TABLE workspaces ADD CONSTRAINT group_acl_is_object CHECK (jsonb_typeof(group_acl) = 'object');` Also adds missing `OwnerName` in `ConvertWorkspaceRows`.	2025-12-15 08:42:08 -08:00
Mathias Fredriksson	761dd55ee8	fix(coderd/database): sort template version variables and fix test flake (#21233 ) Previously the GetTemplateVersionVariables query did not sort output, relying on PostgreSQL on-disk ordering which is nondeterministic. Variables are now sorted by name because there is no alternative for ordering. Tests were adjusted to accommodate the new ordering, previously they relied on data being written to disk in insert order.	2025-12-12 11:41:46 +00:00
Danielle Maywood	f45a179181	test: move context to after db creation (#21224 ) Closes https://github.com/coder/internal/issues/1040 We move the context to just before it is used to avoid the scenario where NewDB takes a while to spin up and runs up the context to the deadline.	2025-12-11 21:51:16 +00:00
Callum Styan	8ed1c1d372	perf: reduce calls to GetWorkspaceByAgentID in GetWorkspaceAgentByID (#21046 ) This PR piggy backs on the agent API cached workspace added in an earlier PR to provide a fast path for avoiding `GetWorkspaceByAgentID` calls in dbauthz's `GetWorkspaceAgentByID`. This query is not the most expensive, but has a significant call volume at ~16 million calls per week. Signed-off-by: Callum Styan <callumstyan@gmail.com>	2025-12-10 14:03:24 -08:00
George K	da71e546bb	chore: fix test errors on newer debian-based systems due to deprecated TZ (#21115 ) It appears on newer Debian systems `Canada/Newfoundland` TZ is not present and `America/St_Johns` should be used instead. Coder tests use a docker PG image where `Canada/Newfoundland` is still supported: ``` $ docker run --rm -it us-docker.pkg.dev/coder-v2-images-public/public/postgres:17 bash root@ca99e82721dc:/# ls -l /usr/share/zoneinfo/Canada/Newfoundland lrwxrwxrwx 1 root root 19 Mar 26 2025 /usr/share/zoneinfo/Canada/Newfoundland -> ../America/St_Johns ``` However, if a local PG instance is running on a Debian Trixie host, coder test will use it and error out due to the zone being unavailable: ``` $ docker run --rm -it debian:trixie bash root@f285092767e4:/# ls -l /usr/share/zoneinfo/Canada/Newfoundland ls: cannot access '/usr/share/zoneinfo/Canada/Newfoundland': No such file or directory root@f285092767e4:/# ls -l /usr/share/zoneinfo/America/St_Johns -rw-r--r-- 1 root root 3655 Aug 24 20:12 /usr/share/zoneinfo/America/St_Johns ``` ... 
which causes the tests to error out: ``` $ go test ./enterprise/coderd --- FAIL: TestWorkspaceTemplateParamsChange (0.13s) workspaces_test.go:3097: TestWorkspaceTagsTerraform: using cached terraform providers workspaces_test.go:3097: Set TF_CLI_CONFIG_FILE=/home/geo/.cache/coderv2-test/terraform_workspace_tags_test/a28ed341dee8/terraform.rc coderdenttest.go:84: Error Trace: /home/geo/coder/coderd/database/dbtestutil/db.go:161 /home/geo/coder/coderd/database/dbtestutil/db.go:122 /home/geo/coder/coderd/coderdtest/coderdtest.go:270 /home/geo/coder/enterprise/coderd/coderdenttest/coderdenttest.go:105 /home/geo/coder/enterprise/coderd/coderdenttest/coderdenttest.go:84 /home/geo/coder/enterprise/coderd/coderdenttest/coderdenttest.go:84 /home/geo/coder/enterprise/coderd/workspaces_test.go:3103 Error: Received unexpected error: pq: invalid value for parameter "TimeZone": "Canada/Newfoundland" Test: TestWorkspaceTemplateParamsChange Messages: failed to set timezone for database ... ``` This commit replaces the problematic TZ with the canonical one.	2025-12-10 08:09:13 -08:00
Callum Styan	27c3ec072e	perf: support fastpath in dbauthz GetLatestWorkspaceBuildByWorkspaceID (#21047 ) This PR piggy backs on the agent API cached workspace added in earlier PRs to provide a fast path for avoiding `GetWorkspaceByID` calls in `GetLatestWorkspaceBuildByWorkspaceID` via injection of the workspaces RBAC object into the context. We can do this from the `agentConnectionMonitor` easily since we already cache the workspace. --------- Signed-off-by: Callum Styan <callumstyan@gmail.com>	2025-12-09 15:53:52 -08:00
Callum Styan	a59a84b2a7	perf: optimize GetTemplateAppInsightsByTemplate by pre-filtering on start/end times (#20669 ) In this PR we're optimizing the `GetTemplateAppInsightsByTemplate` query by pre-filtering out apps which do not have an active session during the start/end time window. --------- Signed-off-by: Callum Styan <callumstyan@gmail.com>	2025-12-09 15:21:16 -08:00
Callum Styan	6abb889fab	perf: optimize GetDeploymentWorkspaceAgentStats by eliminating 2nd select (#21112 ) Tracking issue here: https://github.com/coder/internal/issues/1009 To summarize, the current version of this query selects from `workspace_agent_stats` twice. The expensive portion of this query is the bitmap heap scan we have to do for each of these selects. We can easily cut the cost of this query by 40-50% by cutting this down to a single select, and using those rows for both sets of calculations. Eliminating the heap scan itself would require a follow up PR to introduce a new index. Blink helped with the rewrite of the query. The current plan looks like this: ``` Nested Loop (cost=6101.64..6101.69 rows=1 width=64) (actual time=11.782..11.787 rows=1 loops=1) -> Aggregate (cost=2996.17..2996.19 rows=1 width=32) (actual time=3.356..3.357 rows=1 loops=1) -> Bitmap Heap Scan on workspace_agent_stats (cost=54.80..2992.86 rows=440 width=24) (actual time=0.346..2.927 rows=818 loops=1) Recheck Cond: (created_at > (now() - '00:15:00'::interval)) Filter: (connection_median_latency_ms > '0'::double precision) Rows Removed by Filter: 1070 Heap Blocks: exact=486 -> Bitmap Index Scan on idx_agent_stats_created_at (cost=0.00..54.69 rows=1368 width=0) (actual time=0.241..0.241 rows=1888 loops=1) Index Cond: (created_at > (now() - '00:15:00'::interval)) -> Aggregate (cost=3105.47..3105.49 rows=1 width=32) (actual time=8.418..8.420 rows=1 loops=1) -> Subquery Scan on a (cost=3060.95..3105.39 rows=7 width=32) (actual time=7.851..8.394 rows=63 loops=1) Filter: (a.rn = 1) -> WindowAgg (cost=3060.95..3088.29 rows=1368 width=209) (actual time=7.850..8.382 rows=63 loops=1) Run Condition: (row_number() OVER (?) <= 1) -> Sort (cost=3060.93..3064.35 rows=1368 width=56) (actual time=7.836..8.036 rows=1888 loops=1) Sort Key: workspace_agent_stats_1.agent_id, workspace_agent_stats_1.created_at DESC Sort Method: quicksort Memory: 181kB -> Bitmap Heap Scan on workspace_agent_stats workspace_agent_stats_1 (cost=55.03..2989.67 rows=1368 width=56) (actual time=0.388..2.096 rows=1888 loops=1) Recheck Cond: (created_at > (now() - '00:15:00'::interval)) Heap Blocks: exact=486 -> Bitmap Index Scan on idx_agent_stats_created_at (cost=0.00..54.69 rows=1368 width=0) (actual time=0.295..0.295 rows=1888 loops=1) Index Cond: (created_at > (now() - '00:15:00'::interval)) Planning Time: 2.350 ms Execution Time: 13.152 ms (24 rows) ``` The new plan looks like this: ``` Aggregate (cost=2966.96..2966.98 rows=1 width=64) (actual time=3.812..3.814 rows=1 loops=1) -> WindowAgg (cost=2891.96..2916.94 rows=1250 width=88) (actual time=2.696..3.412 rows=1890 loops=1) -> Sort (cost=2891.94..2895.06 rows=1250 width=80) (actual time=2.686..2.780 rows=1890 loops=1) Sort Key: workspace_agent_stats.agent_id, workspace_agent_stats.created_at DESC Sort Method: quicksort Memory: 226kB -> Bitmap Heap Scan on workspace_agent_stats (cost=50.11..2827.64 rows=1250 width=80) (actual time=0.218..1.551 rows=1890 loops=1) Recheck Cond: (created_at > (now() - '00:15:00'::interval)) Heap Blocks: exact=474 -> Bitmap Index Scan on idx_agent_stats_created_at (cost=0.00..49.80 rows=1250 width=0) (actual time=0.146..0.147 rows=1890 loops=1) Index Cond: (created_at > (now() - '00:15:00'::interval)) Planning Time: 0.534 ms Execution Time: 3.969 ms (12 rows) ``` If we compare the results of the query they're similar enough that any differences can be attributed to slightly different timestamps for `now()` in the version of the query I am using to generate results for comparison: ``` workspace_rx_bytes \| workspace_tx_bytes \| workspace_connection_latency_50 \| workspace_connection_latency_95 \| session_count_vscode \| 
session_count_ssh \| session_count_jetbrains \| session_count_reconnecting_pty --------------------+--------------------+---------------------------------+---------------------------------+----------------------+-------------------+-------------------------+-------------------------------- 15263563 \| 74555854 \| 47.933 \| 250.5522 \| 239 \| 59 \| 3 \| 3 (1 row) workspace_rx_bytes \| workspace_tx_bytes \| workspace_connection_latency_50 \| workspace_connection_latency_95 \| session_count_vscode \| session_count_ssh \| session_count_jetbrains \| session_count_reconnecting_pty --------------------+--------------------+---------------------------------+---------------------------------+----------------------+-------------------+-------------------------+-------------------------------- 15295819 \| 74598410 \| 47.933 \| 250.5522 \| 239 \| 59 \| 3 \| 3 ``` --------- Signed-off-by: Callum Styan <callumstyan@gmail.com>	2025-12-09 15:19:55 -08:00
George K	4379230a27	feat: add deployment-wide option to disable workspace sharing (#21172 ) Adds `--disable-workspace-sharing` option. Workspace sharing is disabled by not including user and group ACLs in the workspace RBAC object, which prevents ACL-based authz. Closes https://github.com/coder/internal/issues/1072 The commit also adds saving of workspace user/group ACLs in the test DB data generator.	2025-12-09 08:13:09 -08:00
Mathias Fredriksson	cfdd4a9b88	perf(coderd/database): add index on workspace_app_statuses.app_id (#21099 )	2025-12-04 17:56:13 +02:00
Mathias Fredriksson	ad93262d07	fix(coderd/database/dbpurge): allow disabling AI Bridge retention with 0 (#21062 ) Previously setting AI Bridge retention to 0 would cause records to be deleted immediately since we didn't check for the zero value before calculating the deletion threshold. This adds a check for aibridgeRetention > 0 to skip deletion when retention is disabled, matching the pattern used for other retention settings (connection logs, audit logs, etc.). Also fixes the return type of DeleteOldAIBridgeRecords from int32 to int64 since COUNT(*) returns bigint in PostgreSQL. Refs #21055	2025-12-03 09:37:18 +00:00
Mathias Fredriksson	ff46917e62	feat: add retention config for `workspace_agent_logs` (#21039 ) Replace hardcoded 7-day retention for workspace agent logs with configurable retention from deployment settings. Defaults to 7d to preserve existing behavior. Depends on #21038 Updates #20743	2025-12-02 16:01:33 +00:00

1294 Commits