Update provisionerdserver to handle the changes introduced to
provisionerd in https://github.com/coder/coder/pull/21602
We now create a relationship between `workspace_agent_devcontainers` and
`workspace_agents` with the newly created `subagent_id`.
This migration converts all tailnet coordination tables to UNLOGGED:
- `tailnet_coordinators`
- `tailnet_peers`
- `tailnet_tunnels`
UNLOGGED tables skip Write-Ahead Log (WAL) writes, significantly
improving performance for high-frequency updates like coordinator
heartbeats and peer state changes.
The trade-off is that UNLOGGED tables are truncated on crash recovery
and are not replicated to standby servers. This is acceptable for these
tables because the data is ephemeral:
1. Coordinators re-register on startup
2. Peers re-establish connections on reconnect
3. Tunnels are re-created based on current peer state
**Migration notes:**
- Child tables must be converted before the parent table because LOGGED
child tables cannot reference UNLOGGED parent tables (but the reverse is
allowed)
- The down migration reverses the order: parent first, then children
Fixes https://github.com/coder/coder/issues/21333
feat: add boundary usage telemetry database schema and RBAC
Adds the foundation for tracking boundary usage telemetry across Coder
replicas. This includes:
- Database schema: `boundary_usage_stats` table with per-replica stats
(unique workspaces, unique users, allowed/denied request counts)
- Database queries: upsert stats, get aggregated summary, reset stats,
delete by replica ID
- RBAC: `boundary_usage` resource type with read/update/delete actions,
accessible only via system `BoundaryUsageTracker` subject (not regular
user roles)
- Tracker skeleton + docs: stub implementation in `coderd/boundaryusage/`
The tracker accumulates stats in memory and periodically flushes to the
database. Stats are aggregated across replicas for telemetry reporting,
then reset when a new reporting period begins. The tracker implementation
and plumbing will be done in a subsequent commit/PR.
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Removes the legacy tailnet v1 API tables (`tailnet_clients`, `tailnet_agents`, `tailnet_client_subscriptions`) and their associated queries, triggers, and functions. These were superseded by the v2 tables (`tailnet_peers`, `tailnet_tunnels`) in migration 000168, and the v1 API code was removed in commit d6154c4310, but the database artifacts were never cleaned up.
**Changes:**
- New migration `000410_remove_tailnet_v1_tables` to drop the unused tables
- Removed 11 unused queries from `tailnet.sql`
- Removed associated manual wrapper methods in `dbauthz` and `dbmetrics`
- ~930 lines deleted across 11 files
Creates migration 000409 with the database foundation for pausing and
resuming task workspaces.
The task_snapshots table stores conversation history (AgentAPI messages)
so users can view task logs even when the workspace is stopped. Each task
gets one snapshot, overwritten on each pause.
Three new build_reason values (task_auto_pause, task_manual_pause,
task_resume) let us distinguish task lifecycle events in telemetry and
audit logs from regular workspace operations.
Uses a regular table rather than UNLOGGED for snapshots. While UNLOGGED
would be faster, losing snapshots on database crash creates user confusion
(logs disappear until next pause). We can switch to UNLOGGED post-GA if
write performance becomes a problem.
Closescoder/internal#1250
## Problem
Migration 000401 introduced a hardcoded `public.` schema qualifier which
broke deployments using non-public schemas (see #21493). We need to
prevent this from happening again.
## Solution
Adds a new `lint/migrations` Make target that validates database
migrations do not hardcode the `public` schema qualifier. Migrations
should rely on `search_path` instead to support deployments using
non-public schemas.
## Changes
- Added `scripts/check_migrations_schema.sh` - a linter script that
checks for `public.` references in migration files (excluding test
fixtures)
- Added `lint/migrations` target to the Makefile
- Added `lint/migrations` to the main `lint` target so it runs in CI
## Testing
- Verified the linter **fails** on current `main` (which has the
hardcoded `public.` in migration 000401)
- Verified the linter **passes** after applying the fix from #21493
```bash
# On main (fails)
$ make lint/migrations
ERROR: Migrations must not hardcode the 'public' schema. Use unqualified table names instead.
# After fix (passes)
$ make lint/migrations
Migration schema references OK
```
## Depends on
- #21493 must be merged first (or this PR will fail CI until it is)
---------
Signed-off-by: Danny Kopping <danny@coder.com>
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Co-authored-by: Danny Kopping <danny@coder.com>
Previously the `idx_custom_roles_name_lower` index prevented that.
A check constraint was also added to ensure the `organization_id` column cannot be set to the all-zero UUID.
closes: https://github.com/coder/internal/issues/858
Similar to https://github.com/coder/coder/pull/19375, this one uses
system permissions for fetching actual user and group data.
Modifies the `workspaces_expanded` view to fetch the required data; this way it's made available to all code paths that make use of it.
Also fixes a bug in a test helper function that can result in `null` being saved to the DB for `user_acl` or `group_acl` and break tests; a defensive check constraint that prevents this is worth a PR, e.g:
`ALTER TABLE workspaces
ADD CONSTRAINT group_acl_is_object CHECK (jsonb_typeof(group_acl) = 'object');`
Also adds missing `OwnerName` in `ConvertWorkspaceRows`.
## Context
GetWorkspaceAgentByInstanceID has a suboptimal plan. Even though it is
designed to fetch a small subset of records, there are no corresponding
indexes and that query results in full table scan:
Query:
```
SELECT id, auth_instance_id FROM workspace_agents
where auth_instance_id='i-013c2b96b6441648a' and deleted=FALSE;
```
Plan:
```
------------------------------------------------------------------------------------------------------------------
Seq Scan on workspace_agents (cost=0.00..222325.48 rows=2 width=36) (actual time=0.012..234.152 rows=4 loops=1)
Filter: ((NOT deleted) AND ((auth_instance_id)::text = 'i-013c2b96b6441648a'::text))
Rows Removed by Filter: 302276
Planning Time: 0.173 ms
Execution Time: 234.169 ms
```
After adding the index, the plan improves drastically.
Updated plan:
```
Bitmap Heap Scan on workspace_agents (cost=4.44..12.32 rows=2 width=36) (actual time=0.019..0.019 rows=0 loops=1)
Recheck Cond: (((auth_instance_id)::text = 'i-013c2b96b6441648a'::text) AND (NOT deleted))
-> Bitmap Index Scan on workspace_agents_auth_instance_id_deleted_idx (cost=0.00..4.44 rows=2 width=0) (actual time=0.013..0.014 rows=0 loops=1)
Index Cond: (((auth_instance_id)::text = 'i-013c2b96b6441648a'::text) AND (deleted = false))
Planning Time: 0.388 ms
Execution Time: 0.044 ms
```
## Changes
* add an index to optimize this query
## Testing
* ran the queries manually against prod and test DBs
* ran `./scripts/develop.sh`, connected to the local PostgreSQL
instance, inspected the indexes to make sure new index is there:
```
Indexes:
"workspace_agents_pkey" PRIMARY KEY, btree (id)
// NEW INDEX CREATED SUCCESSFULLY [comment is mine]
"workspace_agents_auth_instance_id_deleted_idx" btree (auth_instance_id, deleted)
"workspace_agents_auth_token_idx" btree (auth_token)
"workspace_agents_resource_id_idx" btree (resource_id)
```
---------
Signed-off-by: Danny Kopping <danny@coder.com>
Co-authored-by: Danny Kopping <danny@coder.com>
closes https://github.com/coder/coder/issues/20899
This is in response to a migration in v2.27 that takes very long on
deployments with large `api_keys` tables.
NOTE: The optimization causes the _up_ migration to delete old data
(keys that expired more than 7 days ago). The _down_ migration won't
resurrect the deleted data.
## Problem
Tasks currently only expose a machine-friendly name field (e.g.
`task-python-debug-a1b2`), but this value is primarily an identifier
rather than a clean, descriptive label. We need a separate
display-friendly name for use in the UI.
This PR introduces a new `display_name` field and updates the task-name
generation flow. The Claude system prompt was updated to return valid
JSON with both `name` and `display_name`. The name generation logic
follows a fallback chain (Anthropic > prompt sanitization > random
fallback). To make task names more closely resemble their display names,
the legacy `task-` prefix has been removed. For context, PR
https://github.com/coder/coder/pull/20834 introduced a small Task icon
to the workspace list to help identify workspaces associated to tasks.
## Changes
- Database migration: Added `display_name` column to tasks table
- Updated system prompt to generate both task name and display name as
valid JSON
- Task name generation now follows a fallback chain: Anthropic > prompt
sanitization > random fallback
- Removed `task-` prefix from task names to allow more descriptive names
- Note: PR https://github.com/coder/coder/pull/20834 adds a Task icon to
workspaces in the workspace list to distinguish task-created workspaces
**Note:** UI changes will be addressed in a follow-up PR
Related to: https://github.com/coder/coder/issues/20801
This change restructures the `tasks_with_status` view query to:
- Improve debuggability by adding a `status_debug` column to better
understand the outcome
- Reduce clutter from `bool_or`, `bool_and` which are aggregate
functions that did not actually have serve a purpose (each join is 0-1
rows)
- Improve agent lifecycle state coverage, `start_timeout` and
`start_error` were omitted
- These states are easy to trigger even in a perfectly functioning
workspace/task so we now rely on app health to report whether or not
there was an issue
- Mark canceling and canceled workspace build jobs as error state
- Agent stop states were implicitly `unknown`, now there are explicit (I
initially considered `error`, could go either way)
For experimental and dogfood purposes, this adds the ability to opt in a single template.
Leaving the rest of the templates as is.
For GA, this setting might be removed or changed.
Relates to https://github.com/coder/internal/issues/1098
Currently task notifications are incredibly noisy. We should disable
them by default for the upcoming release whilst we iron them out.
Relates to
https://github.com/coder/coder/pull/20431/files#diff-9cfc826a6ce7e77d977b2025482474dd263d12965b2a94479a74c7f1d872b782
If the workspace relating to a task was deleted, most of the
workspace-related fields in `taskFromDBTaskAndWorkspace` will be
zero-valued. However, we can still get information relating to the owner
so that "created by" shows up correctly in the UI.
Updates the `tasks_with_status` view with a join on `visible_users` to
get owner-related info.
- Adds a new table to keep track of which payloads have already been
reported since we only report for the last clock hour
- Adds a query to gather and aggregate all the data by
provider/model/client
Relates to https://github.com/coder/coder-telemetry-server/issues/27
Add API key allow list to the SDK
This PR adds an allow list to API keys in the SDK. The allow list is a list of targets that the API key is allowed to access. If the allow list is empty, a default allow list with a single entry that allows access to all resources is created.
The changes include:
- Adding a default allow list when generating an API key if none is provided
- Adding allow list to the API key response in the SDK
- Converting database allow list entries to SDK format in the API response
- Adding tests to verify the default allow list behavior
Fixes#19854
This PR uses the same sha256 hashing technique as we use for APIKeys. So
now all randomly generated secrets will be hashed with sha256 for
consistency.
This is a breaking change for the oauth tokens. Since oauth is only
allowed for dev builds and experimental, this is ok.
- Adds FK from `aibridge_interceptions.initiator_id` to `users.id`
- This is enforced by deleting any rows that don't have any users. Since
this is an experimental feature AND coder never deletes user rows I
think this is acceptable.
- Adds `name` as a property on `codersdk.MinimalUser`
- This matches the `visible_users` view in the database. I'm unsure why
`name` wasn't already included given that `username` is.
- Adds a new `initiator` field to `codersdk.AIBridgeInterception` which
contains `codersdk.MinimalUser` (ID, username, name, avatar URL)
- Removes `initiator_id` from `codersdk.AIBridgeInterception`
- Should be fine since we're still in early access
This change ensures task names are unique per user the same way we do
for workspaces. This ensures we don't create tasks that are impossible
to start due to another task being named the same creating a workspace
name conflict.
Updates coder/internal#948
Supersedes coder/coder#20212
This change updates the `task_workspace_apps` table structure for
improved linking to workspace builds and adds queries to manage tasks
and a view to expose task status.
Updates coder/internal#948
Supersedes coder/coder#20212
Supersedes coder/coder#19773
# Add API Key Scope Wildcards
This PR adds wildcard API key scopes (`resource:*`) for all RBAC resources to ensure every resource has a matching wildcard value. It also adds all individual `resource:action` scopes to the API documentation and TypeScript definitions.
The changes include:
- Adding a new database migration (000377) that adds wildcard API key scopes
- Updating the API documentation to include all available scopes
- Enhancing the scope generation scripts to include all resource wildcards
- Updating the TypeScript definitions to match the expanded scope list
These changes make creating API keys with comprehensive permissions for specific resource types easier.
## Description
Send a notification to the workspace owner when an AI task’s app state
becomes `Working` or `Idle`.
An AI task is identified by a workspace build with `HasAITask = true`
and `AITaskSidebarAppID` matching the agent app’s ID.
## Changes
* Add `TemplateTaskWorking` notification template.
* Add `TemplateTaskIdle` notification template.
* Add `GetLatestWorkspaceAppStatusesByAppID` SQL query to get the
workspace app statuses ordered by latest first.
* Update `PATCH /workspaceagents/me/app-status` to enqueue:
* `TemplateTaskWorking` when state transitions to `working`
* `TemplateTaskIdle` when state transitions to `idle`
* Notification labels include:
* `task`: task initial prompt
* `workspace`: workspace name
* Notification dedupe: include a minute-bucketed timestamp (UTC
truncated to the minute) in the enqueue data to allow identical content
to resend within the same day (but not more than once per minute).
Closes: https://github.com/coder/coder/issues/19776
# Add Composite API Key Scopes
This PR adds high-level composite API key scopes to simplify token creation with common permission sets:
- `coder:workspaces.create` - Create and update workspaces
- `coder:workspaces.operate` - Read and update workspaces
- `coder:workspaces.delete` - Read and delete workspaces
- `coder:workspaces.access` - Read, SSH, and connect to workspace applications
- `coder:templates.build` - Read templates and create/read files
- `coder:templates.author` - Full template management with insights
- `coder:apikeys.manage_self` - Manage your own API keys
These composite scopes are persisted in the database and expanded during authorization, providing a more intuitive way to grant permissions compared to the granular resource:action scopes.
# Canonicalize API Key Scopes
This PR introduces canonical API key scopes with a `coder:` namespace prefix to avoid collisions with low-level resource:action names. It:
1. Renames special API key scopes in the database:
- `all` → `coder:all`
- `application_connect` → `coder:application_connect`
2. Adds support for a new `scopes` field in the API key creation request, allowing multiple scopes to be specified while maintaining backward compatibility with the singular `scope` field.
3. Updates the API documentation to reflect these changes, including the new endpoint for listing public API key scopes.
4. Ensures backward compatibility by mapping between legacy and canonical scope names in relevant code paths.