## Description
Send a notification to the workspace owner when an AI task’s app state
becomes `Working` or `Idle`.
An AI task is identified by a workspace build with `HasAITask = true`
and `AITaskSidebarAppID` matching the agent app’s ID.
## Changes
* Add `TemplateTaskWorking` notification template.
* Add `TemplateTaskIdle` notification template.
* Add `GetLatestWorkspaceAppStatusesByAppID` SQL query to get the
workspace app statuses ordered by latest first.
* Update `PATCH /workspaceagents/me/app-status` to enqueue:
* `TemplateTaskWorking` when state transitions to `working`
* `TemplateTaskIdle` when state transitions to `idle`
* Notification labels include:
* `task`: task initial prompt
* `workspace`: workspace name
* Notification dedupe: include a minute-bucketed timestamp (UTC
truncated to the minute) in the enqueue data to allow identical content
to resend within the same day (but not more than once per minute).
Closes: https://github.com/coder/coder/issues/19776
# Canonicalize API Key Scopes
This PR introduces canonical API key scopes with a `coder:` namespace prefix to avoid collisions with low-level resource:action names. It:
1. Renames special API key scopes in the database:
- `all` → `coder:all`
- `application_connect` → `coder:application_connect`
2. Adds support for a new `scopes` field in the API key creation request, allowing multiple scopes to be specified while maintaining backward compatibility with the singular `scope` field.
3. Updates the API documentation to reflect these changes, including the new endpoint for listing public API key scopes.
4. Ensures backward compatibility by mapping between legacy and canonical scope names in relevant code paths.
## Description
Adds support for sending an ad‑hoc custom notification to the
authenticated user via API and CLI. This is useful for surfacing the
result of scripts or long‑running tasks. Notifications are delivered
through the configured method and the dashboard Inbox, respecting
existing preferences and delivery settings.
## Changes
* New notification template: “Custom Notification” with a label for a
custom title and a custom message.
* New API endpoint: `POST /api/v2/notifications/custom` to send a custom
notification to the requesting user.
* New API endpoint: `GET /notifications/templates/custom` to get custom
notification template.
* New CLI subcommand: `coder notifications custom <title> <message>` to
send a custom notification to the requesting user.
* Documentation updates: Add a “Custom notifications” section under
Administration > Monitoring > Notifications, including instructions on
sending custom notifications and examples of when to use them.
Closes: https://github.com/coder/coder/issues/19611
Closes https://github.com/coder/internal/issues/885
Adds a new database method GetProvisionerJobByIDWithLock that uses FOR
UPDATE without SKIP LOCKED to fix workspace build cancellation returning
500 errors when jobs are locked.
* provisionerdserver: Expires prebuild user token for workspace, if it
exists, when regenerating session token.
* dbauthz: disallow prebuilds user from creating api keys
* dbpurge: added functionality to expire stale api keys owned by the
prebuilds user
This PR should resolve https://github.com/coder/internal/issues/719 by
limiting the `workspace_builds` rows selected by the query to the most
recent 100 builds of a template, as opposed to all builds in the last
30d. For our own internal templates with the most builds (1700-2000 in a
30d period) this should cut the query execution time by about 80%.
Unless we have some restriction on keeping the 30d period, contract
related or otherwise, this seems like a safe change to make. In addition
to the execution speed improvements it also means the memory for the
query is bounded as well.
If we want to keep a 30d time period for the avg build time value I
think it's worth exploring a purpose built solution such as histogram
structures where the build times could be bucketized by template ID as
they're observed.
---------
Signed-off-by: Callum Styan <callumstyan@gmail.com>
- Removes GetManagedAgentCount query
- Adds new table `usage_events_daily` which stores aggregated usage
events by the type and UTC day
- Adds trigger to update the values in this table when a new row is
inserted into `usage_events`
- Adds a migration that adds `usage_events_daily` rows for existing data
in `usage_events`
- Adds tests for the trigger
- Adds tests for the backfill query in the migration
Since the `usage_events` table is unreleased currently, this migration
will do nothing on real deployments and will only affect preview
deployments such as dogfood.
Closes https://github.com/coder/internal/issues/943
## Description
This PR introduces one counter and two histograms related to workspace
creation and claiming. The goal is to provide clearer observability into
how workspaces are created (regular vs prebuild) and the time cost of
those operations.
### `coderd_workspace_creation_total`
* Metric type: Counter
* Name: `coderd_workspace_creation_total`
* Labels: `organization_name`, `template_name`, `preset_name`
This counter tracks whether a regular workspace (not created from a
prebuild pool) was created using a preset or not.
Currently, we already expose `coderd_prebuilt_workspaces_claimed_total`
for claimed prebuilt workspaces, but we lack a comparable metric for
regular workspace creations. This metric fills that gap, making it
possible to compare regular creations against claims.
Implementation notes:
* Exposed as a `coderd_` metric, consistent with other workspace-related
metrics (e.g. `coderd_api_workspace_latest_build`:
https://github.com/coder/coder/blob/main/coderd/prometheusmetrics/prometheusmetrics.go#L149).
* Every `defaultRefreshRate` (1 minute ), DB query
`GetRegularWorkspaceCreateMetrics` is executed to fetch all regular
workspaces (not created from a prebuild pool).
* The counter is updated with the total from all time (not just since
metric introduction). This differs from the histograms below, which only
accumulate from their introduction forward.
### `coderd_workspace_creation_duration_seconds` &
`coderd_prebuilt_workspace_claim_duration_seconds`
* Metric types: Histogram
* Names:
* `coderd_workspace_creation_duration_seconds`
* Labels: `organization_name`, `template_name`, `preset_name`, `type`
(`regular`, `prebuild`)
* `coderd_prebuilt_workspace_claim_duration_seconds`
* Labels: `organization_name`, `template_name`, `preset_name`
We already have `coderd_provisionerd_workspace_build_timings_seconds`,
which tracks build run times for all workspace builds handled by the
provisioner daemon.
However, in the context of this issue, we are only interested in
creation and claim build times, not all transitions; additionally, this
metric does not include `preset_name`, and adding it there would
significantly increase cardinality. Therefore, separate more focused
metrics are introduced here:
* `coderd_workspace_creation_duration_seconds`: Build time to create a
workspace (either a regular workspace or the build into a prebuild pool,
for prebuild initial provisioning build).
* `coderd_prebuilt_workspace_claim_duration_seconds`: Time to claim a
prebuilt workspace from the pool.
The reason for two separate histograms is that:
* Creation (regular or prebuild): provisioning builds with similar time
magnitude, generally expected to take longer than a claim operation.
* Claim: expected to be a much faster provisioning build.
#### Native histogram usage
Provisioning times vary widely between projects. Using static buckets
risks unbalanced or poorly informative histograms.
To address this, these metrics use [Prometheus native
histograms](https://prometheus.io/docs/specs/native_histograms/):
* First introduced in Prometheus v2.40.0
* Recommended stable usage from v2.45+
* Requires Go client `prometheus/client_golang` v1.15.0+
* Experimental and must be explicitly enabled on the server
(`--enable-feature=native-histograms`)
For compatibility, we also retain a classic bucket definition (aligned
with the existing provisioner metric:
https://github.com/coder/coder/blob/main/provisionerd/provisionerd.go#L182-L189).
* If native histograms are enabled, Prometheus ingests the
high-resolution histogram.
* If not, it falls back to the predefined buckets.
Implementation notes:
* Unlike the counter, these histograms are updated in real-time at
workspace build job completion.
* They reflect data only from the point of introduction forward (no
historical backfill).
## Relates to
Closes: https://github.com/coder/coder/issues/19528
Native histograms tested in observability stack:
https://github.com/coder/observability/pull/50
closes https://github.com/coder/coder/issues/18274
This pull request makes system users visible in various group related
queries so that they can be added to and removed from groups. This
allows system user quotas to be configured. System users are still
ignored in certain queries, such as when license seat consumption is
determined.
This pull request further ensures the existence of a
"coder_prebuilt_workspaces" group in any organization that needs
prebuilt workspaces
---------
Co-authored-by: Susana Ferreira <susana@coder.com>
Closes https://github.com/coder/coder/issues/18356.
This change finds and selects a matching preset if one was not chosen
during workspace creation. This solidifies the relationship between
presets and parameters.
When a workspace is created without in explicitly chosen preset, it will
now still be eligible to claim a prebuilt workspace if one is available.
This pull request introduces support for external workspace management, allowing users to register and manage workspaces that are provisioned and managed outside of the Coder.
* Added has_external_agent field to workspace builds and template versions
Not used in coderd yet, see stack.
Adds two new packages:
- `coderd/usage`: provides an interface for the "Collector" as well as a stub implementation for AGPL
- `enterprise/coderd/usage`: provides an interface for the "Publisher" as well as a Tallyman implementation
Relates to https://github.com/coder/internal/issues/814
Instead of creating tasks with a specialized call to `CreateWorkspace`
on the frontend, we instead lift this to the backend and allow the
frontend to simply call `CreateAITask`.
This change removes the `GetLatestWorkspaceBuilds` query which includes
all workspaces for all time (including deleted). This allows us to also
stop using `GetProvisionerJobsByIDs` for said builds as the job status
is included in `GetWorkspaces` called separately.
**BREAKING CHANGE**: The `coderd_api_workspace_latest_build` Prometheus
metric no longer includes builds belonging to deleted workspaces, as
such, this metric will show fewer statuses.
Fixescoder/internal#717
closes https://github.com/coder/coder/issues/18274
This pull request makes system users visible in various group related
queries so that they can be added to and removed from groups. This
allows system user quotas to be configured. System users are still
ignored in certain queries, such as when license seat consumption is
determined.
This pull request further ensures the existence of a
"coder_prebuilt_workspaces" group in any organization that needs
prebuilt workspaces
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Organization and group member listings now include system users.
* **Bug Fixes**
* Updated tests to reflect the inclusion of system users in member and
group queries.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
Closes https://github.com/coder/internal/issues/780
## Summary of changes:
- added `user_secrets` table
- `user_secrets` table contains `env_name` and `file_path` fields which
are not used at the moment, but will be used in later PRs
- `user_secrets` table doesn't contain `value_key_id`, I will add it in
a separate migration in a dbcrypt PR
- on one hand I don't want to add fields which are not used (because
it's a risk smth may change in implementation later), on the other hand
I don't want to add too many migrations for user secrets table
- added unique sql indexes
- added sql queries for CRUD operations on user-secrets
- introduced new `ResourceUserSecret` resource
- basic unit-tests for CRUD ops and authorization behavior
- Role updates:
- owner:
- remove `ResourceUserSecret` from site-wide perms
- add `ResourceUserSecret` to user-wide perms
- orgAdmin
- remove `ResourceUserSecret` from org-wide perms; seems it's not
strictly required, because `ResourceUserSecret` is not tied to
organization in dbauthz wrappers?
- memberRole
- no need to change memberRole because it implicitly has access to
user-secrets thanks to the `allPermsExcept`
- is it enough changes to roles?
Main questions:
- [ ] We will have 2 migrations for user-secrets:
- initial migration (in current PR)
- adding `value_key_id` in dbcrypt PR
- is this approach reasonable?
- [ ] Are changes to roles's permissions are correct?
- [ ] Are changes in roles_test.go are correct?
---------
Co-authored-by: Steven Masley <Emyrk@users.noreply.github.com>
This PR sets a constraint of 1MB on the provisioner job logs written to
the database. This is consistent with the constraint we place on
workspace agent logs:
https://github.com/coder/coder/blob/4ac6be6d835dc36c242e35a26b584b784040bf28/coderd/database/dump.sql#L2030
It also adds a message printed to the front end about the provisioner
log overflow, and updates the message printed to the front end when
workspace startup logs exceed the max, as it was causing some customers
to think their startup script had failed to run.
Solves https://github.com/coder/coder/issues/15096
This is a slight rework/refactor of the earlier PRs from @dannykopping
and @Emyrk:
- https://github.com/coder/coder/pull/15669
- https://github.com/coder/coder/pull/15684
- https://github.com/coder/coder/pull/17596
Rather than having a per-app CORS behaviour setting and additionally a
template level setting for ports, this PR adds a single template level
CORS behaviour setting that is then used by all apps/ports for
workspaces created from that template.
The main changes are in `proxy.go` and `request.go` to:
a) get the CORS behaviour setting from the template
b) have `HandleSubdomain` bypass the CORS middleware handler if the
selected behaviour is `passthru`
c) in `proxyWorkspaceApp`, do not modify the response if the selected
behaviour is `passthru`
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Added support for configuring CORS behavior ("simple" or "passthru")
at the template level for all shared ports.
* Introduced a new "CORS Behavior" setting in the template creation and
settings forms.
* API endpoints and responses now include the optional `cors_behavior`
property for templates.
* Workspace apps and proxy now honor the specified CORS behavior,
enabling conditional CORS middleware application.
* Enhanced workspace app tests with comprehensive scenarios covering
CORS behaviors and authentication states.
* **Bug Fixes**
* None.
* **Documentation**
* Updated API and admin documentation to describe the new
`cors_behavior` property and its usage.
* Added examples and schema references for CORS behavior in relevant API
docs.
* **Tests**
* Extended automated tests to cover different CORS behavior scenarios
for templates and workspace apps.
* **Chores**
* Updated audit logging to track changes to the `cors_behavior` field on
templates.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Signed-off-by: Callum Styan <callumstyan@gmail.com>
- Adds a query for counting managed agent workspace builds between two
timestamps
- The "Actual" field in the feature entitlement for managed agents is
now populated with the value read from the database
- The wsbuilder package now validates AI agent usage against the limit
when a license is installed
Closescoder/internal#777
### Breaking change (changelog note):
>With new connection events appearing in the Connection Log, connection events older than 90 days will now be deleted from the Audit Log. If you require this legacy data, we recommend querying it from the REST API or making a backup of the database/these events before upgrading your Coder deployment. Please see the PR for details on what exactly will be deleted.
Of note is that there are currently no plans to delete connection events from the Connection Log.
### Context
This is the fifth PR for moving connection events out of the audit log.
In previous PRs:
- **New** connection logs have been routed to the `connection_logs` table. They will *not* appear in the audit log.
- These new connection logs are served from the new `/api/v2/connectionlog` endpoint.
In this PR:
- We'll now clean existing connection events out of the audit log, if they are older than 90 days, We do this in batches of 1000, every 10 minutes.
The criteria for deletion is simple:
```
WHERE
(
action = 'connect'
OR action = 'disconnect'
OR action = 'open'
OR action = 'close'
)
AND "time" < @before_time::timestamp with time zone
```
where `@before_time` is currently configured to 90 days in the past.
Future PRs:
- Write documentation for the endpoint / feature
This is the third PR for moving connection events out of the audit log.
This PR populates `count` on `ConnectionLogResponse` using a separate query, to preemptively mitigate the issue described in #17689. It's structurally identical to a portion of https://github.com/coder/coder/pull/18600, but for the connection log instead of the audit log.
Future PRs:
- Implement a table in the Web UI for viewing connection logs.
- Write a query to delete old events from the audit log, call it from dbpurge.
- Write documentation for the endpoint / feature
### Breaking Change (changelog note):
> User connections to workspaces, and the opening of workspace apps or ports will no longer create entries in the audit log. Those events will now be included in the 'Connection Log'.
Please see the 'Connection Log' page in the dashboard, and the Connection Log [documentation](https://coder.com/docs/admin/monitoring/connection-logs) for details. Those with permission to view the Audit Log will also be able to view the Connection Log. The new Connection Log has the same licensing restrictions as the Audit Log, and requires a Premium Coder deployment.
### Context
This is the first PR of a few for moving connection events out of the audit log, and into a new database table and web UI page called the 'Connection Log'.
This PR:
- Creates the new table
- Adds and tests queries for inserting and reading, including reading with an RBAC filter.
- Implements the corresponding RBAC changes, such that anyone who can view the audit log can read from the table
- Implements, under the enterprise package, a `ConnectionLogger` abstraction to replace the `Auditor` abstraction for these logs. (No-op'd in AGPL, like the `Auditor`)
- Routes SSH connection and Workspace App events into the new `ConnectionLogger`
- Updates all existing tests to check the values of the `ConnectionLogger` instead of the `Auditor`.
Future PRs:
- Add filtering to the query
- Add an enterprise endpoint to query the new table
- Write a query to delete old events from the audit log, call it from dbpurge.
- Implement a table in the Web UI for viewing connection logs.
> [!NOTE]
> The PRs in this stack obviously won't be (completely) atomic. Whilst they'll each pass CI, the stack is designed to be merged all at once. I'm splitting them up for the sake of those reviewing, and so changes can be reviewed as early as possible. Despite this, it's really hard to make this PR any smaller than it already is. I'll be keeping it in draft until it's actually ready to merge.
Closes#17791
This PR adds ability to cancel workspace builds that are in "pending"
status.
Breaking changes:
- CancelWorkspaceBuild method in codersdk now accepts an optional
request parameter
API:
- Added `expect_status` query parameter to the cancel workspace build
endpoint
- This parameter ensures the job hasn't changed state before canceling
- API returns `412 Precondition Failed` if the job is not in the
expected status
- Valid values: `running` or `pending`
- Wrapped the entire cancel method in a database transaction
UI:
- Added confirmation dialog to the `Cancel` button, since it's a
destructive operation


- Enabled cancel action for pending workspaces (`expect_status=pending`
is sent if workspace is in pending status)

---------
Co-authored-by: Dean Sheather <dean@deansheather.com>
# Implement OAuth2 Dynamic Client Registration (RFC 7591/7592)
This PR implements OAuth2 Dynamic Client Registration according to RFC 7591 and Client Configuration Management according to RFC 7592. These standards allow OAuth2 clients to register themselves programmatically with Coder as an authorization server.
Key changes include:
1. Added database schema extensions to support RFC 7591/7592 fields in the `oauth2_provider_apps` table
2. Implemented `/oauth2/register` endpoint for dynamic client registration (RFC 7591)
3. Added client configuration management endpoints (RFC 7592):
- GET/PUT/DELETE `/oauth2/clients/{client_id}`
- Registration access token validation middleware
4. Added comprehensive validation for OAuth2 client metadata:
- URI validation with support for custom schemes for native apps
- Grant type and response type validation
- Token endpoint authentication method validation
5. Enhanced developer documentation with:
- RFC compliance guidelines
- Testing best practices to avoid race conditions
- Systematic debugging approaches for OAuth2 implementations
The implementation follows security best practices from the RFCs, including proper token handling, secure defaults, and appropriate error responses. This enables third-party applications to integrate with Coder's OAuth2 provider capabilities programmatically.