PProf labels segment the code into groups for determining the source of
CPU/memory profiles. Since the web server and background jobs share a
lot of the same code (e.g. wsbuilder), it helps to know whether the load
is user-induced or comes from background jobs.
- Adds/improves a lot of comments to make the autostop calculation code
clearer
- Changes the behavior of the enterprise template schedule store to
match the behavior of the workspace TTL endpoint when the new TTL is
zero
- Fixes a bug in the workspace TTL endpoint where it could unset the
build deadline, even though a max_deadline was specified
- Adds a new constraint to the workspace_builds table that enforces the
deadline is non-zero and below the max_deadline if it is set
- Adds CHECK constraint enum generation to scripts/dbgen, used for
testing the above constraint
- Adds Dean and Danielle as CODEOWNERS for the autostop calculation code
- Adds a query for counting managed agent workspace builds between two
timestamps
- The "Actual" field in the feature entitlement for managed agents is
now populated with the value read from the database
- The wsbuilder package now validates AI agent usage against the limit
when a license is installed
Closes coder/internal#777
Note that enforcement and checking usage will come in a future PR.
This feature is implemented differently than existing features in a few
ways.
It's highly recommended that reviewers read:
- This document which outlines the methods we could've used for license
enforcement:
https://www.notion.so/coderhq/AI-Agent-License-Enforcement-21ed579be59280c088b9c1dc5e364ee8
- Phase 0 of the actual RFC document:
https://www.notion.so/coderhq/Usage-based-Billing-AI-b-210d579be592800eb257de7eecd2d26d
### Multiple features in the license, a single feature in codersdk
Firstly, the feature is represented as a single feature in the codersdk
world, but is represented with multiple features in the license.
E.g. in the license you may have:
{
"features": {
"managed_agent_limit_soft": 100,
"managed_agent_limit_hard": 200
}
}
But the entitlements endpoint will return a single feature:
{
"features": {
"managed_agent_limit": {
"limit": 200,
"soft_limit": 100
}
}
}
This is required because of our rigid parsing that uses a
`map[string]int64` for features in the license. To avoid requiring all
customers to upgrade to use new licenses, the decision was made to just
use two features and merge them into one. Older Coder deployments will
parse this feature (from new licenses) as two separate features, which
is harmless because those deployments never read either of them.
The reason we want to differentiate between a "soft" and "hard" limit is
so we can show admins how much of the usage is "included" vs. how much
they can use before they get hard cut-off.
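The merge described above can be sketched roughly as follows. The `Feature` type and `mergeManagedAgentLimit` helper are hypothetical stand-ins rather than the real codersdk or license types, and requiring both keys to be present is an assumption of this sketch.

```go
package main

import "fmt"

// Feature is a hypothetical stand-in for the single SDK-side entitlement.
type Feature struct {
	Limit     int64 `json:"limit"`
	SoftLimit int64 `json:"soft_limit"`
}

// mergeManagedAgentLimit folds the two license features into one entitlement.
// It reports false unless both keys are present (an assumption of this sketch).
func mergeManagedAgentLimit(features map[string]int64) (Feature, bool) {
	soft, okSoft := features["managed_agent_limit_soft"]
	hard, okHard := features["managed_agent_limit_hard"]
	if !okSoft || !okHard {
		return Feature{}, false
	}
	return Feature{Limit: hard, SoftLimit: soft}, true
}

func main() {
	f, ok := mergeManagedAgentLimit(map[string]int64{
		"managed_agent_limit_soft": 100,
		"managed_agent_limit_hard": 200,
	})
	fmt.Println(f.Limit, f.SoftLimit, ok)
}
```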
### Usage period features will be compared and trump based on license
issuance time
The second major difference to other features is that "usage period"
features such as `managed_agent_limit` will now be primarily compared by
the `iat` (issued at) claim of the license they come from. This differs
from previous features. The reason this was done was so we could reduce
limits with newer licenses, which the current comparison code does not
allow for.
This effectively means if you have two active licenses:
- `iat`: 2025-07-14, `managed_agent_limit_soft`: 100,
`managed_agent_limit_hard`: 200
- `iat`: 2025-07-15, `managed_agent_limit_soft`: 50,
`managed_agent_limit_hard`: 100
Then the resulting `managed_agent_limit` entitlement will come from the
second license, even though the values are smaller than another valid
license. The existing comparison code would prefer the first license
even though it was issued earlier.
### Usage period features will count usage between the start and end
dates of the license
Existing limit features, like the user limit, just measure the current
usage value of the feature. The active user count is a gauge that goes
up and down, whereas agent usage can only be incremented, so it doesn't
make sense to use a continually incrementing counter forever and ever
for managed agents.
For managed agent limit, we count the usage between `nbf` (not before)
and `exp` (expires at) of the license that the entitlement comes from.
In the example above, we'd use the issued at date and expiry of the
second license as this date range.
This essentially means, when you get a new license, the usage resets to
zero.
The actual usage counting code will be implemented in a follow-up PR.
### Managed agent limit has a default entitlement value
Temporarily (until further notice), licenses with `feature_set` set to
`premium` will receive a default limit:
- Soft limit: `800 * user_limit`
- Hard limit: `1000 * user_limit`
"Enterprise" licenses do not get any default limit and are not entitled
to use the feature.
Unlicensed customers (e.g. OSS) will be permitted to use the feature as
much as they want without limits. This will be implemented when the
counting code is implemented in a follow-up PR.
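The default-limit arithmetic above amounts to the following sketch. The function name and the plain-string `featureSet` parameter are assumptions, and the unlicensed (unlimited) case is deliberately left out.

```go
package main

import "fmt"

// defaultManagedAgentLimits returns the default soft and hard limits for a
// license. Only the "premium" feature set gets defaults; every other feature
// set reports ok=false. Unlicensed deployments are not modelled here.
func defaultManagedAgentLimits(featureSet string, userLimit int64) (soft, hard int64, ok bool) {
	if featureSet != "premium" {
		return 0, 0, false
	}
	return 800 * userLimit, 1000 * userLimit, true
}

func main() {
	soft, hard, ok := defaultManagedAgentLimits("premium", 10)
	fmt.Println(soft, hard, ok)
}
```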
Closes https://github.com/coder/internal/issues/760
This is the third PR for moving connection events out of the audit log.
This PR populates `count` on `ConnectionLogResponse` using a separate query, to preemptively mitigate the issue described in #17689. It's structurally identical to a portion of https://github.com/coder/coder/pull/18600, but for the connection log instead of the audit log.
Future PRs:
- Implement a table in the Web UI for viewing connection logs.
- Write a query to delete old events from the audit log, call it from dbpurge.
- Write documentation for the endpoint / feature
This is the second PR for moving connection events out of the audit log.
This PR:
- Adds the `/api/v2/connectionlog` endpoint
- Adds filtering for `GetAuthorizedConnectionLogsOffset` and thus the endpoint.
There are quite a few, but I was aiming for feature parity with the audit log.
1. `organization:<id|name>`
2. `workspace_owner:<username>`
3. `workspace_owner_email:<email>`
4. `type:<ssh|vscode|jetbrains|reconnecting_pty|workspace_app|port_forwarding>`
5. `username:<username>`
- Only includes web-based connection events (workspace apps, web port forwarding) as only those include user metadata.
6. `user_email:<email>`
7. `connected_after:<time>`
8. `connected_before:<time>`
9. `workspace_id:<id>`
10. `connection_id:<id>`
- If you have one snapshot of the connection log, and some sessions are ongoing in that snapshot, you could use this filter to check if they've been closed since.
11. `status:<connected|disconnected>`
- If `connected`, only sessions with a null `close_time` are returned; if `disconnected`, only those with a non-null `close_time`. If the filter is omitted, both are returned.
Future PRs:
- Populate `count` on `ConnectionLogResponse` using a separate query (to preemptively mitigate the issue described in #17689)
- Implement a table in the Web UI for viewing connection logs.
- Write a query to delete old events from the audit log, call it from dbpurge.
- Write documentation for the endpoint / feature (including these filters)
### Breaking Change (changelog note):
> User connections to workspaces, and the opening of workspace apps or ports will no longer create entries in the audit log. Those events will now be included in the 'Connection Log'.
Please see the 'Connection Log' page in the dashboard, and the Connection Log [documentation](https://coder.com/docs/admin/monitoring/connection-logs) for details. Those with permission to view the Audit Log will also be able to view the Connection Log. The new Connection Log has the same licensing restrictions as the Audit Log, and requires a Premium Coder deployment.
### Context
This is the first PR of a few for moving connection events out of the audit log, and into a new database table and web UI page called the 'Connection Log'.
This PR:
- Creates the new table
- Adds and tests queries for inserting and reading, including reading with an RBAC filter.
- Implements the corresponding RBAC changes, such that anyone who can view the audit log can read from the table
- Implements, under the enterprise package, a `ConnectionLogger` abstraction to replace the `Auditor` abstraction for these logs. (No-op'd in AGPL, like the `Auditor`)
- Routes SSH connection and Workspace App events into the new `ConnectionLogger`
- Updates all existing tests to check the values of the `ConnectionLogger` instead of the `Auditor`.
Future PRs:
- Add filtering to the query
- Add an enterprise endpoint to query the new table
- Write a query to delete old events from the audit log, call it from dbpurge.
- Implement a table in the Web UI for viewing connection logs.
> [!NOTE]
> The PRs in this stack obviously won't be (completely) atomic. Whilst they'll each pass CI, the stack is designed to be merged all at once. I'm splitting them up for the sake of those reviewing, and so changes can be reviewed as early as possible. Despite this, it's really hard to make this PR any smaller than it already is. I'll be keeping it in draft until it's actually ready to merge.
## Description
This PR updates the lifecycle executor to explicitly exclude prebuilt
workspaces from being considered for lifecycle operations such as
`autostart`, `autostop`, `dormancy`, `default TTL` and `failure TTL`.
Prebuilt workspaces (i.e., those owned by the prebuild system user) are
handled separately by the prebuild reconciliation loop. Including them
in the lifecycle executor could lead to unintended behavior such as
incorrect scheduling or state transitions.
## Changes
* Updated the lifecycle executor query
`GetWorkspacesEligibleForTransition` to exclude workspaces with
`owner_id = 'c42fdf75-3097-471c-8c33-fb52454d81c0'` (prebuilds).
* Added tests to verify prebuilt workspaces are not considered in:
* Autostop
* Autostart
* Default TTL
* Dormancy
* Failure TTL
Fixes: https://github.com/coder/coder/issues/18740
Related to: https://github.com/coder/coder/issues/18658
- Add `format:"uri"` to `Group.AvatarURL` (matches `User.AvatarURL`
field)
- `<user_id>` and `<group_id>` were backwards in the `example:` tags
- The `@Success` annotation for `/acl [get]` had an incorrect type
This PR provides two commands:
* `coder prebuilds pause`
* `coder prebuilds resume`
These allow the suspension of all prebuilds activity, intended for use
if prebuilds are misbehaving.
Relates to https://github.com/coder/internal/issues/674
Currently, we send notifications to **all template admins** for **every
failed and hard-limited preset**. This can generate excessive
noise—especially when someone is debugging a template and creates
multiple broken versions in quick succession.
For now, we've decided to remove hard-limited preset notifications to
reduce excessive noise.
In the long term, we plan to aggregate failure information and deliver
it on a daily or weekly basis.
While the feature was experimental, this was used as an escape hatch. It
is removed to be consistent with the template author's intentions.
Backwards compatible, removing an experimental api field that is no longer used.
# What does this do?
This does parameter validation for dynamic parameters in `wsbuilder`. All input parameters are validated in `coder/coder` before being sent to terraform.
The heart of this PR is [`ResolveParameters`](https://github.com/coder/coder/blob/b65001e89c0577199a8e470c138c51e91cf2350c/coderd/dynamicparameters/resolver.go#L30-L30).
# What else changes?
`wsbuilder` now needs to load the terraform files into memory to succeed. This does add a larger memory requirement to workspace builds.
# Future work
- Sort the workspaces handled by autostart by template version ID, so that workspaces with the same template version load the Terraform files from the database only once and then reuse them from the cache.
Currently, the prebuilds documentation states:
```
### Managing resource quotas
Prebuilt workspaces can be used in conjunction with [resource quotas](../../users/quotas.md).
Because unclaimed prebuilt workspaces are owned by the `prebuilds` user, you can:
1. Configure quotas for any group that includes this user.
1. Set appropriate limits to balance prebuilt workspace availability with resource constraints.
If a quota is exceeded, the prebuilt workspace will fail provisioning the same way other workspaces do.
```
If you need to have a separate quota for prebuilds as opposed to regular
users, you are required to create a separate group, as quotas are
applied to groups.
Currently it is not possible to create a separate 'prebuilds' group with
only the prebuilds user to add a quota. This PR skips the org membership
check specifically for the prebuilds user when patching a group.

Fixes https://github.com/coder/coder/issues/17840
NOTE: calling this out as a breaking change so that it is highly visible
in the changelog.
* CLI: Modifies `coder update` to stop the workspace if already running.
* UI: Modifies "update" button to always stop the workspace if already
running.
This PR extracts dynamic parameter rendering logic from
coderd/parameters.go into a new coderd/dynamicparameters package. This is
partly for organization and maintainability, but primarily so that
`wsbuilder` can reuse the logic for validation.
## Description
This PR adds support for deleting prebuilt workspaces via the
authorization layer. It introduces special-case handling to ensure that
`prebuilt_workspace` permissions are evaluated when attempting to delete
a prebuilt workspace, falling back to the standard `workspace` resource
as needed.
Prebuilt workspaces are a subset of workspaces, identified by having
`owner_id` set to `PREBUILD_SYSTEM_USER`.
This means:
* A user with `prebuilt_workspace.delete` permission is allowed to
**delete only prebuilt workspaces**.
* A user with `workspace.delete` permission can **delete both normal and
prebuilt workspaces**.
⚠️ This implementation is scoped to **deletion operations only**. No
other operations are currently supported for the `prebuilt_workspace`
resource.
To delete a workspace, users must have the following permissions:
* `workspace.read`: to read the current workspace state
* `update`: to modify workspace metadata and related resources during
deletion (e.g., updating the `deleted` field in the database)
* `delete`: to perform the actual deletion of the workspace
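The delete fallback described above can be sketched as follows. This is a toy permission model, not the real RBAC API: `permission`, `subject`, and `canDeleteWorkspace` are illustrative names, and only the `delete` action is modelled.

```go
package main

import "fmt"

// permission pairs a resource type with an action.
type permission struct {
	resource string
	action   string
}

// subject is a toy set of granted permissions.
type subject map[permission]bool

// canDeleteWorkspace checks `prebuilt_workspace.delete` first for prebuilt
// workspaces and falls back to the standard `workspace.delete` permission.
func canDeleteWorkspace(s subject, isPrebuilt bool) bool {
	if isPrebuilt && s[permission{"prebuilt_workspace", "delete"}] {
		return true
	}
	return s[permission{"workspace", "delete"}]
}

func main() {
	prebuildsOnly := subject{{"prebuilt_workspace", "delete"}: true}
	fmt.Println(canDeleteWorkspace(prebuildsOnly, true))  // prebuilt workspace
	fmt.Println(canDeleteWorkspace(prebuildsOnly, false)) // normal workspace
}
```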
## Changes
* Introduced `authorizeWorkspace()` helper to handle prebuilt workspace
authorization logic.
* Ensured both `prebuilt_workspace` and `workspace` permissions are
checked.
* Added comments to clarify the current behavior and limitations.
* Moved `SystemUserID` constant from the `prebuilds` package to the
`database` package `PrebuildsSystemUserID` to resolve an import cycle
(commit
https://github.com/coder/coder/pull/18333/commits/f24e4ab4b6f0a56726fd04be2d7302c9fdb52d53).
* Update middleware `ExtractOrganizationMember` to include system user
members.
Closes https://github.com/coder/internal/issues/312
Depends on https://github.com/coder/terraform-provider-coder/pull/408
This PR adds support for defining an **autoscaling block** for
prebuilds, allowing the number of desired instances to scale dynamically
based on a schedule.
Example usage:
```
data "coder_workspace_preset" "us-nix" {
...
prebuilds = {
instances = 0 # default to 0 instances
scheduling = {
timezone = "UTC" # a single timezone is used for simplicity
# Scale to 3 instances during the work week
schedule {
cron = "* 8-18 * * 1-5" # from 8AM–6:59PM, Mon–Fri, UTC
instances = 3 # scale to 3 instances
}
# Scale to 1 instance on Saturdays for urgent support queries
schedule {
cron = "* 8-14 * * 6" # from 8AM–2:59PM, Sat, UTC
instances = 1 # scale to 1 instance
}
}
}
}
```
### Behavior
- Multiple `schedule` blocks per `prebuilds` block are supported.
- If the current time matches any defined autoscaling schedule, the
corresponding number of instances is used.
- If no schedule matches, the **default instance count**
(`prebuilds.instances`) is used as a fallback.
### Why
This feature allows prebuild instance capacity to adapt to predictable
usage patterns, such as:
- Scaling up during business hours or high-demand periods
- Reducing capacity during off-hours to save resources
### Cron specification
The cron specification is interpreted as a **continuous time range.**
For example, the expression:
```
* 9-18 * * 1-5
```
is intended to represent a continuous range from **09:00 to 18:59**,
Monday through Friday.
However, due to minor implementation imprecision, it is currently
interpreted as a range from **08:59:00 to 18:58:59**, Monday through
Friday.
This slight discrepancy arises because the evaluation is based on
whether a specific **point in time** falls within the range, using the
`github.com/coder/coder/v2/coderd/schedule/cron` library, which performs
per-minute matching rather than strict range evaluation.
---------
Co-authored-by: Danny Kopping <danny@coder.com>
Adds a custom marshaler to handle some cases where nils were being
marshaled to nulls, causing the web UI to throw an error.
---------
Co-authored-by: Steven Masley <stevenmasley@gmail.com>
I modified the proxy host cache we already had and were using for
websocket csp headers to also include the wildcard app host, then used
those for frame-src policies.
I did not add frame-ancestors, since if I understand correctly, those
would go on the app, and this middleware does not come into play there.
Maybe we will want to add it on workspace apps like we do with cors, if
we find apps are setting it to `none` or something.
Closes https://github.com/coder/internal/issues/684
## Description
Adds tests for `ReconcileAll` to verify the full reconciliation flow
when handling expired prebuilds. This complements existing lower-level
tests by checking multiple reconciliation actions (delete + create) at
the higher reconciliation cycle level.
Related to comment:
https://github.com/coder/coder/pull/17996#issuecomment-2910516489
The file cache was caching `Unauthorized` errors if a user without the
right permissions opened the file first, so all future opens would fail.
Now the cache always opens the file with a subject that can read files,
and authz is checked per user on `Acquire`.
```
// Report a metric only if the preset uses the latest version of the template and the template is not deleted.
// This avoids conflicts between metrics from old and new template versions.
//
// NOTE: Multiple versions of a preset can exist with the same orgName, templateName, and presetName,
// because templates can have multiple versions — or deleted templates can share the same name.
//
// The safest approach is to report the metric only for the latest version of the preset.
// When a new template version is released, the metric for the new preset should overwrite
// the old value in Prometheus.
//
// However, there’s one edge case: if an admin creates a template, it becomes hard-limited,
// then deletes the template and never creates another with the same name,
// the old preset will continue to be reported as hard-limited —
// even though it’s deleted. This will persist until `coderd` is restarted.
```