Closes [#1227](https://github.com/coder/internal/issues/1227)
Added support for license addons, starting with AI Governance, to enable
dynamic feature grouping without requiring license reissuance.
### What changed?
- Introduced a new `Addon` type to represent groupings of features that
can be added to licenses
- Created the first addon `AddonAIGovernance` which includes AI Bridge
and Boundary features
- Added validation for addon dependencies to ensure required features
are present
- Added new features: `FeatureBoundary` and
`FeatureAIGovernanceUserLimit`
- Updated license entitlement logic to handle addons and their features
- Added helper methods to check if features belong to addons
- Updated tests to verify addon functionality
### Why make this change?
This change introduces a more flexible licensing model that allows
features to be grouped into addons that can be added to licenses without
requiring reissuance when new features are added to an addon. This is
particularly useful for specialized feature sets like AI Governance,
where related features can be bundled together and sold as a separate
SKU. The addon approach allows for better organization of features and
more granular control over entitlements.
Replace the external moby/moby/pkg/namesgenerator dependency with an
internal implementation using gofakeit/v7. The moby package has ~25k
unique name combinations, and with its retry parameter only adds a
random digit 0-9, giving ~250k possibilities. In parallel tests, this
has led to collisions (flakes).
The new internal API at coderd/util/namesgenerator eliminates the
external dependnecy and offers functions with explicit uniqueness
guarantees. This PR also consolidates fragmented name generation in a
few places to use the new package.
| Old (moby/moby) | New |
|-------------------------------------|------------------------|
| namesgenerator.GetRandomName(0) | NameWith("_") |
| namesgenerator.GetRandomName(>0) | NameDigitWith("_") |
| testutil.GetRandomName(t) | UniqueName() |
| testutil.GetRandomNameHyphenated(t) | UniqueNameWith("-") |
namesgenerator package API:
- NameWith(delim): random name, not unique
- NameDigitWith(delim): random name with 1-9 suffix, not unique
- UniqueName(): guaranteed unique via atomic counter
- UniqueNameWith(delim): unique with custom delimiter
Names continue to be docker style `[adjective][delim][surname]`. Unique
names are truncated to 32 characters (preserving the numeric suffix) to
fit common name length limits in Coder.
Related test flakes:
https://github.com/coder/internal/issues/1212https://github.com/coder/internal/issues/118https://github.com/coder/internal/issues/1068
Fixes all our Go file imports to match the preferred spec that we've _mostly_ been using. For example:
```
import (
"context"
"time"
"github.com/prometheus/client_golang/prometheus"
"golang.org/x/xerrors"
"gopkg.in/natefinch/lumberjack.v2"
"cdr.dev/slog/v3"
"github.com/coder/coder/v2/codersdk/agentsdk"
"github.com/coder/serpent"
)
```
3 groups: standard library, 3rd partly libs, Coder libs.
This PR makes the change across the codebase. The PR in the stack above modifies our formatting to maintain this state of affairs, and is a separate PR so it's possible to review that one in detail.
Upgrades to slog v3 which includes a small, but backward incompatible API change to the acceptible call arguments when logging. This change allows us to verify via compile time type checking that arguments are correct and won't cause a panic, as was possible in slog v1, which this replaces (v2 was tagged but never used in coder/coder).
It also updates dependencies that also use slog and were updated.
I've left the `aibridge` dependency as a commit SHA, under the assumption that the team there (cc @pawbana @dannykopping ) will tag and update the dependency soon and on their own schedule.
Other dependencies, I pushed new tags.
Experiments passed to provisioners to determine behavior. This adds
`--experiments` flag to provisioner daemons. Prior to this, provisioners
had no method to turn on/off experiments.
## Description
When creating a prebuilt workspace, both `flags.IsPrebuild` and
`flags.IsFirstBuild` are true. Previously, the logic rejected cases with
multiple flags, so `coderd_workspace_creation_duration_seconds` wasn’t
updated for prebuilt creations. This is the only valid scenario where
two flags can be true.
## Changes
* Fix logic to update `coderd_workspace_creation_duration_seconds`
metric for prebuilt workspaces.
* Add prebuild helper functions to coderdenttest (other prebuild tests
can reuse this).
* Update workspace's provisionerdmetric tests to include this metric.
Follow-up: https://github.com/coder/coder/pull/19503
Related to: https://github.com/coder/coder/issues/19528
Not used in coderd yet, see stack.
Adds two new packages:
- `coderd/usage`: provides an interface for the "Collector" as well as a stub implementation for AGPL
- `enterprise/coderd/usage`: provides an interface for the "Publisher" as well as a Tallyman implementation
Relates to https://github.com/coder/internal/issues/814
Note that enforcement and checking usage will come in a future PR.
This feature is implemented differently than existing features in a few
ways.
It's highly recommended that reviewers read:
- This document which outlines the methods we could've used for license
enforcement:
https://www.notion.so/coderhq/AI-Agent-License-Enforcement-21ed579be59280c088b9c1dc5e364ee8
- Phase 0 of the actual RFC document:
https://www.notion.so/coderhq/Usage-based-Billing-AI-b-210d579be592800eb257de7eecd2d26d
### Multiple features in the license, a single feature in codersdk
Firstly, the feature is represented as a single feature in the codersdk
world, but is represented with multiple features in the license.
E.g. in the license you may have:
{
"features": {
"managed_agent_limit_soft": 100,
"managed_agent_limit_hard": 200
}
}
But the entitlements endpoint will return a single feature:
{
"features": {
"managed_agent_limit": {
"limit": 200,
"soft_limit": 100
}
}
}
This is required because of our rigid parsing that uses a
`map[string]int64` for features in the license. To avoid requiring all
customers to upgrade to use new licenses, the decision was made to just
use two features and merge them into one. Older Coder deployments will
parse this feature (from new licenses) as two separate features, but
it's not a problem because they don't get used anywhere obviously.
The reason we want to differentiate between a "soft" and "hard" limit is
so we can show admins how much of the usage is "included" vs. how much
they can use before they get hard cut-off.
### Usage period features will be compared and trump based on license
issuance time
The second major difference to other features is that "usage period"
features such as `managed_agent_limit` will now be primarily compared by
the `iat` (issued at) claim of the license they come from. This differs
from previous features. The reason this was done was so we could reduce
limits with newer licenses, which the current comparison code does not
allow for.
This effectively means if you have two active licenses:
- `iat`: 2025-07-14, `managed_agent_limit_soft`: 100,
`managed_agent_limit_hard`: 200
- `iat`: 2025-07-15, `managed_agent_limit_soft`: 50,
`managed_agent_limit_hard`: 100
Then the resulting `managed_agent_limit` entitlement will come from the
second license, even though the values are smaller than another valid
license. The existing comparison code would prefer the first license
even though it was issued earlier.
### Usage period features will count usage between the start and end
dates of the license
Existing limit features, like the user limit, just measure the current
usage value of the feature. The active user count is a gauge that goes
up and down, whereas agent usage can only be incremented, so it doesn't
make sense to use a continually incrementing counter forever and ever
for managed agents.
For managed agent limit, we count the usage between `nbf` (not before)
and `exp` (expires at) of the license that the entitlement comes from.
In the example above, we'd use the issued at date and expiry of the
second license as this date range.
This essentially means, when you get a new license, the usage resets to
zero.
The actual usage counting code will be implemented in a follow-up PR.
### Managed agent limit has a default entitlement value
Temporarily (until further notice), we will be providing licenses with
`feature_set` set to `premium` a default limit.
- Soft limit: `800 * user_limit`
- Hard limit: `1000 * user_limit`
"Enterprise" licenses do not get any default limit and are not entitled
to use the feature.
Unlicensed customers (e.g. OSS) will be permitted to use the feature as
much as they want without limits. This will be implemented when the
counting code is implemented in a follow-up PR.
Closes https://github.com/coder/internal/issues/760
This is the second PR for moving connection events out of the audit log.
This PR:
- Adds the `/api/v2/connectionlog` endpoint
- Adds filtering for `GetAuthorizedConnectionLogsOffset` and thus the endpoint.
There's quite a few, but I was aiming for feature parity with the audit log.
1. `organization:<id|name>`
2. `workspace_owner:<username>`
3. `workspace_owner_email:<email>`
4. `type:<ssh|vscode|jetbrains|reconnecting_pty|workspace_app|port_forwarding>`
5. `username:<username>`
- Only includes web-based connection events (workspace apps, web port forwarding) as only those include user metadata.
6. `user_email:<email>`
7. `connected_after:<time>`
8. `connected_before:<time>`
9. `workspace_id:<id>`
10. `connection_id:<id>`
- If you have one snapshot of the connection log, and some sessions are ongoing in that snapshot, you could use this filter to check if they've been closed since.
11. `status:<connected|disconnected>`
- If `connected` only sessions with a null `close_time` are returned, if `disconnected`, only those with a non-null `close_time`. If filter is omitted, both are returned.
Future PRs:
- Populate `count` on `ConnectionLogResponse` using a seperate query (to preemptively mitigate the issue described in #17689)
- Implement a table in the Web UI for viewing connection logs.
- Write a query to delete old events from the audit log, call it from dbpurge.
- Write documentation for the endpoint / feature (including these filters)
`ServeProvisionerDaemonRequest` has had an ID field for quite a while
now.
This field is only used for telemetry purposes; the actual daemon ID is
created upon insertion in the database. There's no reason to set it, and
it's confusing to do so. Deprecating the field and removing references
to it.
* chore(docs): update docs re workspace tag default values
* chore(coderdenttest): use random name instead of t.Name() in newExternalProvisionerDaemon
* fix(provisioner/terraform/tfparse): allow empty values in coder_workspace_tag defaults
Relates to https://github.com/coder/coder/issues/15894:
- Adds `coderdenttest.NewExternalProvisionerDaemonTerraform`
- Adds integration-style test coverage for creating a workspace with
`coder_workspace_tags` specified in `main.tf`
- Modifies `coderd/wsbuilder` to fetch template version variables and
includes them in eval context for evaluating `coder_workspace_tags`
Refactors our use of `slogtest` to instantiate a "standard logger" across most of our tests. This standard logger incorporates https://github.com/coder/slog/pull/217 to also ignore database query canceled errors by default, which are a source of low-severity flakes.
Any test that has set non-default `slogtest.Options` is left alone. In particular, `coderdtest` defaults to ignoring all errors. We might consider revisiting that decision now that we have better tools to target the really common flaky Error logs on shutdown.
* chore: move multi-org endpoints into enterprise directory
All multi-organization features are gated behind "premium" licenses. Enterprise licenses can no longer
access organization CRUD.
- Reworks the proxy registration loop into a struct (so I can add a `RegisterNow` method)
- Changes the proxy registration loop interval to 15s (previously 30s)
- Adds test which tests bidirectional DERP meshing on all possible paths between 6 workspace proxy replicas
Related to https://github.com/coder/customers/issues/438
- Adds a new query BatchUpdateLastUsedAt
- Adds calls to BatchUpdateLastUsedAt in app stats handler upon flush
- Passes a stats flush channel to apptest setup scaffolding and updates unit tests to assert modifications to LastUsedAt.
See also: https://github.com/coder/coder/pull/9522
- Adds commands `server dbcrypt {rotate,decrypt,delete}` to re-encrypt, decrypt, or delete encrypted data, respectively.
- Plumbs through dbcrypt in enterprise/coderd (including unit tests).
- Adds documentation in admin/encryption.md.
This enables dbcrypt by default, but the feature is soft-enforced on supplying external token encryption keys. Without specifying any keys, encryption/decryption is a no-op.
* chore: add /v2 to import module path
go mod requires semantic versioning with versions greater than 1.x
This was a mechanical update by running:
```
go install github.com/marwan-at-work/mod/cmd/mod@latest
mod upgrade
```
Migrate generated files to import /v2
* Fix gen
* chore: Implement workspace proxy going away
When a workspace proxy shuts down, the health status of that
proxy should immediately be updated. This is purely a courtesy
and technically not required
* feat: dbauthz always on, out of experimental
* Add ability to do rbac checks in unit tests
* Remove AuthorizeAllEndpoints
* Remove duplicate rbac checks