Commit Graph

1188 Commits

Author SHA1 Message Date
Jon Ayers a3792153de feat: add Prometheus collector for DERP server expvar metrics (#22583) (#22917)
backports the derp prometheus metrics
2026-03-10 12:29:15 -05:00
Rowan Smith 107fd97a61 fix: avoid derp-related panic during wsproxy registration (backport release/2.31) (#22526)
Backport of #22322.

- Cherry-picked 7f03bd7.

Co-authored-by: Dean Sheather <dean@deansheather.com>
2026-03-03 13:46:42 +05:00
Jon Ayers 4f34452bcc fix: use separate http.Transports for wsproxy tests (#22292)
- Previously all tests were sharing the global http.Transport meaning on
`Close` it would close connections presumed to be idle for other tests.
fixes https://github.com/coder/internal/issues/112
2026-02-24 23:56:58 -06:00
Garrett Delfosse 6c16794173 fix(cli): proactively use active template version when require_active_version is set (#22033)
Fixes #22030

## Problem

When a template has `require_active_version = true` and a workspace is
outdated, the web UI always shows "Update and start" as the **only**
button (for all users including admins), but `coder start` starts with
the old version. For admins, this silently succeeds on the stale
version. For non-admins, it goes through a clunky 403→retry path. This
also affects the VS Code extension, which calls `coder start --yes`
under the hood.

## Root Cause

`buildWorkspaceStartRequest()` in `cli/start.go` checks
`workspace.AutomaticUpdates == "always"` but ignores
`workspace.TemplateRequireActiveVersion`. The server-side autostart
already ORs both settings together:

```go
// coderd/autobuild/lifecycle_executor.go
func useActiveVersion(opts, ws) bool {
    return opts.RequireActiveVersion || ws.AutomaticUpdates == "always"
}
```

The CLI was missing the `RequireActiveVersion` check.

## Fix

Add `workspace.TemplateRequireActiveVersion` to the existing OR
condition:

```go
// Before:
if workspace.AutomaticUpdates == codersdk.AutomaticUpdatesAlways || action == WorkspaceUpdate {

// After:
if workspace.AutomaticUpdates == codersdk.AutomaticUpdatesAlways || workspace.TemplateRequireActiveVersion || action == WorkspaceUpdate {
```

Now `coder start` and `coder restart` proactively use the active
template version when `require_active_version` is set, matching the web
UI and server autostart behavior. The 403→retry fallback remains as a
safety net but is no longer the primary path for any user.

## Testing

Updated `enterprise/cli/start_test.go` — all user types (owner, template
admin, ACL admin, group ACL admin, member) now expect the active version
when `require_active_version` is set, and verify the 403→retry message
does NOT appear.
2026-02-24 19:51:48 -05:00
Zach 9613e41d21 chore: update boundary version (#22289)
Updating to the latest tag before the 2.31 code freeze.
2026-02-24 13:33:37 -05:00
Jon Ayers 0a7a3da178 fix: exclude provisioner_state from workspace_build_with_user view (#22159)
The provisioner state for a workspace build was being loaded for every
long-lived agent rpc connection. Since this state can be anywhere from
kilobytes to megabytes this can gradually cause the `coderd` memory
footprint to grow over time. It's also a lot of unnecessary allocations
for every query that fetches a workspace build since only a few callers
ever actually reference the provisioner state.

This PR removes it from the returned workspace build and adds a query to
fetch the provisioner state explicitly.
2026-02-23 22:46:17 -06:00
Sushant P 37a8e61ea2 chore: move Shared Workspaces from experiments to beta (#22206)
* Removed the shared-workspaces experiment and cleaned up related
middleware
* Added beta tagging to the UI for shared workspaces
2026-02-23 08:30:32 -08:00
Steven Masley b0f35316da chore!: automatically use secure cookies if using https access-url (#22198)
`--secure-auth-cookie` now automatically sources it's default value from `--access-url`

If the access url uses HTTPS, secure is set to `true`. 
To revert to old behavior, set the value explicitly to `false`
2026-02-20 10:33:37 -06:00
Steven Masley e5f64eb21d chore: optionally prefix authentication related cookies (#22148)
When the deployment option is enabled auth cookies are prefixed with
`__HOST-`
([info](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Set-Cookie)).

This is all done in a middleware that intercepts all requests and strips
the prefix on incoming request cookies.
2026-02-20 09:01:00 -06:00
Jake Howell d700f9ebc4 fix: restore block to Managed Agents on Enterprise (#22210)
#21998 accidentally allowed `Managed Agents` usages whilst being on an
`Enterprise` license. This was incorrect, it should work as the
following (same as prior to #21998).

| Scenario | Before your PRs | After your PRs (bug) | After this fix |
|---|---|---|---|
| Unlicensed (AGPL) | Permitted | Permitted | Permitted |
| Licensed, no entitlement | **Blocked** | Permitted | **Blocked** |
| Licensed, explicitly disabled (limit=0) | **Blocked** | Permitted |
**Blocked** |
| Licensed, entitled, under limit | Permitted | Permitted | Permitted |
| Licensed, entitled, over limit | Blocked | Permitted (advisory) |
Permitted (advisory) |
| Any license, stop/delete | Permitted | Permitted | Permitted |
| Any license, non-AI build | Permitted | Permitted | Permitted |
2026-02-20 20:15:32 +11:00
Jake Howell 051ed34580 feat: convert soft_limit to limit (#22048)
In relation to
[`internal#1281`](https://github.com/coder/internal/issues/1281)

Remove the `soft_limit` field from the `Feature` type and simplify
license limit handling. This change:

- Removes the `soft_limit` field from the API and SDK
- Uses the soft limit value as the single `limit` value in the UI and
API
- Simplifies warning logic to only show warnings when the limit is
exceeded
- Updates tests to reflect the new behavior
- Updates the UI to use the single limit value for display
2026-02-20 16:09:12 +11:00
Jake Howell 203899718f feat: remove agent workspaces limit (#21998)
In relation to
[`internal#1281`](https://github.com/coder/internal/issues/1281)

Managed agent workspace build limits are now advisory only. Breaching
the limit no longer blocks workspace creation — it only surfaces a
warning.

- Removed hard-limit enforcement in `checkAIBuildUsage` so AI task
builds are always permitted regardless of managed agent count.
- Updated the license warning to remove "Further managed agent builds
will be blocked." verbiage.
- Updated tests to assert builds succeed beyond the limit instead of
failing.
- Removed the "Limit" display from the `ManagedAgentsConsumption`
progress bar — the bar is now relative to the included allowance (soft
limit) only, and turns orange when usage exceeds it.

Bonus:

- De-MUI'd `LicenseBannerView` — replaced Emotion CSS and MUI `Link`
with Tailwind classes.
- Added `highlight-orange` color token to the Tailwind theme.
2026-02-20 12:56:00 +11:00
Danielle Maywood 92a6d6c2c0 chore: remove unnecessary loop variable captures (#22180)
Since Go 1.22, the loop variable capture issue is resolved. Variables
declared by for loops are now per-iteration rather than per-loop, making
the 'v := v' pattern unnecessary.
2026-02-19 09:02:19 +00:00
Paweł Banaszewski 90c11f3386 feat: add client column to aibridge_interceptions table (#21839)
Adds `client` column to `aibridge_interceptions` table. It is set accordingly to what is passed from AI Bridge in `RecordInterception`.
Adds interception filtering by `client` value.

Depends on: https://github.com/coder/aibridge/pull/158
Updates aibridge library to include this change.

Fixes: https://github.com/coder/aibridge/issues/31
2026-02-17 15:43:02 +01:00
Cian Johnston 4a3304fc38 feat(cli)!: expire tokens by default (#21783)
## Summary

> NOTE: Calling this out as a breaking change in case existing consumers
of the CLI depend on being able to see expired tokens OR being able to
delete tokens immediately.

Updates the `coder tokens rm` command to immediately expire a token by
ID, preserving the token record for audit trail purposes. Tokens can
still be deleted by passing `--delete`.

## Problem

During an incident on dev.coder.com, operators needed to urgently expire
an API key that was stuck in a hot loop. The only way to do this was via
direct database access:

```sql
UPDATE api_keys SET expires_at = NOW() WHERE id = '...';
```

This is not ideal for operators who may not have direct DB access or
want to avoid manual SQL.

## Solution

This PR adds:

- **API endpoint**: `PUT /api/v2/users/{user}/keys/{keyid}/expire` -
Sets the token's `expires_at` to now
- **SDK method**: `ExpireAPIKey(ctx, userID, keyID)` 
- **Updates CLI**: `coder tokens rm <name|id|token>` now _expires_ by
default. You can still delete by passing the `--delete` flag. The `coder
tokens list` command now also hides expired tokens by default. You can
`--include-expired` if needed to include them.
- **Audit logging**: The expire action is logged with old and new key
states

## Test plan

- Tests cover: owner expiring own token, admin expiring other user's
token, non-admin cannot expire other's token, 404 for non-existent token

Closes #21782

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-17 13:16:46 +00:00
Ethan 4b3889e4f9 fix(cli): allow site admins to use coder create --org for any organization (#21528)
## Problem

Site-wide admins (e.g., Owners) could not use `coder create --org <org>`
to create workspaces in organizations they are not members of. The error
was:

```
$ coder create my-workspace -t docker --org data-science
error: organization "data-science" not found, are you sure you are a member of this organization?
```

This was inconsistent with the web UI, where Owners can create
workspaces in any organization.

## Root Cause

The CLI's `OrganizationContext.Selected()` function only checked the
user's membership list, ignoring site-wide RBAC permissions that grant
Owners access to all organizations.

## Solution

Added a fallback in `OrganizationContext.Selected()` that fetches the
org directly via the API when not found in the membership list. This
works because the API endpoint applies RBAC filtering, allowing Owners
to read any org.

## Impact

This fixes `coder create --org` and all other CLI commands that use
`OrganizationContext.Selected()` (29+ commands), including:
- `coder templates push --org <any-org>`
- `coder organizations members add --org <any-org>`
- `coder provisioner list --org <any-org>`

## Testing

Added `TestEnterpriseCreate/OwnerCanCreateInNonMemberOrg` which:
- Creates an Owner user who is NOT a member of a second org
- Verifies they can create a workspace there using `--org`
- Properly fails without the code fix, passes with it

---

*This PR was generated by [mux](https://mux.coder.com) but reviewed by a
human.*
2026-02-16 12:16:08 +11:00
Steven Masley 01f06671a1 chore: return 404, not 400 if missing or authz deny (#22069) 2026-02-13 08:19:07 -06:00
Callum Styan 5f3be6b288 feat: add provisioner job queue wait time histogram and jobs enqueued counter (#21869)
This PR adds some metrics to help identify job enqueue rates and
latencies. This work was initiated as a way to help reduce the cost of
the observation/measurement itself for autostart scaletests, which
impacts our ability to identify/reason about the load caused by
autostart. See: https://github.com/coder/internal/issues/1209

I've extended the metrics here to account for regular user initiated
builds, prebuilds, autostarts, etc. IMO there is still the question here
of whether we want to include or need the `transition` label, which is
only present on workspace builds. Including it does lead to an increase
in cardinality, and in the case of the histogram (when not using native
histograms) that's at least a few extra series for every bucket. We
could remove the transition label there but keep it on the counter.

Additionally, the histogram is currently observing latencies for other
jobs, such as template builds/version imports, those do not have a
transition type associated with them.

Tested briefly in a workspace, can see metric values like the following:
-
`coderd_workspace_builds_enqueued_total{build_reason="autostart",provisioner_type="terraform",status="success",transition="start"}
1`
-
`coderd_provisioner_job_queue_wait_seconds_bucket{build_reason="autostart",job_type="workspace_build",provisioner_type="terraform",transition="start",le="0.025"}
1`

---------

Signed-off-by: Callum Styan <callumstyan@gmail.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-12 13:40:47 -08:00
Susana Ferreira 220b9f3cc5 fix: track goroutines and fix race condition in reconciler (#21980)
## Problem

CI failure showed 3 goroutines leaked in the prebuilds reconciler, all
stuck in `select` state:

1) `MetricsCollector.BackgroundFetch` (metrics goroutine)
2) `StoreReconciler.Run` (main reconciliation loop)
3) `StoreReconciler.Run.func3()` (provisioner job publisher goroutine)

All three goroutines were waiting for `ctx.Done()`, which likely means
`cancelFn()` was never called to trigger shutdown.

**Note:** I was unable to reproduce the flake locally. The likely cause
was a race condition between `Run()` and `Stop()` where `Stop()` could
check `running` (seeing `false`), return early, and then `Run()` would
start goroutines that never get cleaned up. This could happen in any
`coderd` test that starts a server with prebuilds enabled.

### Problems identified

1) Missing waitgoroup tracking: provisioner job publisher goroutine was
not tracked in the waitgroup, therefore, this goroutine was not tracked
for a clean shutdown in `Run defer func()`.
2) The provisioner job publisher goroutine had a redundant `case
<-c.done` that could race with `Stop()` select statement.
3) Race condition between `Run()` and `Stop()`: the `running` and
`stopped` fields were `atomic.Bool` values checked and set
independently, allowing a window where `Stop()` could see
`running=false` and return early, then `Run()` would set `running=true`
and start goroutines that would never be cleaned up. This could happen
in any `coderd` test that starts a server with prebuilds enabled.

## Changes

* Added `wg.Add(1)` and `defer wg.Done()` to track provisioner job
publisher goroutine in waitgroup
* Removed redundant `case <-c.done` from provisioner job publisher
goroutine to eliminate race condition
* Replaced `atomic.Bool` for `running` and `stopped` with a `sync.Mutex`
lifecycle state, also protecting `cancelFn` under the same mutex, to
eliminate the race between `Run()` and `Stop()`
* Added a guard in `Run()` to prevent double-start (`c.stopped ||
c.running`)
* Improved comments in Stop() and Run() to clarify shutdown behavior

Closes: https://github.com/coder/internal/issues/1116
2026-02-12 15:35:42 +00:00
Susana Ferreira 6338be3b30 chore: remove aibridgeproxyd README (#22027)
This README was unintentionally reintroduced during a Graphite stack
merge. It was removed in commit 910edbc2c6
on #21296, but upstack PR #21390 still had the old branch state with the
file, so it got merged back in. This PR removes the file.

The up-to-date documentation for AI Bridge Proxy can be found in
https://github.com/coder/coder/tree/main/docs/ai-coder/ai-bridge/ai-bridge-proxy
2026-02-10 10:45:26 +00:00
Zach fa2481c650 test: add synctest-based aibridged cache expiry test (#21984)
Resolves the TODO in TestPool by adding TestPool_Expiry which uses Go
1.25's testing/synctest to verify TTL-based cache eviction.

I wanted to get familiar with the new `synctest` package in Go 1.25 and
found this TODO comment, so I decided to take a stab at it 😄
2026-02-09 15:09:40 +02:00
Yevhenii Shcherbina 45e08aa9f6 chore: update boundary version (#21955)
Update boundary version to v0.8.0
2026-02-06 09:12:14 -05:00
Cian Johnston 25a0c807cb chore(coderd/database/dbfake): add support for provisioner job timestamp control (#21944)
Relates to https://github.com/coder/coder/pull/21922 /
https://github.com/coder/internal/issues/1259

* Adds `dbfake.BuilderOption func(*WorkspaceBuildBuilder)`
* Adds `BuilderOption` methods for setting various provisioner job
related fields on `WorkspaceBuildBuilder`.
* Migrates a number of existing tests that previously dependeded on
provisioner job timing to use these updated methods in the following
packages:
  * `coderd/jobreaper`
  * `coderd/notifications/reports`
  * `enterprise/coderd/schedule`
  * `enterprise/coderd/prebuilds`
  * `scripts/workspace-runtime-audit` 

🤖 Created using Mux (Opus 4.5)

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-02-06 09:44:40 +00:00
Steven Masley efd98bd93a chore: add template toggle to disable module caching (#21931)
There exists use cases to disable the new module caching behavior of
workspace builds. This was the legacy behavior.
2026-02-05 14:38:55 -06:00
Cian Johnston 91be688e39 chore(coderd/database): remove deprecated db2sdk.List(Lazy)? methods (#21902)
Removes deprecated methods db2sdk.List and db2sdk.ListLazy.
2026-02-03 17:52:07 +00:00
Jon Ayers 3c1db17361 fix: use existing transaction to claim prebuild (#21862)
- Claiming a prebuild was happening outside a transaction
2026-02-02 17:57:59 -06:00
Susana Ferreira 09453aa5a5 fix: support authentication for upstream proxy (#21841)
## Description

Adds authentication support for upstream proxies in `aibridgeproxyd`.
When credentials are provided in the upstream proxy URL, the
`Proxy-Authorization` header is now included in `CONNECT` requests.

## Changes

* Extract credentials from upstream proxy URL and set
`Proxy-Authorization` header on tunneled `CONNECT` requests
* Support optional user and password
* Fail at startup if both username and password are empty
* Add tests for all auth scenarios

Follow-up: https://github.com/coder/internal/issues/1204
2026-02-02 14:54:31 +00:00
Marcin Tojek ea1e8c083b chore: deprecate CODER_SSH_HOSTNAME_PREFIX in favor of CODER_WORKSPACE_HOSTNAME_SUFFIX (#21836)
## Description

Mark `--ssh-hostname-prefix` flag and `CODER_SSH_HOSTNAME_PREFIX` env
variable as deprecated, recommending users to use
`--workspace-hostname-suffix` / `CODER_WORKSPACE_HOSTNAME_SUFFIX`
instead for consistency with Coder Desktop.

The deprecated option is now hidden from help output and docs but
remains functional for backward compatibility. When used, it will show a
deprecation warning pointing to the recommended alternative.

## Changes

- Added `UseInstead` pointing to `workspace-hostname-suffix` option
(triggers deprecation warning)
- Set `Hidden: true` to hide from CLI help and documentation
- Updated description to mention deprecation
- Regenerated docs and help files via `make gen`

Closes #18156

---

_Originally requested by @matifali in
https://github.com/coder/coder/pull/18085#discussion_r2115594447_
2026-02-02 12:31:26 +01:00
Dean Sheather 6954b73f8a fix: prevent panic from duplicate metrics registration on license upload (#21832) 2026-02-02 20:57:06 +11:00
Ethan a464ab67c6 test: use explicit names in TestStartAutoUpdate to prevent flake (#21745)
The test was creating two template versions without explicit names,
relying on `namesgenerator.NameDigitWith()` which can produce
collisions. When both versions got the same random name, the test failed
with a 409 Conflict error.

Fix by giving each version an explicit name (`v1`, `v2`).

Closes https://github.com/coder/internal/issues/1309

---

*Generated by [mux](https://mux.coder.com)*
2026-01-30 13:24:06 +11:00
Susana Ferreira 9f6ce7542a feat: add metrics to aibridgeproxy (#21709)
## Description

Adds Prometheus metrics to the AI Bridge Proxy for observability into
proxy traffic and performance.

## Changes
* Add Metrics struct with the following metrics:
* `connect_sessions_total`: counts CONNECT sessions by type
(mitm/tunneled)
  * `mitm_requests_total`: counts MITM requests by provider
* `inflight_mitm_requests`: gauge tracking in-flight requests by
provider
* `mitm_request_duration_seconds`: histogram of request latencies by
provider
* `mitm_responses_total`: counts responses by status code class
(2XX/3XX/4XX/5XX) and provider
* Register metrics with `coder_aibridgeproxyd_` prefix in CLI
* Unregister metrics on server close to prevent registry leaks
* Add `tunneledMiddleware` to track non-allowlisted CONNECT sessions
* Add tests for metric recording in both MITM and tunneled paths

Closes: https://github.com/coder/internal/issues/1185
2026-01-29 15:11:36 +00:00
Marcin Tojek 04b0253e8a feat: add Prometheus metrics for license warnings and errors (#21749)
Fixes: coder/internal#767

Adds two new Prometheus metrics for license health monitoring:

- `coderd_license_warnings` - count of active license warnings
- `coderd_license_errors` - count of active license errors

Metrics endpoint after startup of a deployment with license enabled:

```
...
# HELP coderd_license_errors The number of active license errors.
# TYPE coderd_license_errors gauge
coderd_license_errors 0
...
# HELP coderd_license_warnings The number of active license warnings.
# TYPE coderd_license_warnings gauge
coderd_license_warnings 0
...
```
2026-01-29 13:50:15 +01:00
Spike Curtis 06e396188f test: subscribe to heartbeats synchronously on PGCoord startup (#21746)
fixes: https://github.com/coder/internal/issues/1304

Subscribe to heartbeats synchronously on startup of PGCoordinator. This ensures tests that send heartbeats don't race with this subscription.
2026-01-29 13:34:34 +04:00
Cian Johnston c2c225052a chore(enterprise/coderd): ensure TestManagedAgentLimit differentiates between tasks and workspaces (#21731)
My previous change to this test did not create another **workspace**
using the template containing `coder_ai_task` resources, meaning that
this test was not actually testing the right thing. This PR addresses
this oversight.
2026-01-28 16:30:56 +00:00
Susana Ferreira 327c885292 feat: add provider to aibridgeproxy requestContext (#21710)
## Description

Moves the provider lookup from `handleRequest` to `authMiddleware` so
that the provider is determined during the `CONNECT` handshake and
stored in the request context. This enables provider information to be
available earlier in the request lifecycle.

## Changes

* Move `aibridgeProviderFromHost` call from `handleRequest` to
`authMiddleware`
* Store `Provider` in `requestContext` during `CONNECT` handshake
* Add provider validation in `authMiddleware` (reject if no provider
mapping)
* Keep defensive provider check in `handleRequest` for safety

Follow-up from: https://github.com/coder/coder/pull/21617
2026-01-28 08:44:17 +00:00
Zach 2204731ddb feat: implement boundary usage tracker and telemetry collection (#21716)
Implements telemetry for boundary usage tracking across all Coder
replicas and reports them via telemetry.

Changes:
- Implement Tracker with Track(), FlushToDB(), and StartFlushLoop() methods
- Add telemetry integration via collectBoundaryUsageSummary()
- Use telemetry lock to ensure only one replica collects per period

The tracker accumulates unique workspaces, unique users, and request
counts (allowed/denied) in memory, then flushes to the database
periodically. During telemetry collection, stats are aggregated across
all replicas and reset for the next period.
2026-01-27 19:11:40 -07:00
Steven Masley 799b190dee fix: do not enforce managed agent limit for non-task workspaces (#21689)
Only task workspaces have the checks in wsbuilder for violating the
managed agent caps in the license.

Stopped tasks that are resumed with a regular workspace start **still
count as usage**.
2026-01-27 19:01:17 -06:00
Callum Styan d4cd982608 chore: undeprecate the workspace rename flag and clarify potential issues (#21669)
This undeprecates the `allow-workspace-renames` flag. IIUC, the 'danger'
with using this flag is that the workspace name might have been used in
the definition of some other terraform resources within template code,
so a rename could cause problems such as with persistent disks.

for https://github.com/coder/coder/issues/21628

---------

Signed-off-by: Callum Styan <callumstyan@gmail.com>
2026-01-27 10:53:13 -08:00
Susana Ferreira 8f3bb0b0d1 feat: add Copilot provider to aibridge (#21663)
Adds GitHub Copilot as a supported AI provider in aibridge. 

Depends on: https://github.com/coder/aibridge/pull/137
Closes: https://github.com/coder/internal/issues/1235
2026-01-27 14:02:35 +00:00
Cian Johnston 7b44976618 fix(coderd/provisionerdserver): correct managed agent tracking (#21696)
Relates to https://github.com/coder/internal/issues/1282

Updates tracking of managed agents to be predicated instead on the
presence of a related `task_id` instead of the presence of a
`coder_ai_task` resource.
2026-01-27 12:14:52 +00:00
Susana Ferreira c3f41ce08c fix: return proxy auth challenge on missing/invalid credentials (#21677)
## Description

When `CONNECT` requests are missing or have invalid
`Proxy-Authorization` credentials, the proxy now returns a proper `407
Proxy Authentication Required` response with a `Proxy-Authenticate`
challenge header instead of rejecting the connection without an HTTP
response.

Some clients (e.g. Copilot in VS Code) do not send the
`Proxy-Authorization` header on the initial request and rely on
receiving a `407 challenge` to prompt for credentials. Without this fix,
those clients would fail to connect.

## Changes

* Added `newProxyAuthRequiredResponse` helper function to create
consistent `407` responses with the appropriate `Proxy-Authenticate`
header.
* Updated `authMiddleware` to return a `407` challenge instead of
rejecting unauthenticated `CONNECT` requests without an HTTP response
* Refactored `handleRequest` to use the same helper for consistency
* Updated `TestProxy_Authentication` to verify the `407` response
status, `Proxy-Authenticate` header, and response body

Related to: https://github.com/coder/internal/issues/1235
2026-01-27 11:57:24 +00:00
Jake Howell 6f15b178a4 feat: extend premium license for aigovernance (#21499)
Closes [#1227](https://github.com/coder/internal/issues/1227)

Added support for license addons, starting with AI Governance, to enable
dynamic feature grouping without requiring license reissuance.

### What changed?

- Introduced a new `Addon` type to represent groupings of features that
can be added to licenses
- Created the first addon `AddonAIGovernance` which includes AI Bridge
and Boundary features
- Added validation for addon dependencies to ensure required features
are present
- Added new features: `FeatureBoundary` and
`FeatureAIGovernanceUserLimit`
- Updated license entitlement logic to handle addons and their features
- Added helper methods to check if features belong to addons
- Updated tests to verify addon functionality

### Why make this change?

This change introduces a more flexible licensing model that allows
features to be grouped into addons that can be added to licenses without
requiring reissuance when new features are added to an addon. This is
particularly useful for specialized feature sets like AI Governance,
where related features can be bundled together and sold as a separate
SKU. The addon approach allows for better organization of features and
more granular control over entitlements.
2026-01-27 22:33:53 +11:00
Susana Ferreira 7546e94534 feat: improve aibridgeproxyd logging (#21617)
## Description

Improves logging in `aibridgeproxyd` to provide better observability for
proxy requests. Adds structured logging with request correlation IDs and
propagates request context through the proxy chain.

## Changes

* Add `requestContext` struct to propagate metadata (token, provider,
session ID) through the proxy request/response chain
* ~Add `handleTunnelRequest` to log passthrough requests for
non-allowlisted domains at debug level~ (removed due to verbosity)
* Add `handleResponse` to log responses from `aibridged`
* Log MITM requests routed to `aibridged` at info level, tunneled
requests at debug level

Related to: https://github.com/coder/internal/issues/1185
2026-01-27 11:21:31 +00:00
Danny Kopping 303389e75a fix: correct https://github.com/coder/internal/issues/1167 behaviour (#21692)
Closes https://github.com/coder/internal/issues/1167

Previously we were checking that start != end time; this was flaking on
Windows.

On Windows, `time.Now()` has limited resolution (~1ms with Go runtime's
`timeBeginPeriod`, or ~15.6ms in default system resolution). When two
`time.Now()` calls execute within the same clock tick, they return
identical timestamps, causing `StartedAt.Before(EndedAt)` to return
`false`.
**References:**
- [Go issue #8687](https://github.com/golang/go/issues/8687) - Windows
system clock resolution issue
- [Go issue #67066](https://github.com/golang/go/issues/67066) -
time.Now precision on Windows (still open)

Instead, we're changing the assertion to (the more semantically correct)
"end not before start".

A possible future enhancement could be to plumb coder/quartz through the
recording mechanism, but it's unnecessary for now.

Signed-off-by: Danny Kopping <danny@coder.com>
2026-01-27 12:36:48 +02:00
Danny Kopping 7123518baa feat: conditionally send aibridge actor headers (#21643)
Also passes along the authenticated username as actor metadata.

Closes https://github.com/coder/aibridge/issues/135
Depends on https://github.com/coder/aibridge/pull/142

**Replace aibridge tag with merge commit once
https://github.com/coder/aibridge/pull/142 lands.**

---------

Signed-off-by: Danny Kopping <danny@coder.com>
2026-01-26 15:08:17 +00:00
Kacper Sawicki 78bc5861e0 feat(enterprise/coderd): add soft warning for AI Bridge GA transition (#21675)
## Summary

AI Bridge is moving to General Availability in v2.30 and will require
the AI Governance Add-On license in future versions. This adds a soft
warning for deployments using AI Bridge via Premium/Enterprise
FeatureSet without an explicit AI Bridge add-on license.

Relates to: https://github.com/coder/internal/issues/1226

## Changes

- Track whether AI Bridge was explicitly granted via license Features
(add-on) vs inherited from FeatureSet
- Show soft warning when AI Bridge is enabled and entitled via
FeatureSet but not via explicit add-on
- Changed AI Bridge enablement from hardcoded `true` to check
`CODER_AIBRIDGE_ENABLED` deployment config

## Behavior Change

AI Bridge is now only marked as "enabled" in entitlements when
`CODER_AIBRIDGE_ENABLED=true` is set in the deployment config.
Previously, it was always enabled for Premium/Enterprise licenses
regardless of the config setting.

This change ensures that users who do not use AI Bridge will not see the
soft warning about the upcoming license requirement.

## Warning Message

> AI Bridge is now Generally Available in v2.30. In a future Coder
version, your deployment will require the AI Governance Add-On to
continue using this feature. Please reach out to your account team or
sales@coder.com to learn more.

## Behavior

| Condition | Warning Shown |
|-----------|---------------|
| AI Bridge disabled |  No |
| AI Bridge enabled + explicit add-on license |  No |
| AI Bridge enabled + Premium/Enterprise FeatureSet (no add-on) |  Yes
|

## Screenshots

### 1. No license
<img width="1708" height="577" alt="image"
src="https://github.com/user-attachments/assets/cbdbfd4d-55de-4d70-8abf-2665f458e96f"
/>

### 2. No license + CODER_AIBRIDGE_ENABLED=true
<img width="1716" height="513" alt="image"
src="https://github.com/user-attachments/assets/344aae76-7703-485f-b568-1f13a1efa48f"
/>

### 3. Premium license + CODER_AIBRIDGE_ENABLED=false
<img width="1687" height="389" alt="image"
src="https://github.com/user-attachments/assets/c2be12b0-1c0f-438d-a293-f9ec9fe6a736"
/>

### 4. Premium license + CODER_AIBRIDGE_ENABLED=true
<img width="1707" height="525" alt="image"
src="https://github.com/user-attachments/assets/1a4640e1-e656-4f9b-bed0-9390cb5d6a84"
/>

## Notes

- TODO comments added to mark code that should be removed when AI Bridge
enforcement is added
- Feature continues to work - this is just a transitional warning (soft
enforcement)
2026-01-26 10:46:45 +01:00
Yevhenii Shcherbina 15c61906e2 test: fix flaky boundary test (#21660)
Closes https://github.com/coder/internal/issues/1297

Rewrite `TestBoundarySubcommand` in a way similar to
`TestPrebuildsCommand`.
2026-01-23 15:25:15 -05:00
Yevhenii Shcherbina 8d6822b23a ci: skip flaky test (#21658) 2026-01-23 18:46:31 +00:00
Yevhenii Shcherbina 9b14fd3adc feat: add boundary premium feature (#21589)
Source code changes:

- Added a wrapper for the boundary subcommand that checks feature
entitlement before executing the underlying command.
- Added a helper that returns the Boundary version using the
runtime/debug package, which reads this information from the go.mod
file.
- Added FeatureBoundary to the corresponding enum.
- Move boundary command from AGPL to enterprise.

`NOTE`: From now on, the Boundary version will be specified in go.mod
instead of being defined in AI modules.
2026-01-23 12:56:36 -05:00
Kacper Sawicki b82693d4cc feat(codersdk): revert "remove AI Bridge entitlement from Premium license" (#21653)
Reverts coder/coder#21540
2026-01-23 15:58:12 +00:00