Fixes all our Go file imports to match the preferred spec that we've _mostly_ been using. For example:
```
import (
"context"
"time"
"github.com/prometheus/client_golang/prometheus"
"golang.org/x/xerrors"
"gopkg.in/natefinch/lumberjack.v2"
"cdr.dev/slog/v3"
"github.com/coder/coder/v2/codersdk/agentsdk"
"github.com/coder/serpent"
)
```
3 groups: standard library, 3rd partly libs, Coder libs.
This PR makes the change across the codebase. The PR in the stack above modifies our formatting to maintain this state of affairs, and is a separate PR so it's possible to review that one in detail.
Upgrades to slog v3 which includes a small, but backward incompatible API change to the acceptible call arguments when logging. This change allows us to verify via compile time type checking that arguments are correct and won't cause a panic, as was possible in slog v1, which this replaces (v2 was tagged but never used in coder/coder).
It also updates dependencies that also use slog and were updated.
I've left the `aibridge` dependency as a commit SHA, under the assumption that the team there (cc @pawbana @dannykopping ) will tag and update the dependency soon and on their own schedule.
Other dependencies, I pushed new tags.
Because this affects more than just the template insights
page (specifically it also affects the deployment stats endpoint which
is shown on bottom bar and Prometheus), the group is being renamed
generically to just "stats collection". In the future if we need to
affect the other stats we can put those options here.
Then, because this change only affects a portion of stats, specifically
usage stats like connection and application time, bytes sent, etc, add a
new sub-group called "usage stats".
Then finally add back the "enable" flag. This also gives us a place to
one day place an "anonymize" flag if we need to go that route.
**Breaking Change:** Existing oauth apps might now use PKCE. If an
unknown IdP type was being used, and it does not support PKCE, it will
break.
To fix, set the PKCE methods on the external auth to `none`
```
export CODER_EXTERNAL_AUTH_1_PKCE_METHODS=none
```
Closes#20399
To summarize the original commit messages:
- Do not log stats to the database.
- Return errors on the insight endpoints.
- Update the frontend to show those errors.
- Also fixes an issue with getting the user status count via codersdk,
since I added a test to ensure it was not disabled by this flag and it
was sending the wrong payload.
Adds `--disable-workspace-sharing` option.
Workspace sharing is disabled by not including user and group ACLs in
the workspace RBAC object, which prevents ACL-based authz.
Closes https://github.com/coder/internal/issues/1072
The commit also adds saving of workspace user/group ACLs in the test DB
data generator.
## Problem
Users may not realize that task notifications are disabled by default.
To improve awareness, we show a warning alert on the Tasks page when all
task notifications are disabled.
**Alert visibility logic:**
- Shows when **all** task notification templates (Task Working, Task
Idle, Task Completed, Task Failed) are disabled
- Can be dismissed by the user, which stores the dismissal in the user
preferences API
- If the user later enables any task notification in Account Settings,
the dismissal state is cleared so the alert will show again if they
disable all notifications in the future
<img width="2980" height="1588" alt="Screenshot 2025-11-25 at 17 48 17"
src="https://github.com/user-attachments/assets/316bf097-d9d2-4489-bc16-2987ba45f45c"
/>
## Changes
- Added a warning alert to the Tasks page when all task notifications
are disabled
- Introduced new `/users/{user}/preferences` endpoint to manage user
preferences (stored in `user_configs` table)
- Alert is dismissible and stores the dismissal state via the new user
preferences API endpoint
- Enabling any task notification in Account Settings clears the
dismissal state via the preferences API
- Added comprehensive Storybook stories for both TasksPage and
NotificationsPage to test all alert visibility states and interactions
Closes: https://github.com/coder/internal/issues/1089
This is to debug context timeouts on API requests to the agent.
Because rbac and database cannot be imported in slim, split the logger
middleware into slim and non-slim versions and break out the recovery
middleware.
This PR adds the backend implementation for modifying task prompts. Part
of https://github.com/coder/internal/issues/1084
## Changes
- New `UpdateTaskPrompt` database query to update task prompts
- New PATCH `/api/v2/tasks/{task}/prompt` endpoint
## Notes
This is part 1 of a 2-part PR stack. The frontend UI will be added in a
follow-up PR based on this branch
(https://github.com/coder/coder/pull/20812).
---
🤖 PR was written by Claude Sonnet 4.5 Thinking using [Coder
Mux](https://github.com/coder/cmux) and reviewed by a human 👩
Fixes: #20744
Upsert audit and connection log entries with a context derived from the API context, rather than the individual request so that we don't error out if the request is canceled or the client hangs up (e.g. if we return an error).
Experiments passed to provisioners to determine behavior. This adds
`--experiments` flag to provisioner daemons. Prior to this, provisioners
had no method to turn on/off experiments.
The authz recorder is causing a lot of memory to be allocated, and is a
memory leak for websocket connections.
This change makes it opt-in on a per request basis (ontop of `isDev`).
To get the authz headers, use `Copy as cURL` on chrome and append the
header `x-authz-checks=true`.
Adds the following debug routes for people with the `debug_info:read`
permission:
- `/api/v2/debug/pprof` for `net/http/pprof`
- `/`
- `/cmdline`
- `/profile`
- `/symbol`
- `/trace`
- `/*`
- `/api/v2/debug/metrics` for Prometheus metrics
# Add support for low-level API key scopes
This PR adds support for fine-grained API key scopes based on RBAC resource:action pairs. It includes:
1. A new endpoint `/api/v2/auth/scopes` to list all public low-level API key scopes
2. Generated constants in the SDK for all public scopes
3. Tests to verify scope validation during token creation
4. Updated API documentation to reflect the expanded scope options
The implementation allows users to create API keys with specific permissions like `workspace:read` or `template:use` instead of only the legacy `all` or `application_connect` scopes.
Fixes#19847
There is currently an issue with subdomain workspace apps on workspace
proxies, where if you have a workspace proxy wildcard nested beneath the
primary wildcard, cookies from the primary may be sent to the server
before cookies from the proxy specifically.
Currently:
1. Use a subdomain app via the primary proxy `*.coder.corp.com`
a. Client sends no cookies
a. Server does token smuggling flow
a. Server sets a cookie `coder_subdomain_app_session_token` on
`*.coder.corp.com`
a. Server redirects client to reload the page
a. Request should succeed as usual
1. Wait until the primary proxy's session token cookie has expired in
the database (or make it invalid yourself)
1. Use a subdomain app via a separate proxy `*.sydney.coder.corp.com`
a. Client sends `coder_subdomain_app_session_token` cookie from
`*.coder.corp.com`
a. Server validates supplied cookie, it fails because it's expired
a. Server does token smuggling flow
a. Server sets a cookie `coder_subdomain_app_session_token` on
`*.sydney.coder.corp.com`
a. Server redirects client to reload page
a. Client sends BOTH cookies.
a. The server will only process the first cookie it receives, so if the
expired cookie for the primary proxy is sent first the request will end
up in a permanent loop on step b.
The fix is to append `_{hash(wildcard_access_url)}` to the subdomain
cookies as we cannot control browser behavior further. This avoids the
conflict as each proxy will only read it's specific cookie.
## Description
Adds support for sending an ad‑hoc custom notification to the
authenticated user via API and CLI. This is useful for surfacing the
result of scripts or long‑running tasks. Notifications are delivered
through the configured method and the dashboard Inbox, respecting
existing preferences and delivery settings.
## Changes
* New notification template: “Custom Notification” with a label for a
custom title and a custom message.
* New API endpoint: `POST /api/v2/notifications/custom` to send a custom
notification to the requesting user.
* New API endpoint: `GET /notifications/templates/custom` to get custom
notification template.
* New CLI subcommand: `coder notifications custom <title> <message>` to
send a custom notification to the requesting user.
* Documentation updates: Add a “Custom notifications” section under
Administration > Monitoring > Notifications, including instructions on
sending custom notifications and examples of when to use them.
Closes: https://github.com/coder/coder/issues/19611
Update OAuth2 metadata endpoint routes to support path suffixes
This PR updates the OAuth2 metadata endpoint routes to include a wildcard character (*) at the end of the paths. This change allows the endpoints to match requests with path suffixes, making our OAuth2 discovery implementation more flexible and compliant with the relevant RFCs.
The updated routes are:
- `/.well-known/oauth-authorization-server*` for RFC 8414 discovery
- `/.well-known/oauth-protected-resource*` for RFC 9728 discovery
Closes https://github.com/coder/internal/issues/961
Likely the same deal as in #19599, the body of `require.Eventually` now fires immediately, when it used to fire after 250ms (the interval). Presumably, the deployment stats become ready before the vs code session count gets incremented. This was never an issue with the 250ms delay, as this flake has only cropped up after the testify version bump.
We'll fix the issue by making it possible to wait for a full metrics cache refresh, i.e. removing `require.Eventually` in this test altogether.
## Description
This PR introduces one counter and two histograms related to workspace
creation and claiming. The goal is to provide clearer observability into
how workspaces are created (regular vs prebuild) and the time cost of
those operations.
### `coderd_workspace_creation_total`
* Metric type: Counter
* Name: `coderd_workspace_creation_total`
* Labels: `organization_name`, `template_name`, `preset_name`
This counter tracks whether a regular workspace (not created from a
prebuild pool) was created using a preset or not.
Currently, we already expose `coderd_prebuilt_workspaces_claimed_total`
for claimed prebuilt workspaces, but we lack a comparable metric for
regular workspace creations. This metric fills that gap, making it
possible to compare regular creations against claims.
Implementation notes:
* Exposed as a `coderd_` metric, consistent with other workspace-related
metrics (e.g. `coderd_api_workspace_latest_build`:
https://github.com/coder/coder/blob/main/coderd/prometheusmetrics/prometheusmetrics.go#L149).
* Every `defaultRefreshRate` (1 minute ), DB query
`GetRegularWorkspaceCreateMetrics` is executed to fetch all regular
workspaces (not created from a prebuild pool).
* The counter is updated with the total from all time (not just since
metric introduction). This differs from the histograms below, which only
accumulate from their introduction forward.
### `coderd_workspace_creation_duration_seconds` &
`coderd_prebuilt_workspace_claim_duration_seconds`
* Metric types: Histogram
* Names:
* `coderd_workspace_creation_duration_seconds`
* Labels: `organization_name`, `template_name`, `preset_name`, `type`
(`regular`, `prebuild`)
* `coderd_prebuilt_workspace_claim_duration_seconds`
* Labels: `organization_name`, `template_name`, `preset_name`
We already have `coderd_provisionerd_workspace_build_timings_seconds`,
which tracks build run times for all workspace builds handled by the
provisioner daemon.
However, in the context of this issue, we are only interested in
creation and claim build times, not all transitions; additionally, this
metric does not include `preset_name`, and adding it there would
significantly increase cardinality. Therefore, separate more focused
metrics are introduced here:
* `coderd_workspace_creation_duration_seconds`: Build time to create a
workspace (either a regular workspace or the build into a prebuild pool,
for prebuild initial provisioning build).
* `coderd_prebuilt_workspace_claim_duration_seconds`: Time to claim a
prebuilt workspace from the pool.
The reason for two separate histograms is that:
* Creation (regular or prebuild): provisioning builds with similar time
magnitude, generally expected to take longer than a claim operation.
* Claim: expected to be a much faster provisioning build.
#### Native histogram usage
Provisioning times vary widely between projects. Using static buckets
risks unbalanced or poorly informative histograms.
To address this, these metrics use [Prometheus native
histograms](https://prometheus.io/docs/specs/native_histograms/):
* First introduced in Prometheus v2.40.0
* Recommended stable usage from v2.45+
* Requires Go client `prometheus/client_golang` v1.15.0+
* Experimental and must be explicitly enabled on the server
(`--enable-feature=native-histograms`)
For compatibility, we also retain a classic bucket definition (aligned
with the existing provisioner metric:
https://github.com/coder/coder/blob/main/provisionerd/provisionerd.go#L182-L189).
* If native histograms are enabled, Prometheus ingests the
high-resolution histogram.
* If not, it falls back to the predefined buckets.
Implementation notes:
* Unlike the counter, these histograms are updated in real-time at
workspace build job completion.
* They reflect data only from the point of introduction forward (no
historical backfill).
## Relates to
Closes: https://github.com/coder/coder/issues/19528
Native histograms tested in observability stack:
https://github.com/coder/observability/pull/50
Fixescoder/internal#892Fixescoder/internal#896
Example output:
```
❯ coder exp task list
ID NAME STATUS STATE STATE CHANGED MESSAGE
a7a27450-ca16-4553-a6c5-9d6f04808569 task-hardcore-herschel-bd08 running idle 5h22m3s ago Listed root directory contents, working directory reset
50f92138-f463-4f2b-abad-1816264b065f task-musing-dewdney-f058 running idle 6h3m8s ago Completed arithmetic calculation
```
Fixes https://github.com/coder/internal/issues/907
We convert `workspacesdk.AgentConn` to an interface and generate a mock
for it. This allows writing `coderd` tests that rely on the agent's HTTP
api to not have to set up an entire tailnet networking stack.
This pull request introduces support for external workspace management, allowing users to register and manage workspaces that are provisioned and managed outside of the Coder.
Depends on: https://github.com/coder/terraform-provider-coder/pull/424
* GET /api/v2/init-script - Gets the agent initialization script
* By default, it returns a script for Linux (amd64), but with query parameters (os and arch) you can get the init script for different platforms
* GET /api/v2/workspaces/{workspace}/external-agent/{agent}/credentials - Gets credentials for an external agent **(enterprise)**
* Updated queries to filter workspaces/templates by the has_external_agent field
Instead of creating tasks with a specialized call to `CreateWorkspace`
on the frontend, we instead lift this to the backend and allow the
frontend to simply call `CreateAITask`.
PProf labels segment the code into groups for determing the source of
cpu/memory profiles. Since the web server and background jobs share a
lot of the same code (eg wsbuilder), it helps to know if the load is
user induced, or background job based.
- Adds a query for counting managed agent workspace builds between two
timestamps
- The "Actual" field in the feature entitlement for managed agents is
now populated with the value read from the database
- The wsbuilder package now validates AI agent usage against the limit
when a license is installed
Closescoder/internal#777
# Enhanced OAuth2 and MCP Compliance for API Authentication
This PR improves OAuth2 and MCP (Microsoft Cloud for Sovereignty)
compliance by:
1. Adding RFC 9728 compliant `WWW-Authenticate` headers with resource
metadata URLs
2. Passing the configured `AccessURL` to API key middleware for proper
audience validation
3. Creating specialized CORS handling for OAuth2 and MCP endpoints with
appropriate headers
4. Making the `state` parameter optional in OAuth2 authorization
requests
These changes ensure proper OAuth2 token audience validation against the
configured access URL and improve interoperability with OAuth2 clients
by providing better error responses and metadata discovery.
Signed-off-by: Thomas Kosiewski <tk@coder.com>
### Breaking Change (changelog note):
> User connections to workspaces, and the opening of workspace apps or ports will no longer create entries in the audit log. Those events will now be included in the 'Connection Log'.
Please see the 'Connection Log' page in the dashboard, and the Connection Log [documentation](https://coder.com/docs/admin/monitoring/connection-logs) for details. Those with permission to view the Audit Log will also be able to view the Connection Log. The new Connection Log has the same licensing restrictions as the Audit Log, and requires a Premium Coder deployment.
### Context
This is the first PR of a few for moving connection events out of the audit log, and into a new database table and web UI page called the 'Connection Log'.
This PR:
- Creates the new table
- Adds and tests queries for inserting and reading, including reading with an RBAC filter.
- Implements the corresponding RBAC changes, such that anyone who can view the audit log can read from the table
- Implements, under the enterprise package, a `ConnectionLogger` abstraction to replace the `Auditor` abstraction for these logs. (No-op'd in AGPL, like the `Auditor`)
- Routes SSH connection and Workspace App events into the new `ConnectionLogger`
- Updates all existing tests to check the values of the `ConnectionLogger` instead of the `Auditor`.
Future PRs:
- Add filtering to the query
- Add an enterprise endpoint to query the new table
- Write a query to delete old events from the audit log, call it from dbpurge.
- Implement a table in the Web UI for viewing connection logs.
> [!NOTE]
> The PRs in this stack obviously won't be (completely) atomic. Whilst they'll each pass CI, the stack is designed to be merged all at once. I'm splitting them up for the sake of those reviewing, and so changes can be reviewed as early as possible. Despite this, it's really hard to make this PR any smaller than it already is. I'll be keeping it in draft until it's actually ready to merge.
# Refactor OAuth2 Provider Code into Dedicated Package
This PR refactors the OAuth2 provider functionality by moving it from the main `coderd` package into a dedicated `oauth2provider` package. The change improves code organization and maintainability without changing functionality.
Key changes:
- Created a new `oauth2provider` package to house all OAuth2 provider-related code
- Moved existing OAuth2 provider functionality from `coderd/identityprovider` to the new package
- Refactored handler functions to follow a consistent pattern of returning `http.HandlerFunc` instead of being handlers directly
- Split large files into smaller, more focused files organized by functionality:
- `app_secrets.go` - Manages OAuth2 application secrets
- `apps.go` - Handles OAuth2 application CRUD operations
- `authorize.go` - Implements the authorization flow
- `metadata.go` - Provides OAuth2 metadata endpoints
- `registration.go` - Handles dynamic client registration
- `revoke.go` - Implements token revocation
- `secrets.go` - Manages secret generation and validation
- `tokens.go` - Handles token issuance and validation
This refactoring improves code organization and makes the OAuth2 provider functionality more maintainable while preserving all existing behavior.
# Add MCP HTTP Server Experiment
This PR adds a new experiment flag `mcp-server-http` to enable the MCP HTTP server functionality. The changes include:
1. Added a new experiment constant `ExperimentMCPServerHTTP` with the value "mcp-server-http"
2. Added display name and documentation for the new experiment
3. Improved the experiment middleware to:
- Support requiring multiple experiments
- Provide better error messages with experiment display names
- Add a development mode bypass option
4. Applied the new experiment requirement to the MCP HTTP endpoint
5. Replaced the custom OAuth2 middleware with the standard experiment middleware
The PR also improves the `Enabled()` method on the `Experiments` type by using `slices.Contains()` for better readability.
# Add MCP HTTP server with streamable transport support
- Add MCP HTTP server with streamable transport support
- Integrate with existing toolsdk for Coder workspace operations
- Add comprehensive E2E tests with OAuth2 bearer token support
- Register MCP endpoint at /api/experimental/mcp/http with authentication
- Support RFC 6750 Bearer token authentication for MCP clients
Change-Id: Ib9024569ae452729908797c42155006aa04330af
Signed-off-by: Thomas Kosiewski <tk@coder.com>