Commit Graph

1374 Commits

Author SHA1 Message Date
Zach 170c33a475 feat: encrypt gitsshkeys.private_key at rest via dbcrypt (#25872)
Adds an optional dbcrypt wrapper around gitsshkeys.private_key. The
column is encrypted on insert and update through enterprise/dbcrypt when
external token encryption is configured, and decrypted on read.

A new private_key_key_id column references
dbcrypt_keys(active_key_digest) so revocation safety is enforced by the
existing foreign key. Rows with a NULL key_id stay plaintext and remain
readable. Existing plaintext rows can be backfilled by running `coder
server dbcrypt rotate`.

Generated with assistance from Coder Agents.
2026-06-02 08:36:01 -06:00
Thomas Kosiewski f6a4ed309f ci: fix Windows runner PATH casing for mise, not in cli (#25972)
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 10:46:40 +00:00
Paweł Banaszewski f22d4e2cbb feat: add ai_gateway_keys table and related RBAC (#25563)
Adds table to store keys that AI Gateway standalone replicas will use
to authenticate into Coderd.
Also adds RBAC and audit boilerplate.
2026-06-02 09:28:43 +02:00
Mathias Fredriksson ed4311b2cb ci: add Git usr/bin to PATH on Windows (#25939)
## Summary

Fixes all 9 Windows CI test failures caused by the mise CI refactor
(`fe257666d7`, PR #25727).

### Root cause

`jdx/mise-action` exports `Path` (Windows convention) via `GITHUB_ENV`.
Bash on Windows maintains its own `PATH`. When Go's `os.Environ()`
returns both, `cmd.exe` subprocesses non-deterministically pick the
MSYS-translated `PATH` (forward slashes), causing Windows executables
(`printf`, `powershell.exe`, `cmd.exe`) to be unresolvable.

These failures only appeared on `main` (where `-count=1` forces real
test execution) and were masked on PRs by Go test cache.

### Fixes applied

**CI (`setup-mise` action)**:
- Write both `Path` and `PATH` to `GITHUB_ENV` with Git usr/bin
prepended

**Code (`cli/root.go`)**:
- Add `appendAndDedupEnv` helper that deduplicates case-insensitive env
vars on Windows, preferring native Windows paths (backslashes) over MSYS
paths

**Code (`cli/configssh_windows.go`)**:
- Use absolute paths for `powershell.exe` and `cmd.exe` in the SSH
config `Match exec` escape function, avoiding PATH resolution entirely

**Tests**:
- Switch `--header-command` tests from `printf` to `echo` (cmd.exe
builtin) for reliable cross-platform execution
- Add env dedup in `Test_sshConfigMatchExecEscape` for subprocess PATH
consistency

Fixes coder/internal#1556, coder/internal#1558, coder/internal#1559

> 🤖 Generated by Coder agent, will be reviewed by @mafredri. 🏂🏻
2026-06-02 11:51:16 +10:00
Danny Kopping a85462bd49 feat: support adding GitHub Copilot AI provider via UI (#25888)
Copilot is the only AI provider type that could not be added through the `/ai/settings` UI. The aibridge runtime and the env-var seeding path already supported it, but the runtime CRUD API rejected `type=copilot` and the UI omitted it entirely. The root cause is that Copilot's auth model (a per-request GitHub OAuth token, with no pre-shared key) does not fit the credential-centric add-provider flow that every other provider uses.

## Backend

Allow `type=copilot` in `CreateAIProviderRequest.Validate()`, and reject `api_keys` for Copilot on both create (validation) and update (handler sentinel), mirroring the existing Bedrock guards. Copilot carries no stored credential.

## Frontend

Add Copilot to the provider type picker (with the `github-copilot.svg` icon) and give the form a credential-free branch: name, display name, and a free-text endpoint defaulting to `https://api.business.githubcopilot.com`, with copy explaining that authentication happens via the user's GitHub token at request time. Copilot maps to the distinct `copilot` wire type rather than collapsing to `openai`, and the edit flow recovers it correctly.

The endpoint stays required with a business-tier default; users on the individual or enterprise endpoints edit the field.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2026-06-01 15:26:37 +02:00
Danny Kopping c8555e2163 fix: deprecate ai provider seeding env config (#25854)
Environment variables used to configure AI Gateway providers are now deprecated, and we need to reflect this as such.
2026-06-01 15:15:47 +02:00
Yevhenii Shcherbina 1a91d31793 feat: add user AI budget override endpoints (#25439)
Implements https://linear.app/codercom/issue/AIGOV-285
Follow the structure established in
https://github.com/coder/coder/pull/25203

## Summary

Adds the `user_ai_budget_overrides` table and CRUD API at
`/api/v2/users/{user}/ai/budget`. An override sets a custom per-user
spend cap that supersedes group-budget resolution, attributing spend to
a specific group.

## Schema

```sql
CREATE TABLE user_ai_budget_overrides (
    user_id            UUID        PRIMARY KEY REFERENCES users(id) ON DELETE CASCADE,
    group_id           UUID        NOT NULL REFERENCES groups(id) ON DELETE CASCADE,
    spend_limit_micros BIGINT      NOT NULL CHECK (spend_limit_micros >= 0),
    created_at         TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at         TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
```

## Membership lifecycle

The membership invariant — a user must be a member of the attributed
group, including when that group is "Everyone" — would naturally be
expressed as a composite FK on `(user_id, group_id) →
group_members_expanded(user_id, group_id)`. PostgreSQL doesn't allow
foreign keys to reference views, so enforcement is split across two
mechanisms:

- **Write-time check.** A CHECK constraint on the table
(`user_ai_budget_overrides_must_be_group_member`) calls a `STABLE`
function `is_group_member(user_id, group_id)` that queries
`group_members_expanded`. The view surfaces both regular group
memberships and the implicit "Everyone" group memberships from
`organization_members`. Any INSERT or UPDATE that violates the predicate
is rejected with a Postgres `check_violation`, which the handler maps to
a 400. `is_group_member` is defined as a general predicate, reusable by
any future table that needs the same check.

- **Cascade on removal.** Two `BEFORE DELETE` triggers handle membership
loss:
- `trigger_delete_user_ai_budget_overrides_on_group_member_delete` on
`group_members` — covers regular group removals (admin action, OIDC
sync).
- `trigger_delete_user_ai_budget_overrides_on_org_member_delete` on
`organization_members` — covers the "Everyone" group, whose membership
lives in `organization_members`.

The single-column FKs on `users(id)` and `groups(id)` remain to cascade
on user or group deletion (those paths don't pass through
`group_members`).

## Authorization

The dbauthz layer gates each operation against the `User` and (for
writes) `Group` resources:

| Operation | User resource  | Group resource |
|-----------|----------------|----------------|
| `GET`     | `ActionRead`   | —              |
| `PUT`     | `ActionUpdate` | `ActionUpdate` |
| `DELETE`  | `ActionUpdate` | `ActionUpdate` |

For `DELETE`, the dbauthz layer fetches the existing override first to
learn the attributed `group_id`, then runs both checks.

### Role matrix

| Role         | GET | PUT | DELETE |
|--------------|-----|-----|--------|
| Owner        |    |    |       |
| UserAdmin    |    |    |       |
| OrgAdmin     |    |    |       |
| OrgUserAdmin |    |    |       |

Internal discussion:
https://codercom.slack.com/archives/C096PFVBZKN/p1779392747885359

## Audit logs
Audit logs will be addressed in a follow-up PR.
2026-05-29 10:08:25 -04:00
Danny Kopping 5b10268827 feat: serve 503 sentinel for disabled providers (#25794)
_Disclosure: created with Coder Agents._

When providers are disabled, we should serve a sentinel error so the
requesting client (Claude Code, Coder Agents, etc) is informed. Coder
Agents can also conditionalize its display to show a helpful error
message.

---------

Signed-off-by: Danny Kopping <danny@coder.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-29 10:24:16 +02:00
Steven Masley 4591212482 feat: implement SCIM handler for SCIM 2.0 compliance (#25572)
Rewrites the SCIM 2.0 user provisioning handler to be RFC 7644
compliant. Verified against an external IdP Okta.

Behavior is OPT IN
2026-05-28 10:00:37 -05:00
Danny Kopping 12520ee964 feat: add ai provider status and reload freshness metrics (#25770)
Add metrics for `aibridged` and `aibridgeproxyd`'s provider statuses. AI providers can be modified, and possibly misconfigured, at runtime. These metrics help operators understand the state of these provider definitions in case unexpected behaviour is observed.
2026-05-28 14:57:33 +02:00
Danny Kopping a9f5ed7644 fix: re-validate provider per request and classify reloads (#25766)
Refactors the `aibridgeproxyd` provider reload mechanism which was unnecessarily complex.

Also ensures that providers are evaluated on each CONNECT request to prevent interception of requests to (newly) disabled providers; in this case the requests will passthrough unencrypted, by design.
2026-05-28 13:22:38 +02:00
Callum Styan 9d37f63fbd feat: report synthetic metadata from fake agents (#25166)
Fake agents now fetch their manifest, spawn a single per-agent metadata
goroutine, and emit batched BatchUpdateMetadata calls with 3072-byte
base64 payloads so scaletest runs mirror the load shape of real agents.
This matches what the current scaletest workspace template does for
metadata. In the future we can extend the harness here to take in a
config option for the metadata payload size.

---------

Signed-off-by: Callum Styan <callumstyan@gmail.com>
Co-authored-by: Mux <mux@coder.com>
2026-05-27 13:49:42 -07:00
Danny Kopping 79e007cf30 feat: hot-reload aibridged and aibridgeproxyd providers on DB changes (#25673)
Previously the in-process aibridge daemon and the enterprise aibridgeproxy daemon both snapshotted their provider routing once at boot. Any `ai_providers` or `ai_provider_keys` mutation required a restart for either to pick it up.

Add an `ai_providers_changed` pubsub channel that the CRUD handlers publish on after Create / Update / Delete. Both daemons subscribe:

- **aibridged** rebuilds its `[]aibridge.Provider` snapshot via `BuildProviders` and swaps it into the pool atomically. Inflight requests keep serving against the bridge they already acquired; new acquires build against the new snapshot. Per-provider construction errors stay scoped to the offending row.
- **aibridgeproxyd** rebuilds its routing snapshot from `GetAIProviders` and swaps the host→provider map atomically. The MITM listener picks up new providers without restart.

DB read for aibridgeproxyd uses the existing `AsAIProviderMetadataReader` subject for routing-only access.
2026-05-27 11:58:43 +02:00
Cian Johnston d3155e1cab test(enterprise/cli): add test to prove fix for #25699 (#25701)
Adds an end-to-end enterprise CLI test to ensure legacy AI provider keys seeded at server startup are encrypted at rest when DBCrypt external token encryption is enabled, preventing regressions related to #25699.

> Partially implemented by Coder Agents, and massaged afterwards by me.
2026-05-26 20:08:07 +00:00
Michael Suchacz 8b1705eb65 feat: route chatd provider traffic through aibridge (#25629)
## Summary

Routes chatd model calls backed by concrete AI Provider rows through the
in-process aibridge transport by default, with deployment options to use
direct provider routing when AI Gateway is disabled or chat AI Gateway
routing is disabled.

- Splits model routing into common, direct provider, and AI Gateway
paths behind a single deployment-mode entry point.
- Builds chatd models through explicit request, route, and options data.
Active API key attribution is passed explicitly instead of being hidden
inside generic model construction.
- For AI Gateway BYOK routes, resolves the user's provider key in chatd,
forwards it through provider-specific auth headers, and sets
`X-Coder-AI-Governance-Token` to the `delegated` marker so aibridge
preserves those headers while still stripping Coder-specific metadata.
- Keeps central provider credentials and deployment fallback credentials
out of forwarded provider auth headers, so AI Gateway central policy
remains authoritative.
- Redacts delegated provider auth from default string formatting to
avoid accidental plaintext logging of user BYOK credentials.
- Covers selected chat models, advisor overrides, title and quickgen
paths, subagent overrides, computer use model selection, and an
integration-style chat turn through the aibridge transport path.
- Persists initiating API key IDs on chat and queued user messages,
including subagent child messages, and fails closed for AI
Gateway-routed model builds without an active key.
- Removes unused `api_key_id` indexes while keeping the persistence
columns and foreign keys.
- Keeps the deployment option available through config and env parsing,
but hides it from CLI help and generated docs.
- Stabilizes the subagent poll fallback test so background CreateChat
processing cannot win the state transition under slower CI environments.

## Tests

- `go test ./coderd/x/chatd -run
'TestAIGatewayProviderAuthForUser|TestAIGatewayProviderAuthRedactsFormatting|TestResolveModelRouteForConfigAIGatewayProviderAuth|TestAIGatewayModelForwardsProviderAuth|TestProcessChat_AIGatewayRoutingUsesDelegatedAPIKey|TestAwaitSubagentCompletion'
-count=1`
- `go test ./coderd/aibridged -run
'TestServeHTTP_DelegatedAPIKey|TestServeHTTP_StripCoderToken' -count=1`
- `git diff --check HEAD~1..HEAD`
- `make lint`

> Mux working on behalf of Mike.
2026-05-26 19:31:52 +00:00
Danny Kopping a56c88a0cc fix: run AI provider seed and build after newAPI so dbcrypt applies (#25699)
## Problem

Two related symptoms of the same architectural issue: the `dbcrypt`
wrapper is installed inside `enterprise/coderd.New`, so any access to
`options.Database` that happens before `newAPI` runs bypasses
encryption.

**Symptom 1 (reads):** Provider keys added via the admin UI are
encrypted at rest. `BuildProviders` was running *before* `newAPI`,
against the unwrapped store, so the ciphertext was read as-is and shoved
into the keypool as the upstream credential. Anthropic/OpenAI reject it,
and the interception log shows:

```
coderd.aibridged.pool: interception failed  ... error="all configured keys failed authentication"
  credential_kind=centralized  credential_hint=PaPb...4A==  credential_length=184
```

**Symptom 2 (writes):** `SeedAIProvidersFromEnv` was also running before
`newAPI`, against the unwrapped store, so env-derived keys
(`CODER_AIBRIDGE_OPENAI_KEY`, indexed `CODER_AIBRIDGE_PROVIDER_<N>_KEY`,
etc.) landed in `ai_provider_keys` as plaintext with `ApiKeyKeyID =
null` even when `CODER_EXTERNAL_TOKEN_ENCRYPTION_KEYS` was set.

## Fix

Move both `SeedAIProvidersFromEnv` and `BuildProviders` to after
`newAPI`, where `options.Database` is the dbcrypt-wrapped store. Writes
encrypt correctly; reads decrypt correctly.

The enterprise closure (`enterprise/cli/server.go`) runs *inside*
`newAPI` and calls `BuildProviders` for the aibridgeproxyd at that
point. Once the agpl seed moves to after `newAPI`, the proxy on first
boot would see no env-seeded providers. Add a matching seed call inside
the enterprise closure before its `BuildProviders` to cover that case.
Seeding is idempotent, so the agpl-side seed running again post-`newAPI`
is a no-op when the rows already exist.

## Known shortcomings

The clean version of this fix would just inherit `ctx` like every other
startup step and place these calls naturally. It can't, for two reasons
that are both about the surrounding handler architecture rather than
this change:

1. **`dbcrypt` wrapping is positioned inside `newAPI`, not around
`options.Database` at creation.** That's why both seed and build have to
wait until after `newAPI` in the first place. The principled fix is to
install the wrapper at the point the store is created (behind a hook the
enterprise build supplies), so every consumer sees a single
authoritative view and the ordering stops mattering. This would also
collapse the duplicated seed call back to a single site.

2. **The handler's shutdown sequence is not deferred.**
`coderAPICloser.Close()` and the other teardown steps run only if
control reaches the `select` at the bottom of the handler. An early
`return` from anywhere in Phase 1 (e.g. seed/build returning
`context.Canceled` when the user hits ctrl-c during startup) skips that
block and orphans all the goroutines `newAPI` spawned — tailnet workers,
gitsync, telemetry batcher, etc. `goleak` then catches them at package
teardown and `TestServer_TelemetryDisabled_FinalReport` fails. Moving
the shutdown into deferred closers (with a `sync.Once`-guarded close to
avoid double-close from the explicit Phase 2 call) is the principled
fix.

For this PR I took the smallest change that fixes the reported bugs: a
detached context (`context.WithoutCancel(ctx)` + a 30s timeout) at the
seed and build call sites in both the agpl and enterprise paths. It lets
the calls complete even if the user cancels during startup, after which
the handler reaches its shutdown select naturally and tears down through
Phase 2. Both shortcomings above are worth addressing separately.

## Test plan

- `make test RUN=TestServer_TelemetryDisabled_FinalReport` with `-race`;
passes locally with `-count=3`.
- Manually verified on a deployment with
`CODER_EXTERNAL_TOKEN_ENCRYPTION_KEYS` set and env-configured providers:
`ai_provider_keys.api_key_key_id` is populated, `api_key` is base64
ciphertext, and upstream auth succeeds.

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 21:27:02 +02:00
Danny Kopping 282ab7de34 refactor: load AI providers from the database at startup (#25672)
Replace the env-based `BuildProviders` with a DB-backed loader. The database is now the single source of truth for runtime provider configuration; env config arrives via `SeedAIProvidersFromEnv` (run at boot) and `BuildProviders` reads it back as `aibridge.Provider` instances. `cli/server.go` and `enterprise/cli/server.go` both call the same path, so aibridged and aibridgeproxyd see the same provider set.

Per-provider `DumpDir` is replaced by a top-level `CODER_AI_GATEWAY_DUMP_DIR` base; each provider's effective dump path is `<base>/<provider name>`.
2026-05-26 15:57:01 +02:00
Danny Kopping c801dcbbc8 fix: strip route prefix when passing request to aibridged handler (#25671)
We weren't stripping the API base (`/api/v2/aibridge`), leading to 404s
when using the in-memory transport.

Signed-off-by: Danny Kopping <danny@coder.com>
2026-05-26 08:04:26 +00:00
Danny Kopping 4ddda3a9db feat: filter interceptions and sessions by provider name (#25640)
Allows filtering sessions & interceptions by provider name, and adds a test to vaidate that provider name is immutable (at least until #25606 lands).
2026-05-25 16:31:48 +02:00
Danny Kopping ef6ee2af68 chore: tolerate empty providers at startup and log env seeds (#25605)
Since AI Gateway is now enabled by default, and if the AI Gateway Proxy is enabled too it's possible the server can start without any configured providers. This would previously block startup, which is unacceptable.

In an upstack PR we will handle reloading the providers at runtime, so the server needs to be able to start up even if it can't handle any proxy requests to AI Gateway.

This change was necessitated because if there are providers configured in the environment they need to be seeded _before_ the proxy starts.
2026-05-22 12:45:14 +02:00
Michael Suchacz ca1f6b19a2 feat: remove legacy chat provider tables (#25416) 2026-05-22 09:50:01 +02:00
Danny Kopping ddec110b0e refactor: move aibridged out of enterprise to AGPL (#25570)
In order to allow Coder Agents to use AI Gateway in OSS, we need to rehome the `aibridged`\-related code into the AGPL path.

The HTTP API is only registered under enterprise so will still require the AI Governance Add-on to be present in order to use it, whereas Coder Agents uses an in-memory pipe to the same handlers.
2026-05-22 09:11:37 +02:00
Danny Kopping c50b0e84b9 feat!: default CODER_AI_GATEWAY_ENABLED to true (#25575)
`CODER_AI_GATEWAY_ENABLED` / `CODER_AIBRIDGE_ENABLED` is now being defaulted to `true` now that it will be used by Coder Agents.

If you previously had this value disabled explicitly, that value will persist.
2026-05-22 08:57:36 +02:00
Danny Kopping 9341efec9f feat!: seed ai_providers from env on server startup (#24895)
_Disclaimer: implemented by a Coder Agent using Claude Opus 4.7_

Part of the implementation of [RFC: Common AI Provider Configs](https://www.notion.so/coderhq/RFC-Common-AI-Provider-Configs-34bd579be59280ed958feffb82024797) (AIGOV-201).

## Note

This change can cause a previously working installation to fail to start should a conflict exist between the providers configured in the environment & those now migrated to the database.

I'll raise a PR upstack to document this process and workarounds should a startup fail.

## What this PR does

Reconciles environment-derived AI provider configuration with the `ai_providers` table at server startup. The seed runs **before** the aibridged daemon is initialized, so the runtime always reads providers from the database; the legacy `CODER_AIBRIDGE_*` environment variables become a one-shot migration source.

### Behavior

- Concurrent server starts are serialized through a Postgres advisory lock (`LockIDAIProvidersEnvSeed`).
- Missing rows are inserted with an audit entry attributed to the system actor.
- Existing rows whose canonical hash matches the env-derived hash are left alone (the common no-op restart path).
- Existing rows whose canonical hash does **not** match cause server startup to fail with a descriptive error so the operator can explicitly resolve the conflict in either env or DB.
- Soft-deleted rows are NOT resurrected from env; an explicit operator deletion is sticky across restarts.
- Indexed providers whose name conflicts with a legacy env var fail startup with a clear remediation message.
- Unknown provider types (e.g. `copilot`, until the DB enum is widened) are skipped with a log entry rather than failing startup.

### Canonical hashing

The `canonicalAIProvider` shape captures exactly the fields that determine runtime behavior — `type`, `base_url`, and the Bedrock subset of settings (access key, access key secret, region, model, small fast model) — and is hashed with SHA-256. The hash is **computed on demand from the row + env**, never persisted, so the database does not need a new column for it. API keys live in the separate `ai_provider_keys` table and are intentionally excluded from the hash so operators can rotate keys via the API without forcing a server restart.

<details>
<summary>Decision log</summary>

- The hash is intentionally not persisted in the database. The RFC discussed this trade-off; computing on demand keeps the schema minimal and lets the canonical shape evolve without a migration.
- The lock uses an `iota` slot in `coderd/database/lock.go` rather than `GenLockID` so it's stable, easy to audit, and matches the convention used for every other startup lock.
- A bearer-token Anthropic provider whose env vars also set Bedrock metadata but no AWS credentials does NOT store the Bedrock fields. Without credentials the discriminated settings would misrepresent the row as Bedrock auth.
- We deliberately do NOT publish to the `ai_providers_changed` pubsub channel from the seed because the seed completes before any subscriber is started; the follow-up PR introduces that channel.

</details>
2026-05-22 08:37:27 +02:00
Michael Suchacz 06526a5822 feat: use AI provider chat APIs (#25415) 2026-05-22 07:53:23 +02:00
Michael Suchacz 40878eeba4 feat: add AI provider schema expansion (#25412) 2026-05-22 02:16:01 +02:00
Jon Ayers 269bd0cb8d fix: skip no-op peer updates in pgcoord binder (#24226) 2026-05-21 17:59:12 -05:00
Paweł Banaszewski 46e93e6325 chore: add ai_gateway options that alias aibridge options (#25061)
Adds options matching new AI Gateway naming.
New options are added as alias for old options. Old options are still
working.
Old options have deprecated message.
No conflict detection was added.

Updated documentation so it mentions only new options. Added note about
old options still working.

> Various AI tools where used to create this PR
2026-05-21 11:14:11 +02:00
Steven Masley 9b6eadab77 fix: drop N+1 db query on template ACL available (#25465)
Fixes
[PLAT-149](https://linear.app/codercom/issue/PLAT-149/template-permissions-search-is-extremely-slow-with-many-groups).

`/acl/available` ran a db query per group. A deployment with >5,000
groups made this route extremely slow.
2026-05-20 22:40:50 +00:00
Spike Curtis 8dc4d76890 chore: add agent-connection-watch for workspaces (#24507)
<!--

If you have used AI to produce some or all of this PR, please ensure you have read our [AI Contribution guidelines](https://coder.com/docs/about/contributing/AI_CONTRIBUTING) before submitting.

-->

relates to GRU-18  
  
Adds basic implementation for Workspace Agent Connection Watch and tests.  
  
Missing are handling of logs.
2026-05-20 13:09:11 -04:00
Danny Kopping 44b1edd4da fix: unify key-ops audit shape and surface per-key detail (#25534)
Adding missed commit from https://github.com/coder/coder/pull/25484

This formats the audit logs correctly

![image.png](https://app.graphite.com/user-attachments/assets/598d018b-cdf5-4a2c-8321-24ba2c650a1a.png)



<!--

If you have used AI to produce some or all of this PR, please ensure you have read our [AI Contribution guidelines](https://coder.com/docs/about/contributing/AI_CONTRIBUTING) before submitting.

-->
2026-05-20 17:33:26 +02:00
Michael Suchacz e105e3af45 test: cover personal skill API (#25365)
> Mux updated this PR on behalf of Mike.

## Stack Context

This PR is the API test coverage slice in the experimental personal
skills stack. The storage, schema, permissions, API, and SDK
implementation merged in #25363.

Stack order:
1. #25362 personal skill resolver
2. #25363 storage, permissions, API, and SDK
3. #25365 API test coverage
4. #25366 chattool and chatd integration
5. #25066 settings UI and docs
6. #25386 personal skills slash menu

## What?

Adds API and audit tests for personal skill CRUD, validation failures,
limits, authorization, soft-delete cleanup, and audit content tracking.

This PR is now test-only. It does not include migrations, generated
database code, or API implementation changes.

## Why?

The feature touches storage, permissions, and audit behavior. These
tests make the server behavior reviewable and protected without
re-reviewing the implementation that already merged in #25363.

## Validation

- `go test ./coderd -run '^(TestUserSkill|TestPatchUserSkill)' -count=1`
- `go test ./enterprise/coderd -run
'^TestUserSkillAuditDiffTracksContent$' -count=1`
- pre-commit hook via `gt modify --no-edit`
2026-05-20 11:27:09 +02:00
Danny Kopping dd3223451b feat: add AI providers HTTP CRUD handlers (#24894) 2026-05-20 10:21:36 +02:00
Michael Suchacz 5a8d0016a5 feat: add personal skill storage, API, and SDK (#25363)
> Mux updated this PR on behalf of Mike.

## Stack Context

This PR is the storage, permissions, API, and SDK layer for experimental
personal skills. #25362 has landed on `main`, so this branch is
restacked directly on `main`.

Stack order:
1. #25363 storage, permissions, API, and SDK
2. #25365 API test coverage
3. #25366 chattool and chatd integration
4. #25066 settings UI and docs
5. #25386 personal skills slash menu

## What?

Adds the `user_skills` database table, generated queries, RBAC resources
and scopes, audit resource handling, experimental user-scoped CRUD
endpoints, SDK types, and generated API/site types.

Follow-up review and restack fixes:
- Enforce a bounded personal skill description in parser and database
constraints.
- Return `403 Forbidden` for unauthorized create and update attempts.
- Return explicit conflict responses when soft-deleted users are
targeted.
- Keep user admins out of personal skills, while site owners can read
and delete but not create or update.
- Document trigger-raised constraint names and keep schema constants
covered by tests.
- Reuse `UserSkillMetadata` in the full `UserSkill` SDK response type.
- Generate user skill IDs in Go instead of relying on a database
default.
- Rebase on latest `main` and renumber the user skills migration to
`000502_user_skills`.

## Why?

Personal skills need durable user-owned storage with owner
authorization, limited site-owner moderation, and a hidden API surface
before chatd can consume them.

## Validation

- `make gen`
- `go test ./coderd/database -run '^TestUserSkillSchemaConstants$'
-count=1`
- `go test ./coderd/database/dbauthz -run
'^TestMethodTestSuite/TestUserSkills$' -count=1`
- `go test ./coderd -run '^TestPatchUserSkill$' -count=1`
- `go test ./codersdk ./coderd/database/db2sdk`
- `make lint`
- pre-commit hook on `97fd58108d`
2026-05-20 00:09:09 +02:00
Steven Masley 51b531f5b3 chore: 'go generate' mockgen to use go tool wrapper (#25490)
Calling `mockgen` relies on the executable in the `$PATH`. Using `go
tool` uses the one defined in `go.mod`
2026-05-19 14:53:13 +00:00
Danielle Maywood 170a6e1fe9 feat: add chat sharing foundation (#25041) 2026-05-18 22:32:05 +01:00
Yevhenii Shcherbina 2732378da2 feat: audit group AI budget mutations (#25374)
Relates to
https://linear.app/codercom/issue/AIGOV-284/add-group-budgets-table-and-crud-api

Adds audit-log support for `group_ai_budget` mutations. Without it, an
admin could silently lower a spend limit from `$500` to `$50` or delete
a budget entirely, with no record of who performed the action.

Both write (`create-or-update`) and delete actions now produce audit log
entries, including before/after diffs for `spend_limit_micros`.

Depends on #25203.

## Old Version
<img width="1340" height="456" alt="image"
src="https://github.com/user-attachments/assets/e9ff52fb-a905-4aef-a4ee-7cdc58e68b75"
/>

## New Version (see
https://github.com/coder/coder/pull/25374/changes/9d22833de87cc106c24142c1d471a3f71872bf67)
<img width="1347" height="496" alt="image"
src="https://github.com/user-attachments/assets/1b9bbfa1-f86d-48e3-a0b1-266eb76f851f"
/>
2026-05-18 15:17:20 -04:00
Marcin Tojek 38772bdb7c refactor: remove cache tokens from ExtraTokenTypes (#25118)
Fixes: https://github.com/coder/aibridge/issues/243

> Generated with [Coder Agents](https://coder.com/agents)
2026-05-18 11:30:19 +02:00
Danny Kopping b7a282d544 fix(enterprise/coderd/prebuilds): downgrade reconciliation stats log to debug (#25340)
*Disclaimer: implemented by a Coder Agent using Claude Opus 4.6*

The `reconciliation stats` log line runs on every reconciliation tick
(every 5 minutes by default), even when there are no presets to
reconcile. In a steady-state installation without prebuilds activity,
this is the only log line that persistently shows up at info level.
Demote it to `debug` so the steady-state log output stays quiet.

Before:

```
2026-05-14 15:01:25.085 [info]  coderd.prebuilds: reconciliation stats  elapsed=1.649153ms  presets_total=0  presets_reconciled=0
```

After: same line is emitted at `debug` level and is suppressed at the
default info log level.
2026-05-18 10:47:41 +02:00
Callum Styan 191dd230ae feat: add agentfake scaletest subcommand (#25072)
This PR builds on top of https://github.com/coder/coder/pull/25070 to
add a way of running the larger "fake agent" manager via the existing
CLI, pulling in the URL/credentials already set.

With this, we can run a pod per scaletest region to act as all the
workspaces in that region.


This is in a new subcommand `scaletest agentfake` currently.

---------

Signed-off-by: Callum Styan <callumstyan@gmail.com>
2026-05-15 14:36:54 -07:00
Danny Kopping 985ae9ecd5 feat(enterprise/dbcrypt): encrypt ai_providers and ai_provider_keys at rest (#25326) 2026-05-15 15:42:02 +02:00
Ethan a59b951565 test: skip stale notification chatd flakes (#25376)
These chatd tests are flaking for the same stale control-notification
race tracked by CODAGT-353, so this change skips the newly reflaking
advisor-chain and `TestPatchChatMessage/ChangesModel` tests and rewrites
the older `TODO(hugodutka)` skips to point at the same root cause. This
keeps the known flakes documented consistently until the chatd
notification-flow refactor lands.

Closes CODAGT-427
Closes https://github.com/coder/internal/issues/1510
2026-05-15 17:36:48 +10:00
Callum Styan 81212470fd feat: implement basic (MVP version) of a fake agent + manager (#25070)
This PR introduces a "fake agent" + manager, which can be used during
scaletests to run a single executable that acts as many workspace
agents. The goals of these are to provide a much lighter weight
implementation of a workspace in terms of resource cost and startup time when executing scaletests.

---------

Signed-off-by: Callum Styan <callumstyan@gmail.com>
Co-authored-by: Mux <noreply@coder.com>
2026-05-14 14:46:36 -07:00
Yevhenii Shcherbina 238968cfa0 feat: add per-group AI budget table and endpoints (#25203)
Closes
https://linear.app/codercom/issue/AIGOV-284/add-group-budgets-table-and-crud-api

## Summary

Adds the `group_ai_budgets` table and the following endpoints:

- `GET /api/v2/groups/{group}/ai/budget`
- `PUT /api/v2/groups/{group}/ai/budget`
- `DELETE /api/v2/groups/{group}/ai/budget`

Each group may have at most one budget row. If no row exists, no budget
is enforced.

### Feature gate
  
Added `RequireFeatureMW(FeatureAIBridge)` on the `/ai/budget` sub-route.

## RBAC

Authorization reuses `rbac.ResourceGroup` with the existing
`.InOrganization(...).WithID(...)` scoping model.

The `dbauthz` wrappers load the parent `groups` row and authorize
against it.

No new resource type is introduced. As a result, anyone with
`group:update` permissions (Owner, OrgAdmin, or UserAdmin within the
organization) can manage AI budgets for that group.

## Read access for group members

`database.Group.RBACObject()` grants `policy.ActionRead` to all members
of the group through the group ACL:

```go
func (g Group) RBACObject() rbac.Object {
	return rbac.ResourceGroup.WithID(g.ID).
		InOrg(g.OrganizationID).
		// Group members can read the group.
		WithGroupACL(map[string][]policy.Action{
			g.ID.String(): {
				policy.ActionRead,
			},
		})
}
```

Because the `GET` endpoint authorizes against the same loaded `Group`
object, any group member can call:

```text
GET /api/v2/groups/{group}/ai/budget
```

`PUT` and `DELETE` remain admin-only. The group ACL grants only
`ActionRead`, so write operations continue to require role-based
`group:update` permissions.

## Alternative considered

A dedicated `rbac.ResourceGroupAiBudget` resource would allow budget
management to be separated from general group administration.

We decided not to add that complexity for now.
2026-05-14 15:54:37 -04:00
Danielle Maywood 9ddfafe2b1 feat: add chat ACL database foundation (#25080) 2026-05-14 17:18:50 +01:00
Danny Kopping 841b777ccd feat: add ai_providers table, queries, dbauthz, audit, RBAC (#24892) 2026-05-14 16:10:46 +02:00
Yevhenii Shcherbina b5e1ea33d8 feat: add AI budget policy and period deployment config (#25122)
Closes
https://linear.app/codercom/issue/AIGOV-283/add-deployment-config-for-ai-budget-policy-and-period

Adds `CODER_AI_BUDGET_POLICY` and `CODER_AI_BUDGET_PERIOD` deployment
options for AI Governance cost controls.
2026-05-12 10:48:36 -04:00
Kyle Carberry 5a5cd79c4c fix: drop buffered chat parts after their durable message commits (#25164) 2026-05-12 00:30:38 -04:00
Kyle Carberry 0ed57ee343 fix(coderd/x/chatd): checkpoint buffered message_parts to avoid stale replay (#25145) 2026-05-11 17:27:03 -04:00
Steven Masley 19573e8aee feat!: patchTemplateMeta to use optional fields (#24984)
Closes https://github.com/coder/coder/issues/13112

**Breaking Change**: Removed status code `StatusNotModified` when no
diffs occur in a patch. Now the patch is always applied and a template
is always returned.
2026-05-11 12:43:52 -05:00