* https://github.com/coder/coder/pull/21493
* https://github.com/coder/coder/pull/21496
* https://github.com/coder/coder/pull/21530
NB these commits were originally authored by Blink on behalf of
@dannykopping, so amended to reflect actual authorship.
**Repro/Verification Steps:**
* Created a Coder deployment with a non-public schema via Docker compose
on v2.28.6:
* Created a DB init script under `db-init/01-create-schema.sql` with the
following:
```sql
CREATE SCHEMA IF NOT EXISTS coder AUTHORIZATION coder;
GRANT ALL PRIVILEGES ON SCHEMA coder TO coder;
ALTER ROLE coder SET search_path TO coder;
```
* Mounted above inside the `postgres` container:
```diff
volumes:
- coder_data:/var/lib/postgresql/data
+ - ./db-init:/docker-entrypoint-initdb.d:ro
```
* Edited `CODER_PG_CONNECTION_URL` to update the search path:
```diff
environment:
- CODER_PG_CONNECTION_URL:
"postgresql://${POSTGRES_USER:-username}:${POSTGRES_PASSWORD:-password}@database/${POSTGRES_DB:-coder}?sslmode=disable"
+ CODER_PG_CONNECTION_URL:
"postgresql://${POSTGRES_USER:-username}:${POSTGRES_PASSWORD:-password}@database/${POSTGRES_DB:-coder}?sslmode=disable&search_path=coder"
```
* Brought up the deployment:
```shell
CODER_VERSION=v2.28.6 CODER_ACCESS_URL=http://localhost:7080
POSTGRES_USER=coder POSTGRES_PASSWORD=coder docker compose up`
```
* Created user / template / workspace
* Updated to `v2.29.1`:
* ```shell
CODER_VERSION=v2.29.1 CODER_ACCESS_URL=http://localhost:7080
POSTGRES_USER=coder POSTGRES_PASSWORD=coder docker compose up`
```
* Observed following error:
```
database-1 | 2026-01-21 15:07:17.629 UTC [102] ERROR: relation
"public.workspace_agents" does not exist
coder-1 | Encountered an error running "coder server", see "coder server
--help" for more information
database-1 | 2026-01-21 15:07:17.629 UTC [102] STATEMENT: CREATE INDEX
IF NOT EXISTS workspace_agents_auth_instance_id_deleted_idx ON
public.workspace_agents (auth_instance_id, deleted);
coder-1 | error: connect to postgres: connect to postgres: migrate up:
up: 2 errors occurred:
coder-1 | * run statement: migration failed: relation
"public.workspace_agents" does not exist in line 0: CREATE INDEX IF NOT
EXISTS workspace_agents_auth_instance_id_deleted_idx ON
public.workspace_agents (auth_instance_id, deleted);
coder-1 | (details: pq: relation "public.workspace_agents" does not
exist)
coder-1 | * commit tx on unlock: pq: Could not complete operation in a
failed transaction
coder-1 exited with code 1
```
* Built image locally:
```console
$ make build/coder_$(./scripts/version.sh)_linux_amd64.tag
...
ghcr.io/coder/coder:v2.29.1-devel-e8c482a98a67-amd64
```
* Started with new image:
```shell
CODER_VERSION=v2.29.1-devel-e8c482a98a67-amd64
CODER_ACCESS_URL=http://localhost:7080 POSTGRES_USER=coder
POSTGRES_PASSWORD=coder docker compose up
```
* Observed migrations ran successfully and Coder came up successfully
---------
Signed-off-by: Danny Kopping <danny@coder.com>
Co-authored-by: Danny Kopping <danny@coder.com>
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
relates to: https://github.com/coder/internal/issues/1094
This is number 2 of 5 pull requests in an effort to add agent script
ordering. It adds a drpc API that is exposed via a local socket. This
API serves access to a lightweight DAG based dependency manager that was
inspired by systemd.
In follow-up PRs:
* This unit manager will be plumbed into the workspace agent struct.
* CLI commands will use this agentsocket api to express dependencies
between coder scripts
I used an LLM to produce some of these changes, but I have conducted
thorough self review and consider this contribution to be ready for an
external reviewer.
Adds columns to track package and test name to test_databases table, and populates them as databases are created using the Broker.
In order to seamlessly work with existing `coder_database` databases with the old schema, the SQL that creates the table and columns is additive and idempotent, so we run it every time we initialize the Broker (once per test binary execution).
We include a transaction level advisorly lock to prevent deadlocks before attempting to alter the schema. I was seeing deadlocks without this.
Relates to https://github.com/coder/internal/issues/1093
This is the first of N pull requests to allow coder script ordering.
It introduces what is for now dead code, but paves the way for various
interfaces that allow coder scripts and other processes to depend on one
another via CLI commands and terraform configurations.
The next step is to add reactivity to the graph, such that changes in
the status of one vertex will propagate and allow other vertices to
change their own statuses.
Concurrency and stress testing yield the following:
CPU Profile:
<img width="1512" height="862" alt="Screenshot 2025-10-17 at 10 38 52"
src="https://github.com/user-attachments/assets/f46cf1a2-a0b2-4c02-81a0-069798108ee5"
/>
Mem Profile:
<img width="1512" height="862" alt="Screenshot 2025-10-17 at 10 38 01"
src="https://github.com/user-attachments/assets/45be1235-fff6-45ba-a50d-db9880377bd0"
/>
Predictably, lock contention and memory allocation are the largest
components of this system under stress. Nothing seems untoward.
fixes https://github.com/coder/internal/issues/1026
Thru a (perhaps too-) clever hack of `init()` functions, I've managed to remove the need to separately compile the cleaner binary. This should fix the flakes we are seeing were the binary compilation takes 10s of seconds on macOS. The cleaner is encorporated directly into the test binary and we self-exec as the subprocess.
We're seeing some timeouts from starting the db cleaner, e.g. https://github.com/coder/internal/issues/1026
My suspicion is that in CI the go build cache might not be warm, and so it can take a while to compile and run the dbcleaner subprocess.
This fix builds the cleaner once prior to starting a test run via `make test` to ensure we have a warm cache.
# Canonicalize API Key Scopes
This PR introduces canonical API key scopes with a `coder:` namespace prefix to avoid collisions with low-level resource:action names. It:
1. Renames special API key scopes in the database:
- `all` → `coder:all`
- `application_connect` → `coder:application_connect`
2. Adds support for a new `scopes` field in the API key creation request, allowing multiple scopes to be specified while maintaining backward compatibility with the singular `scope` field.
3. Updates the API documentation to reflect these changes, including the new endpoint for listing public API key scopes.
4. Ensures backward compatibility by mapping between legacy and canonical scope names in relevant code paths.
# Add support for low-level API key scopes
This PR adds support for fine-grained API key scopes based on RBAC resource:action pairs. It includes:
1. A new endpoint `/api/v2/auth/scopes` to list all public low-level API key scopes
2. Generated constants in the SDK for all public scopes
3. Tests to verify scope validation during token creation
4. Updated API documentation to reflect the expanded scope options
The implementation allows users to create API keys with specific permissions like `workspace:read` or `template:use` instead of only the legacy `all` or `application_connect` scopes.
Fixes#19847
# Generate RBAC scope name constants
This PR adds a new generated file `coderd/rbac/scopes_constants_gen.go` that contains typed constants for all RBAC scope names in the format `Scope<Resource><Action>`. For example, `ScopeWorkspaceRead` for the scope "workspace:read".
These constants make it easier to reference specific scopes in code without using string literals, improving type safety and making refactoring easier.
The PR:
- Adds a new template file `scripts/typegen/scopenames.gotmpl`
- Updates the typegen script to support generating scope name constants
- Updates the Makefile to include the new generated file in build targets
Added a script/linter to ensure all `policy.RBACPermissions` entries are part of the `api_key_scope` enumerated in the `coderd/database/dump.sql` file.
Fixes#19846
In order to get `make test` to reliably pass again on our dogfood
workspaces, we're having to resort to setting parallelism.
It also reworks our CI to call the `make test` target, instead of
rolling a different command.
Behavior changes:
* sets 8 packages x 8 tests in parallel by default on `make test`
* by default, removes the `-short` flag. In my testing it makes only a
few seconds difference on ~200s, or 1-2%
* by default, removes the `-count=1` flag that busts Go's test cache.
With a fresh cache and no code changes, `make test` executes in ~15
seconds.
Signed-off-by: Spike Curtis <spike@coder.com>
Fixes https://github.com/coder/internal/issues/907
We convert `workspacesdk.AgentConn` to an interface and generate a mock
for it. This allows writing `coderd` tests that rely on the agent's HTTP
api to not have to set up an entire tailnet networking stack.
Closes https://github.com/coder/internal/issues/884
We're adding this as a `go run` in `lint/go` for now, since adding it to
golangci-lint ourselves involves recompiling golangci-lint and then
running that new binary. I'll look into proposing it being added to the
public golangci-lint linters.
Doesn't appear to cause the lint ci job to take any longer, which is
nice.
### Description
This PR introduces GPG signing for all Coder *slim-binaries*.
Detached signatures will allow users to verify the integrity and
authenticity of the binaries they download.
### Changes
* `scripts/sign_with_gpg.sh`: New script to sign a given binary
using GPG. It imports the release key, signs the binary, and
verifies the signature.
* `scripts/build_go.sh`: Updated to call `sign_with_gpg.sh` when the
`CODER_SIGN_GPG` environment variable is set to 1.
* `.github/workflows/release.yaml`: The` CODER_SIGN_GPG` environment
variable is now set to 1 during the release build, enabling GPG
signing for all release binaries.
* `.github/workflows/ci.yaml`: The `CODER_SIGN_GPG` environment
variable is now set to 1 during the CI build, enabling GPG
signing for all CI binaries.
* `Makefile`: Detached signatures are moved to the `/site/out/bin/
`directory
# Organize Development Documentation into Separate Files
This PR reorganizes the development documentation by splitting the monolithic CLAUDE.md file into multiple focused documents. The main file now provides a concise overview with essential commands and critical patterns, while importing detailed content from specialized guides.
Key improvements:
- Created separate documentation files for specific domains:
- Database development patterns
- OAuth2 implementation guidelines
- Testing best practices
- Troubleshooting common issues
- Development workflows and guidelines
- Restructured the main CLAUDE.md to be more scannable with improved formatting
- Added quick-reference tables for common commands
- Maintained all existing content while making it more accessible
- Highlighted critical patterns that must be followed
This organization makes the documentation more maintainable and easier to navigate, allowing developers to quickly find relevant information for their specific tasks.
This PR introduces failing test retries in CI for e2e tests, Go tests
with the in-memory database, Go tests with Postgres, and the CLI tests.
Retries are not enabled for race tests.
The goal is to reduce how often flakes disrupt developers' workflows.
We replace timestamps in our golden files to keep the values constant.
However, if a non-UTC timezone is used then the timestamp will still be
replaced but the whitespace will be messed up (since it was aligned to
the original value).

Therefore we must force a timezone when generating golden files.
Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Updating golden files is an unnecessary extra step in addition to gen
that is easily overlooked, leading to the developer noticing the issue
in CI leading to lost developer time waiting for tests to complete.
Fixes https://github.com/coder/coder/issues/16268
- Adds `/api/v2/workspaceagents/:id/containers` coderd endpoint that allows listing containers
visible to the agent. Optional filtering by labels is supported.
- Adds go tools to the `coder-dylib` CI step so we can generate mocks if needed
Replace Depot build action with Nix for Nix dogfood image builds
The dogfood Nix image is now built using Nix's native container tooling instead of Depot. This change:
- Adds Nix setup steps to the GitHub Actions workflow
- Removes the Dockerfile.nix in favor of a Nix-native container build
- Updates the flake.nix to support building Docker images
- Introduces a hash file to track Nix-related changes
- Updates the vendorHash for Go dependencies
Change-Id: I4e011fe3a19d9a1375fbfd5223c910e59d66a5d9
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Using `go run` inside of a test is fragile, because it means we have to
wait for `go` to compile the binary while also constrained on resources
by the fact that Playwright and coderd are already running. We should
instead compile a coder binary for the current platform before the tests
and use it directly.
We have an effort underway to replace `dbmem` (#15109), and consequently
we've begun running our full test-suite (with Postgres) on all supported
OSs - Windows, MacOS, and Linux, since #15520.
Since this change, we've seen a marked decrease in the success rate of
our builds on `main` (note how the Windows/MacOS failures account for
the vast majority of failed builds):

We're still investigating why these OSs are a lot less reliable. It's
likely that the VMs on which the builds are run have different
characteristics from our Ubuntu runners such as disk I/O, network
latency, or something else.
**In the meantime, we need to start trusting CI failures in `main`
again, as the current failures are too noisy / vague for us to
correct.**
We've also considered hosting our own runners where possible so we can
get OS-level observability to rule out some possibilities.
See the [meeting
notes](https://www.notion.so/coderhq/CI-Investigation-Call-Notes-17dd579be59280d8897cc9fe4bb46695?pvs=6&utm_content=17dd579b-e592-80d8-897c-c9fe4bb46695&utm_campaign=T1ZPT2FL0&n=slack&n=slack_link_unfurl)
where we linked into this for more detail.
This PR introduces several changes:
1. Moves the full test-suite with Postgres on Windows/MacOS to the
`nightly-gauntlet` workflow
tradeoff: this means that any regressions may be more difficult to
discover since we merge to main several times a day
2. Run only the CLI test-suite on each PR / merge to `main` on
Windows/MacOS
3. `test-go` is still running the full test-suite against all OSs
(including the CLI ones), but will soon be removed once #15109 is
completed since it uses `dbmem`
4. Changes `nightly-gauntlet` to run at 4AM: we've seen several
instances of the runner being stopped externally, and we're _guessing_
this may have something to do with the midnight UTC execution time, when
other cron jobs may run
5. Removes the existing `nightly-gauntlet` jobs since they haven't
passed in a long time, indicating that nobody cares enough to fix them
and they don't provide diagnostic value; we can restore them later if
necessary
I've manually run both these new workflows successfully:
- `ci`:
https://github.com/coder/coder/actions/runs/12825874176/job/35764724907
- `nightly-gauntlet`:
https://github.com/coder/coder/actions/runs/12825539092
---------
Signed-off-by: Danny Kopping <danny@coder.com>
Co-authored-by: Muhammad Atif Ali <atif@coder.com>
RE: https://github.com/coder/coder/issues/15740,
https://github.com/coder/coder/issues/15297
In order to add a graph to the coder frontend to show user status over
time as an indicator of license usage, this PR adds the following:
* a new `api.insightsUserStatusCountsOverTime` endpoint to the API
* which calls a new `GetUserStatusCountsOverTime` query from postgres
* which relies on two new tables `user_status_changes` and
`user_deleted`
* which are populated by a new trigger and function that tracks updates
to the users table
The chart itself will be added in a subsequent PR
---------
Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>
- add `.PHONY` to some jobs where it was missing
- improve the test-e2e job by ensuring the frontend build is up to date
- some small correctness tweaks