Commit Graph

262 Commits

Author SHA1 Message Date
Susana Ferreira a613ffa3d6 chore: integrate metrics scanner into Makefile (#21465)
## Description

This PR wires up the metrics scanner in the Makefile to automatically regenerate metrics documentation when source files change.

## Changes

* Add Makefile target `scripts/metricsdocgen/generated_metrics` to run the AST scanner to generate the metrics file
* Update `docs/admin/integrations/prometheus.md` Makefile target to depend on `scripts/metricsdocgen/generated_metrics`
* Add `scripts/metricsdocgen/README.md` documenting the metrics generation process

Closes: https://github.com/coder/coder/issues/13223
2026-02-13 12:31:33 +00:00
Marcin Tojek 456c0bced9 fix: enable strict mode for swagger generation & upgrade swag (#21975)
Adds a Go wrapper (`scripts/apidocgen/swaginit/main.go`) that calls
swag's Go API with `Strict: true`. The `--strict` flag isn't available
in swag's CLI in any version, so the wrapper is the only way to enable
it.

Also upgrades swag from v1.16.2 to v1.16.6 (better generics support,
precise numeric formats, `x-enum-descriptions`, CVE-2024-45338 fix).
2026-02-06 13:04:35 +01:00
Dean Sheather bcc57632dd ci: split lint-actions into separate job to reduce flakes (#21834)
## Summary

The `lint/actions/zizmor` target flakes in CI due to network
connectivity issues when running on depot runners
(https://github.com/coder/internal/issues/1233). The zizmor tool needs
to reach GitHub's API but intermittently fails with "Connection refused"
errors.

## Changes

- Creates a new `lint-actions` CI job that only runs when `.github/**`
files are touched (using existing `ci` filter)
- Removes zizmor from the main `lint` job  
- Uses a Makefile conditional to include actionlint in `make lint`
locally but skip it in CI (where `lint-actions` handles it)

This reduces unnecessary flake exposure for PRs that don't modify GitHub
Actions files.

## Testing

- `actionlint` passes on the modified ci.yaml
- Verified Makefile conditional works: actionlint included locally,
skipped when `CI=true`

Fixes https://github.com/coder/internal/issues/1233
2026-02-03 00:32:09 +11:00
blinkagent[bot] d5296a4855 chore: add lint/migrations to detect hardcoded public schema (#21496)
## Problem

Migration 000401 introduced a hardcoded `public.` schema qualifier which
broke deployments using non-public schemas (see #21493). We need to
prevent this from happening again.

## Solution

Adds a new `lint/migrations` Make target that validates database
migrations do not hardcode the `public` schema qualifier. Migrations
should rely on `search_path` instead to support deployments using
non-public schemas.

## Changes

- Added `scripts/check_migrations_schema.sh` - a linter script that
checks for `public.` references in migration files (excluding test
fixtures)
- Added `lint/migrations` target to the Makefile
- Added `lint/migrations` to the main `lint` target so it runs in CI

## Testing

- Verified the linter **fails** on current `main` (which has the
hardcoded `public.` in migration 000401)
- Verified the linter **passes** after applying the fix from #21493

```bash
# On main (fails)
$ make lint/migrations
ERROR: Migrations must not hardcode the 'public' schema. Use unqualified table names instead.

# After fix (passes)
$ make lint/migrations
Migration schema references OK
```

## Depends on

- #21493 must be merged first (or this PR will fail CI until it is)

---------

Signed-off-by: Danny Kopping <danny@coder.com>
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Co-authored-by: Danny Kopping <danny@coder.com>
2026-01-15 14:17:16 +02:00
Danny Kopping 7d558e76e9 fix: make make test runnable again (#21251)
Signed-off-by: Danny Kopping <danny@coder.com>
2026-01-13 10:36:06 +00:00
Danny Kopping 40adf91cb0 feat: add profiling options to tests in Makefile (#21488)
Usage example:

```bash
$ make test TEST_CPUPROFILE=cpu.prof TEST_MEMPROFILE=mem.prof TEST_PACKAGES=./coderd
```

Note that `TEST_PACKAGES` has to be specified, otherwise you get `cannot
use -{cpu,memory}profile flag with multiple packages`.

Signed-off-by: Danny Kopping <danny@coder.com>
2026-01-13 10:53:09 +02:00
Cian Johnston b116d22c5f chore: manage tool versions in go.mod (#21455)
Go 1.24 adds [tool
dependencies](https://go.dev/doc/modules/managing-dependencies#tools).
This allows us to track versions of tools in our `go.mod` instead of
sprinkling various `go run` commands throughout our codebase.

NOTE: there are still various hard-coded `go install` commands in our
dogfood Dockerfile. As that list is likely severely outdated, will leave
that for a separate PR.
2026-01-08 16:25:28 +00:00
Spike Curtis 8ea9f587e8 chore: add import formatting to make fmt/go (#21453)
Modifies `make fmt/go` to also use https://github.com/daixiang0/gci to format our imports to a standard format.

It introduces a new shell script to do the formatting so that our formatting tools are in one place.
2026-01-08 15:36:03 +04:00
Sas Swart 2840fdcb54 feat(agent): add agent socket API (#20717)
relates to: https://github.com/coder/internal/issues/1094

This is number 2 of 5 pull requests in an effort to add agent script
ordering. It adds a drpc API that is exposed via a local socket. This
API serves access to a lightweight DAG based dependency manager that was
inspired by systemd.

In follow-up PRs:

* This unit manager will be plumbed into the workspace agent struct.
* CLI commands will use this agentsocket api to express dependencies
between coder scripts

I used an LLM to produce some of these changes, but I have conducted
thorough self review and consider this contribution to be ready for an
external reviewer.
2025-11-21 13:09:27 +02:00
Danny Kopping 2294c55bd9 chore: graduate aibridged* packages out of experimental (#20522)
<!--

If you have used AI to produce some or all of this PR, please ensure you have read our [AI Contribution guidelines](https://coder.com/docs/about/contributing/AI_CONTRIBUTING) before submitting.

-->
2025-10-29 07:00:24 -06:00
Spike Curtis af3ff825a1 test: track postgres database creation by package and test name (#20492)
Adds columns to track package and test name to test_databases table, and populates them as databases are created using the Broker.

In order to seamlessly work with existing `coder_database` databases with the old schema, the SQL that creates the table and columns is additive and idempotent, so we run it every time we initialize the Broker (once per test binary execution).

We include a transaction level advisorly lock to prevent deadlocks before attempting to alter the schema. I was seeing deadlocks without this.
2025-10-27 14:31:32 +04:00
Sas Swart 6c621364f8 feat: add a dependency management graph for agents (#20208)
Relates to https://github.com/coder/internal/issues/1093

This is the first of N pull requests to allow coder script ordering.
It introduces what is for now dead code, but paves the way for various
interfaces that allow coder scripts and other processes to depend on one
another via CLI commands and terraform configurations.

The next step is to add reactivity to the graph, such that changes in
the status of one vertex will propagate and allow other vertices to
change their own statuses.

Concurrency and stress testing yield the following:

CPU Profile:
<img width="1512" height="862" alt="Screenshot 2025-10-17 at 10 38 52"
src="https://github.com/user-attachments/assets/f46cf1a2-a0b2-4c02-81a0-069798108ee5"
/>

Mem Profile:
<img width="1512" height="862" alt="Screenshot 2025-10-17 at 10 38 01"
src="https://github.com/user-attachments/assets/45be1235-fff6-45ba-a50d-db9880377bd0"
/>

Predictably, lock contention and memory allocation are the largest
components of this system under stress. Nothing seems untoward.
2025-10-24 16:18:16 +02:00
Spike Curtis 2ded9d6a73 test: remove external compilation of db cleaner entirely (#20240)
fixes https://github.com/coder/internal/issues/1026

Thru a (perhaps too-) clever hack of `init()` functions, I've managed to remove the need to separately compile the cleaner binary. This should fix the flakes we are seeing were the binary compilation takes 10s of seconds on macOS.  The cleaner is encorporated directly into the test binary and we self-exec as the subprocess.
2025-10-10 10:02:03 +04:00
Spike Curtis 2bd11dfb15 test: precompile the db cleaner on make test (#20000)
We're seeing some timeouts from starting the db cleaner, e.g. https://github.com/coder/internal/issues/1026

My suspicion is that in CI the go build cache might not be warm, and so it can take a while to compile and run the dbcleaner subprocess.

This fix builds the cleaner once prior to starting a test run via `make test` to ensure we have a warm cache.
2025-09-30 10:42:06 +04:00
Thomas Kosiewski d0db9ec88f feat: add multi-scope support to API keys (#19917)
# Canonicalize API Key Scopes

This PR introduces canonical API key scopes with a `coder:` namespace prefix to avoid collisions with low-level resource:action names. It:

1. Renames special API key scopes in the database:
   - `all` → `coder:all`
   - `application_connect` → `coder:application_connect`

2. Adds support for a new `scopes` field in the API key creation request, allowing multiple scopes to be specified while maintaining backward compatibility with the singular `scope` field.

3. Updates the API documentation to reflect these changes, including the new endpoint for listing public API key scopes.

4. Ensures backward compatibility by mapping between legacy and canonical scope names in relevant code paths.
2025-09-26 11:56:34 +02:00
Thomas Kosiewski 4bda39585d feat: add external API key scopes (#19916)
# Add support for low-level API key scopes

This PR adds support for fine-grained API key scopes based on RBAC resource:action pairs. It includes:

1. A new endpoint `/api/v2/auth/scopes` to list all public low-level API key scopes
2. Generated constants in the SDK for all public scopes
3. Tests to verify scope validation during token creation
4. Updated API documentation to reflect the expanded scope options

The implementation allows users to create API keys with specific permissions like `workspace:read` or `template:use` instead of only the legacy `all` or `application_connect` scopes.



Fixes #19847
2025-09-26 11:43:32 +02:00
Danny Kopping fc9bff7107 feat: add aibridged package (#19797)
Addresses https://github.com/coder/internal/issues/987
2025-09-25 15:40:25 +02:00
Danny Kopping 615585d5d1 feat: add aibridgedserver pkg (#19902) 2025-09-25 13:32:16 +02:00
Thomas Kosiewski adb7521066 feat: generate RBAC scope name constants (#19896)
# Generate RBAC scope name constants

This PR adds a new generated file `coderd/rbac/scopes_constants_gen.go` that contains typed constants for all RBAC scope names in the format `Scope<Resource><Action>`. For example, `ScopeWorkspaceRead` for the scope "workspace:read".

These constants make it easier to reference specific scopes in code without using string literals, improving type safety and making refactoring easier.

The PR:
- Adds a new template file `scripts/typegen/scopenames.gotmpl`
- Updates the typegen script to support generating scope name constants
- Updates the Makefile to include the new generated file in build targets
2025-09-24 18:40:36 +02:00
Thomas Kosiewski acc0890dce feat: add lint check for API key scope enum completeness (#19862)
Added a script/linter to ensure all `policy.RBACPermissions` entries are part of the `api_key_scope` enumerated in the `coderd/database/dump.sql` file.

Fixes #19846
2025-09-24 18:06:16 +02:00
Danny Kopping 0ece153a0e chore: define aibridged protobuf specification (#19795) 2025-09-24 17:05:04 +10:00
Spike Curtis 72f58c0483 fix: limit test parallelism in make test (#19465)
In order to get `make test` to reliably pass again on our dogfood
workspaces, we're having to resort to setting parallelism.

It also reworks our CI to call the `make test` target, instead of
rolling a different command.

Behavior changes:
 * sets 8 packages x 8 tests in parallel by default on `make test`
* by default, removes the `-short` flag. In my testing it makes only a
few seconds difference on ~200s, or 1-2%
* by default, removes the `-count=1` flag that busts Go's test cache.
With a fresh cache and no code changes, `make test` executes in ~15
seconds.

Signed-off-by: Spike Curtis <spike@coder.com>
2025-08-21 16:37:31 +04:00
Dean Sheather 8d0bc485df chore: add actionlint and zizmor linters (#19459) 2025-08-21 22:14:43 +10:00
Danielle Maywood 5e84d257b7 refactor: convert workspacesdk.AgentConn to an interface (#19392)
Fixes https://github.com/coder/internal/issues/907

We convert `workspacesdk.AgentConn` to an interface and generate a mock
for it. This allows writing `coderd` tests that rely on the agent's HTTP
api to not have to set up an entire tailnet networking stack.
2025-08-20 10:00:44 +01:00
Ethan d7bdb3cdef ci: add paralleltestctx to lint/go (#19369)
Closes https://github.com/coder/internal/issues/884

We're adding this as a `go run` in `lint/go` for now, since adding it to
golangci-lint ourselves involves recompiling golangci-lint and then
running that new binary. I'll look into proposing it being added to the
public golangci-lint linters.

Doesn't appear to cause the lint ci job to take any longer, which is
nice.
2025-08-15 16:16:18 +10:00
Jakub Domeracki dc0919da33 feat: sign coder binaries with the release key using GPG (#18774)
### Description
This PR introduces GPG signing for all Coder *slim-binaries*.
Detached signatures will allow users to verify the integrity and
authenticity of the binaries they download.

### Changes
  * `scripts/sign_with_gpg.sh`: New script to sign a given binary
     using GPG. It imports the release key, signs the binary, and
     verifies the signature.
   * `scripts/build_go.sh`: Updated to call `sign_with_gpg.sh` when the
     `CODER_SIGN_GPG` environment variable is set to 1.
   * `.github/workflows/release.yaml`: The` CODER_SIGN_GPG` environment
     variable is now set to 1 during the release build, enabling GPG
     signing for all release binaries.
   * `.github/workflows/ci.yaml`: The `CODER_SIGN_GPG` environment
     variable is now set to 1 during the CI build, enabling GPG
     signing for all CI binaries.
* `Makefile`: Detached signatures are moved to the `/site/out/bin/
`directory
2025-07-09 11:53:27 +02:00
Hugo Dutka 3c2f3d640b chore: remove dbmem (#18803)
Remove the in-memory database. Addresses #15109.
2025-07-09 09:46:31 +02:00
blink-so[bot] 2c95a1dd71 chore: update gofumpt from v0.4.0 to v0.8.0 (#18652) 2025-07-03 11:28:00 -06:00
Thomas Kosiewski 4dcf0c3e7e docs: add comprehensive development documentation (#18646)
# Organize Development Documentation into Separate Files

This PR reorganizes the development documentation by splitting the monolithic CLAUDE.md file into multiple focused documents. The main file now provides a concise overview with essential commands and critical patterns, while importing detailed content from specialized guides.

Key improvements:
- Created separate documentation files for specific domains:
  - Database development patterns
  - OAuth2 implementation guidelines
  - Testing best practices
  - Troubleshooting common issues
  - Development workflows and guidelines
- Restructured the main CLAUDE.md to be more scannable with improved formatting
- Added quick-reference tables for common commands
- Maintained all existing content while making it more accessible
- Highlighted critical patterns that must be followed

This organization makes the documentation more maintainable and easier to navigate, allowing developers to quickly find relevant information for their specific tasks.
2025-07-03 18:51:23 +02:00
Dean Sheather ae3882a600 chore: move all images to new GCP project (#18324) 2025-06-11 13:06:31 +00:00
Hugo Dutka a7e828593f chore: retry failing tests in CI (#17681)
This PR introduces failing test retries in CI for e2e tests, Go tests
with the in-memory database, Go tests with Postgres, and the CLI tests.
Retries are not enabled for race tests.

The goal is to reduce how often flakes disrupt developers' workflows.
2025-05-06 16:53:26 +02:00
Danny Kopping b3cc8e56d2 chore: set timezone on all golden file make targets (#17533)
We replace timestamps in our golden files to keep the values constant.

However, if a non-UTC timezone is used then the timestamp will still be
replaced but the whitespace will be messed up (since it was aligned to
the original value).


![image](https://github.com/user-attachments/assets/b7ebf615-5b41-41bb-8939-682a45a61952)

Therefore we must force a timezone when generating golden files.

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
2025-04-23 11:26:46 +00:00
Mathias Fredriksson 35553a5815 chore(Makefile): fix incorrect redirection of output (#17532) 2025-04-23 11:01:28 +00:00
Michael Suchacz 06d39151dc feat: extend request logs with auth & DB info (#17304)
Closes #16903
2025-04-15 13:27:23 +02:00
Michael Suchacz ce22de8d15 feat: log long-lived connections acceptance (#17219)
Closes #16904
2025-04-08 08:30:05 +00:00
Mathias Fredriksson 6bf22f8dc6 fix(Makefile): fix dcspec gen dependencies and hide error output (#17043) 2025-03-24 11:41:45 +00:00
Mathias Fredriksson 3b6bee9676 chore(Makefile): fix apidoc dependencies (#17042)
Our apidoc generation had some circular dependencies, this change splits
them up into separate Makefile targets.
2025-03-21 17:16:17 +02:00
Mathias Fredriksson f4b6f429c6 chore(Makefile): fix dependencies and timestamps (#17040)
This change should reduce "infinite" dependency cycles.

I added some unnecessary `touch`es for completeness.
2025-03-21 15:33:07 +02:00
Mathias Fredriksson b79167293c chore(Makefile): update golden files as part of make gen (#17039)
Updating golden files is an unnecessary extra step in addition to gen
that is easily overlooked, leading to the developer noticing the issue
in CI leading to lost developer time waiting for tests to complete.
2025-03-21 13:04:30 +00:00
Cian Johnston 13a3ddd964 fix(agent/agentcontainers): generate devcontainer metadata from schema (#16881)
Adds new dcspec package containing automatically generated devcontainer schema (using glideapps/quicktype).
2025-03-18 13:00:21 +00:00
Cian Johnston 09dd69a7e8 chore(dogfood): include multiple templates under dogfood/ (#16846)
* Renames `dogfood/contents` to `dogfood/coder`.
* Moves `coder-envbuilder` to `dogfood/coder-envbuilder`.
* Updates `dogfood/main.tf` to push `coder-envbuilder` template.
* Replaces hard-coded organization IDs with
`data.coderd_organization.default.id`.
2025-03-11 13:17:40 +00:00
ケイラ dfa33b11d9 chore: run make clean on workspace startup (#16660) 2025-02-24 10:43:03 -07:00
Sas Swart 71cbf735e5 feat(coderd): add support for presets to the coder API (#16526)
This pull request builds on the existing migrations and queries to add
support for presets to the coder API.
2025-02-12 14:41:14 +02:00
Cian Johnston 238b638591 ci: add missing files to gen/mark-fresh (#16504) 2025-02-10 12:45:44 +00:00
Cian Johnston 31b1ff7d3b feat(agent): add container list handler (#16346)
Fixes https://github.com/coder/coder/issues/16268

- Adds `/api/v2/workspaceagents/:id/containers` coderd endpoint that allows listing containers
visible to the agent. Optional filtering by labels is supported.
- Adds go tools to the `coder-dylib` CI step so we can generate mocks if needed
2025-02-10 11:29:30 +00:00
Thomas Kosiewski 1336925c9f feat(flake.nix): switch dogfood dev image to buildNixShellImage from dockerTools (#16223)
Replace Depot build action with Nix for Nix dogfood image builds

The dogfood Nix image is now built using Nix's native container tooling instead of Depot. This change:

- Adds Nix setup steps to the GitHub Actions workflow
- Removes the Dockerfile.nix in favor of a Nix-native container build
- Updates the flake.nix to support building Docker images
- Introduces a hash file to track Nix-related changes
- Updates the vendorHash for Go dependencies

Change-Id: I4e011fe3a19d9a1375fbfd5223c910e59d66a5d9
Signed-off-by: Thomas Kosiewski <tk@coder.com>
2025-01-28 16:38:37 +01:00
ケイラ 5f4ff58f84 fix: use pre-built binary instead of go run in e2e tests (#16236)
Using `go run` inside of a test is fragile, because it means we have to
wait for `go` to compile the binary while also constrained on resources
by the fact that Playwright and coderd are already running. We should
instead compile a coder binary for the current platform before the tests
and use it directly.
2025-01-23 09:45:50 -07:00
Danny Kopping 5b72a4376d chore: improve CI reliability (#16169)
We have an effort underway to replace `dbmem` (#15109), and consequently
we've begun running our full test-suite (with Postgres) on all supported
OSs - Windows, MacOS, and Linux, since #15520.

Since this change, we've seen a marked decrease in the success rate of
our builds on `main` (note how the Windows/MacOS failures account for
the vast majority of failed builds):


![image](https://github.com/user-attachments/assets/a02c15b7-037d-428a-a600-2aed60553ac0)

We're still investigating why these OSs are a lot less reliable. It's
likely that the VMs on which the builds are run have different
characteristics from our Ubuntu runners such as disk I/O, network
latency, or something else.

**In the meantime, we need to start trusting CI failures in `main`
again, as the current failures are too noisy / vague for us to
correct.**

We've also considered hosting our own runners where possible so we can
get OS-level observability to rule out some possibilities.

See the [meeting
notes](https://www.notion.so/coderhq/CI-Investigation-Call-Notes-17dd579be59280d8897cc9fe4bb46695?pvs=6&utm_content=17dd579b-e592-80d8-897c-c9fe4bb46695&utm_campaign=T1ZPT2FL0&n=slack&n=slack_link_unfurl)
where we linked into this for more detail.

This PR introduces several changes:

1. Moves the full test-suite with Postgres on Windows/MacOS to the
`nightly-gauntlet` workflow
tradeoff: this means that any regressions may be more difficult to
discover since we merge to main several times a day
2. Run only the CLI test-suite on each PR / merge to `main` on
Windows/MacOS
3. `test-go` is still running the full test-suite against all OSs
(including the CLI ones), but will soon be removed once #15109 is
completed since it uses `dbmem`
4. Changes `nightly-gauntlet` to run at 4AM: we've seen several
instances of the runner being stopped externally, and we're _guessing_
this may have something to do with the midnight UTC execution time, when
other cron jobs may run
5. Removes the existing `nightly-gauntlet` jobs since they haven't
passed in a long time, indicating that nobody cares enough to fix them
and they don't provide diagnostic value; we can restore them later if
necessary

I've manually run both these new workflows successfully:

- `ci`:
https://github.com/coder/coder/actions/runs/12825874176/job/35764724907
- `nightly-gauntlet`:
https://github.com/coder/coder/actions/runs/12825539092

---------

Signed-off-by: Danny Kopping <danny@coder.com>
Co-authored-by: Muhammad Atif Ali <atif@coder.com>
2025-01-20 07:06:33 +00:00
Mathias Fredriksson 08ffcb74c6 test(Makefile): retry pulling postgres in test-postgres-docker (#16178) 2025-01-17 15:22:50 +00:00
Mathias Fredriksson 7f46e3b1e0 test(Makefile): fix postgresql memory usage (#16170) 2025-01-17 16:07:11 +02:00