Commit Graph

58 Commits

Author SHA1 Message Date
Ethan 4f1043a50a feat(scaletest): add chat scaletest command (#25553)
Adds `coder exp scaletest chat`, a harness for creating Coder Agents
chat load.
Start the mock LLM separately, prepare the scaletest workspaces you want
to target, then run the chat scaletest against the existing
`scaletest-*` fleet selected by the shared workspace targeting flags:

```sh
coder exp scaletest llm-mock --address 127.0.0.1:18080

coder exp scaletest chat --llm-mock-url http://127.0.0.1:18080/v1 --chats-per-workspace 10 --turns 1
coder exp scaletest chat --llm-mock-url http://127.0.0.1:18080/v1 --template docker --target-workspaces 0:10 --chats-per-workspace 1 --turns 10 --turn-start-delay 30s
```

This is the same pattern used by the `workspace-traffic` load generator.

Keeping the fake LLM as a separate process is intentional so it can be
scaled independently from the Coder deployment, which will likely be
necessary as we scale up and up.

This PR is the starting point: it provides the command, mock
provider/model bootstrap, existing workspace selection, chat streaming,
follow-up turns, metrics, and cleanup. Follow-up PRs will add multi-step
turns via tool calls. I'm still a bit iffy on the mechanism I have for
that. It'll likely involve having the runner send some magic strings
that the mock will recognise.


Relates to CODAGT-307
Relates to GRU-48
Relates to https://github.com/coder/scaletest/issues/124

Generated by Mux, but reviewed by a human
2026-05-26 14:19:36 +10:00
Callum Styan 191dd230ae feat: add agentfake scaletest subcommand (#25072)
This PR builds on top of https://github.com/coder/coder/pull/25070 to
add a way of running the larger "fake agent" manager via the existing
CLI, pulling in the URL/credentials already set.

With this, we can run a pod per scaletest region to act as all the
workspaces in that region.


This is in a new subcommand `scaletest agentfake` currently.

---------

Signed-off-by: Callum Styan <callumstyan@gmail.com>
2026-05-15 14:36:54 -07:00
Callum Styan 950660e392 fix(cli): extend exp scaletest cleanup to properly clean up prebuilds (#23628)
Signed-off-by: Callum Styan <callumstyan@gmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-29 10:28:27 -07:00
Jon Ayers 7e63fe68f7 fix: avoid instantiating a logger if provided /dev/null (#24027)
- Adds some additional context to workspace traffic logging
- Fails traffic tests if 0 bytes read from connection
2026-04-03 16:26:14 -05:00
Callum Styan 36665e17b2 feat: add WatchAllWorkspaceBuilds endpoint for autostart scaletests (#22057)
This PR adds a `WatchAllWorkspaces` function with `watch-all-workspaces`
endpoint, which can be used to listen on a single global pubsub channel
for _all_ workspace build updates, and makes use of it in the autostart
scaletest.

This negates the need to use a workspace watch pubsub channel _per_
workspace, which has auth overhead associated with each call. This is
especially relevant in situations such as the autostart scaletest, where
we need to start/stop a set of workspaces before we can configure their
autostart config. The overhead associated with all the watch requests
skews the scaletest results and makes it harder to reason about the
performance of the autostart feature itself.

The autostart scaletest also no longer generates its own metrics nor
does it wait for all the workspaces to actually start via autostart. We
should update the scaletest dashboard after both PRs are merged to
measure autostart performance via the new metrics.



The new function/endpoint and its usage in the autostart scaletest are
gated behind an experiment feature flag, this is something we should
discuss whether we want to enable the endpoint in prod by default or
not. If so, we can remove the experiment.

---------

Signed-off-by: Callum Styan <callumstyan@gmail.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Callum Styan <callum@coder.com>
2026-03-13 20:37:41 -07:00
Steven Masley b1e18f2398 fix: use dynamic parameter resolution in the cli (#21734)
Uses dynamic parameters EvaluateTemplateVersion vs TemplateVersionRichParameters to determine initial parameter state.

Closes https://github.com/coder/coder/issues/19879
2026-02-03 14:10:49 -06:00
Sas Swart 0ebe8e57ad chore: add scaletesting tools for aibridge (#21279)
This pull request adds scaletesting tools for aibridge.

See
https://www.notion.so/Scale-tests-2c5d579be5928088b565d15dd8bdea41?source=copy_link
for information and instructions.

closes: https://github.com/coder/internal/issues/1156
closes: https://github.com/coder/internal/issues/1155
closes: https://github.com/coder/internal/issues/1158
2026-01-15 17:05:46 +02:00
Spike Curtis bddb808b25 chore: arrange imports in a standard way (#21452)
Fixes all our Go file imports to match the preferred spec that we've _mostly_ been using. For example:

```
import (
	"context"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"golang.org/x/xerrors"
	"gopkg.in/natefinch/lumberjack.v2"

	"cdr.dev/slog/v3"
	"github.com/coder/coder/v2/codersdk/agentsdk"
	"github.com/coder/serpent"
)
```

3 groups: standard library, 3rd partly libs, Coder libs.

This PR makes the change across the codebase. The PR in the stack above modifies our formatting to maintain this state of affairs, and is a separate PR so it's possible to review that one in detail.
2026-01-08 15:24:11 +04:00
Spike Curtis 49b34a716a fix: fix slog to always use array of Fields (#21426)
Upgrades to slog v3 which includes a small, but backward incompatible API change to the acceptible call arguments when logging. This change allows us to verify via compile time type checking that arguments are correct and won't cause a panic, as was possible in slog v1, which this replaces (v2 was tagged but never used in coder/coder).

It also updates dependencies that also use slog and were updated.

I've left the `aibridge` dependency as a commit SHA, under the assumption that the team there (cc @pawbana @dannykopping ) will tag and update the dependency soon and on their own schedule.

Other dependencies, I pushed new tags.
2026-01-08 10:29:41 +04:00
Spike Curtis 73253df6bf fix: use separate HTTP clients in scale test load generators (#21288)
While scale testing, I noticed that our load generators send basically
all requests to a single Coderd instance.

e.g. 


![image.png](https://app.graphite.com/user-attachments/assets/e259862a-adf1-47e7-a37b-fd14e420058e.png)


This is because our scale test commands create all `Runner`s using the
same codersdk Client, which means they share an underlying HTTP client.
With HTTP/2 a single TCP session can multiplex many different HTTP
requests (including websockets). So, it creates a single TCP connection
to a single coderd, and then sends all the requests down the one TCP
connections.

This PR modifies the `exp scaletest` load generator commands to create
an independent HTTP client per `Runner`. This means that each runner
will create its own TCP connection. This should help spread the load and
make a more realistic test, because in a real deployment, scaled out
load will be coming over different TCP connections.
2025-12-19 12:22:49 +04:00
Spike Curtis cac6d4ce98 feat: add --max-failures to coder exp scaletest create-workspaces (#21315)
Adds `--max-failures` flag to `coder exp scaletest create-workspaces` so that we can tolerate a few failures without failing the command.

When running our scale test infra, we create Kubernetes Jobs to create the initial cluster workspaces, then we have load-generation jobs that depend on them. At high scale, it's kind of expected that some of the requests will fail: even with 99.9% success, you still expect one failure per 1000. It's useful to be able to carry on with the scale test anyway and proceed to traffic generation.
2025-12-18 11:21:35 +04:00
Ethan bf40d678ec fix(cli): close prebuild runner prometheus server last (#21053)
## Description

Fixes the prebuilds scaletest command where the prometheus server was
being shut down before waiting for metrics to be scraped.

The issue was the defer order - since defers execute in LIFO (last-in,
first-out) order:

**Before (broken):**
1. Register tracing defer (includes wait for prometheus scrape)
2. Register prometheus server defer

Execution order: prometheus closes first, then wait happens (server
already gone!)

**After (fixed):**
1. Register prometheus server defer  
2. Register tracing defer (includes wait for prometheus scrape)

Execution order: wait happens first (server still up), then prometheus
closes.

This matches the pattern used in other scaletest commands.

## Impact

The `coderd_scaletest_prebuild_deletion_jobs_completed` metric (and
potentially others) was always showing 0 because the server shut down
before Prometheus could scrape the final values.

_This PR was generated by [`mux`](https://github.com/coder/mux) and
reviewed by a human._
2025-12-02 12:10:50 +00:00
Spike Curtis 5ea1353d46 feat: add exp scaletest task-status command (#20761)
Adds `coder exp scaletest task-status` subcommand to generate task status update load on the Coder server.
2025-11-19 10:13:32 +04:00
Ethan 6bafbb7bc5 feat(cli): add prebuilds scaletest command (#20600)
Closes https://github.com/coder/internal/issues/914
2025-11-14 18:08:14 +11:00
Spike Curtis dc277618ee chore: refactor flags that target workspaces in scaletest (#20537)
For the https://github.com/coder/internal/issues/913 we are going to be targeting running workspaces. So this PR modularizes the CLI flags and logic that select those targets so we can reuse it.
2025-10-30 11:10:24 +04:00
Kacper Sawicki 1230cacf78 feat(scaletest): extend notifications runner with smtp support (#20222)
This PR extends the scaletest notification runner with SMTP support.

If the `--smtp-api-url` flag is provided, the runner will also watch for SMTP notifications using the specified URL.

#### Changes
- Added a new watcher to retrieve emails sent to the runner user  
- Tracked WebSocket and SMTP latencies separately  
- Updated metrics to include `notification_id` and `notification_type` labels  

#### CLI Flags
- `--smtp-api-url`: Address of the SMTP mock HTTP API used to retrieve email notifications  

#### Metrics
- `notification_delivery_latency_seconds` now includes:
  - `notification_id`
  - `notification_type` (`websocket` or `smtp`)
2025-10-22 12:09:35 +02:00
Kacper Sawicki 7bbeef4999 feat(cli): add mock SMTP server for testing scaletest notifications (#20221)
This PR adds a fake SMTP server for scale testing. It collects emails sent during tests, which you can then check using the HTTP API.

#### Changes
- Added mock SMTP server  
- Added `coder scaletest smtp` CLI command  
- Implemented HTTP API endpoints to retrieve messages by email  
- Added auto-purge to prevent memory issues  

#### HTTP API Endpoints
- `GET /messages?email=<email>` – Get messages sent to an email address  
- `POST /purge` – Clear all messages from memory  

The HTTP API parses raw email messages to extract the **date**, **subject**, and **notification ID**.

Notification IDs are sent in emails like this:
```html
<p>
  <a href="http://127.0.0.1:3000/settings/notifications?disabled=4e19c0ac-94e1-4532-9515-d1801aa283b2"
     style="color: #2563eb; text-decoration: none;">
    Stop receiving emails like this
  </a>
</p>
```

#### CLI
```bash
coder scaletest smtp --host localhost --port 33199 --api-port 8080 --purge-at-count 1000
```

**Flags:**
- `--host`: Host for the mock SMTP and API server (default: localhost)  
- `--port`: Port for the mock SMTP server (random if not specified)  
- `--api-port`: Port for the HTTP API server (random if not specified)  
- `--purge-at-count`: Max number of messages before auto-purging (default: 100000)
2025-10-22 11:14:49 +02:00
Spike Curtis 65335bc7d4 feat: add cli command scaletest dynamic-parameters (#20034)
part of https://github.com/coder/internal/issues/912

Adds CLI command `coder exp scaletest dynamic-parameters`

I've left out the configuration of tracing and timeouts for now. I think I want to do some refactoring of the scaletest CLI to make handling those flags take up less boiler plate.

I will add tracing and timeout flags in a follow up PR.
2025-10-07 21:53:59 +04:00
Kacper Sawicki 0c2eca94f5 feat(cli): add notifications scaletest command (#20092)
Closes https://github.com/coder/internal/issues/984
2025-10-07 10:33:52 +02:00
Ethan 2b4485575c feat(scaletest): add autostart scaletest command (#20161)
Closes https://github.com/coder/internal/issues/911
2025-10-03 20:50:21 +10:00
Ethan 0be922170b feat(cli): add workspace-updates scaletest command (#19905)
Closes https://github.com/coder/internal/issues/889

```
$ coder exp scaletest workspace-updates --workspace-count=4 --power-user-workspaces=2 --template="scratch"
Distribution plan:
  Total workspaces: 4
  Power users: 1 (each owning 2 workspaces = 2 total)
  Regular users: 2 (each owning 1 workspace = 2 total)
Planning workspace...
=== ✔ Queued [0ms]
==> ⧗ Running
==> ⧗ Running
=== ✔ Running [5ms]
==> ⧗ Setting up
=== ✔ Setting up [54ms]
==> ⧗ Detecting persistent resources
=== ✔ Detecting persistent resources [2909ms]
==> ⧗ Cleaning Up
=== ✔ Cleaning Up [5ms]
┌───────────────────────────────────────────┐
│ Workspace Preview                         │
├───────────────────────────────────────────┤
│ RESOURCE                 ACCESS           │
├───────────────────────────────────────────┤
│ null_resource.workspace                   │
│ └─ main (linux, amd64)    coder ssh       │
└───────────────────────────────────────────┘
Creating users...
Running workspace updates scaletest...


Test results:
        Pass:  3
        Fail:  0
        Total: 3

        Total duration: 5.378685938s
        Avg. duration:  3.701307642s

Cleaning up...

Uploading traces...
Waiting 15s for prometheus metrics to be scraped
```
2025-09-30 18:07:05 +10:00
Spike Curtis a30c30724b chore: refactor cli and coderd to use ClientOptions (#19763)
Refactors CLI and coderd to use the ClientBuilder pattern rather than directly instantiating the Client.
2025-09-22 21:02:56 +04:00
Spike Curtis 606ae897b7 chore: refactor to directly create Client in Command Handlers (#19760)
Refactors the CLI to create the `*codersdk.Client` in the handlers. This is groundwork for changing the `rootCmd.InitClient()` to use the new `ClientOption`​s.

It also improves variable locality, scoping the Client to the handler. This makes misuse less likely and reduces the memory allocations to just the command being executed, rather than allocating a Client for every command regardless of whether it is executed.
2025-09-22 17:14:07 +04:00
Ethan 6d9e29beb1 refactor(scaletest): generate user and workspace names if omitted (#19885)
Relates to https://github.com/coder/internal/issues/985.

Some scaletest runners would autogenerate names if they weren't supplied on the config, while others required a name be supplied, and a name was autogenerated in the CLI command handler. This PR unifies the runners to make names and emails optional on each config, and generate them in the scaletest runner if omitted. 

The create user runner in the PR above in the stack will do this too.
2025-09-22 17:25:47 +10:00
Ethan f721f3d9d7 chore: add --disable-direct to coder exp scaletest workspace-traffic --ssh (#19632)
Relates to https://github.com/coder/internal/issues/888

As part of our renewed connection scaletesting efforts, we want to
scaletest coder in a scenario where direct connections aren't available
(relatively common for our customers), and all concurrent connections
are relayed via DERP.

This PR adds a flag, `--disable-direct` that can be included on the
existing`coder exp scaletest workspace-traffic -ssh` to disable direct
connections.
2025-08-29 17:02:13 +10:00
Bruno Quaresma d779126ee3 chore: rollback PR #18081 (#18104)
Rollback https://github.com/coder/coder/pull/18081
2025-05-29 13:12:13 -03:00
Bruno Quaresma 2ec7404197 chore: make owner_name and owner_username consistent (#18081)
We've been using owner_name inconsistently as username. So this PR fixes
it by making the attribute naming more consistent.
2025-05-28 17:25:32 -03:00
Eng Zer Jun 04c33968cf refactor: replace golang.org/x/exp/slices with slices (#16772)
The experimental functions in `golang.org/x/exp/slices` are now
available in the standard library since Go 1.21.

Reference: https://go.dev/doc/go1.21#slices

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2025-03-04 00:46:49 +11:00
Garrett Delfosse 50333d312f feat: add workspace-proxy-url flag to scaletest workspace-traffic (#15920)
This allows the command to target a workspace proxy when appropriate. 

Part of https://github.com/coder/internal/issues/149

So far I've verified the correct url being used with logs:
```
➜  coder git:(f0ssel/workspace-traffic-proxy) ✗ go run ./cmd/coder/main.go exp scaletest workspace-traffic --job-timeout=60s --workspace-proxy-url="https://paris.fly.dev.coder.com"
Running load test...

web url: https://paris.fly.dev.coder.com

...

➜  coder git:(f0ssel/workspace-traffic-proxy) ✗ go run ./cmd/coder/main.go exp scaletest workspace-traffic --job-timeout=60s --workspace-proxy-url="https://paris.fly.dev.coder.com" --app="code-server"
Running load test...

app url: https://paris.fly.dev.coder.com/@f0ssel/scaletest-1.dev/apps/code-server
```
2024-12-18 11:53:01 -05:00
Cian Johnston 6914862903 fix(cli): add check for DisableOwnerWorkspaceExec in scaletest (#14417)
- Adds `--use-host-login` to `coder exp scaletest workspace-traffic`
- Modifies getScaletestWorkspaces to conditionally filter workspaces if `CODER_DISABLE_OWNER_WORKSPACE_ACCESS` is set
- Adds a warning if `CODER_DISABLE_OWNER_WORKSPACE_ACCESS` is set and scaletest workspaces are filtered out due to ownership mismatch.
- Modifies `coderdtest.New` to detect cross-test bleed of `CODER_DISABLE_OWNER_WORKSPACE_ACCESS` and fast-fail.
2024-08-26 12:02:54 +01:00
Cian Johnston 365231b1e5 fix(cli): scaletest: ignore errors syncing output (#13076) 2024-04-26 09:18:33 +01:00
Colin Adler 4d5a7b2d56 chore(codersdk): move all tailscale imports out of codersdk (#12735)
Currently, importing `codersdk` just to interact with the API requires
importing tailscale, which causes builds to fail unless manually using
our fork.
2024-03-26 12:44:31 -05:00
Ammar Bandukwala b4c0fa80d8 chore(cli): rename Cmd to Command (#12616)
I think Command is cleaner and my original decision to use "Cmd"
a mistake.

Plus this creates better parity with cobra.
2024-03-17 09:45:26 -05:00
Ammar Bandukwala 496232446d chore(cli): replace clibase with external coder/serpent (#12252) 2024-03-15 11:24:38 -05:00
Kyle Carberry 895df54051 fix: separate signals for passive, active, and forced shutdown (#12358)
* fix: separate signals for passive, active, and forced shutdown

`SIGTERM`: Passive shutdown stopping provisioner daemons from accepting new
jobs but waiting for existing jobs to successfully complete.

`SIGINT` (old existing behavior): Notify provisioner daemons to cancel in-flight jobs, wait 5s for jobs to be exited, then force quit.

`SIGKILL`: Untouched from before, will force-quit.

* Revert dramatic signal changes

* Rename

* Fix shutdown behavior for provisioner daemons

* Add test for graceful shutdown
2024-03-15 13:16:36 +00:00
Cian Johnston 96c9838ce3 fix(cli): scaletest: do not screenshot if verbose=false (#12317) 2024-02-27 12:35:48 +00:00
Mathias Fredriksson 02124758fb feat(cli/exp): extend scaletest create-workspaces with --retry option (#11825)
Part of #11801
2024-01-26 11:29:48 +00:00
Mathias Fredriksson 593a1e9f60 feat(cli/exp): add target workspace/users to scaletest commands (#11701) 2024-01-19 15:32:46 +00:00
Mathias Fredriksson 73e6bbff7e feat(cli/exp): add app testing to scaletest workspace-traffic (#11633) 2024-01-19 15:20:19 +02:00
Dean Sheather 695f57f7ff fix: use header flags in wsproxy server (#10985) 2023-12-05 14:13:42 +04:00
Mathias Fredriksson 99151183bc feat(scaletest): replace bash with dd in ssh/rpty traffic and use pseudorandomness (#10821)
Fixes #10795
Refs #8556
2023-11-30 19:30:12 +02:00
Spike Curtis 4894eda711 feat: capture cli logs in tests (#10669)
Adds a Logger to cli Invocation and standardizes CLI commands to use it.  clitest creates a test logger by default so that CLI command logs are captured in the test logs.

CLI commands that do their own log configuration are modified to add sinks to the existing logger, rather than create a new one.  This ensures we still capture logs in CLI tests.
2023-11-14 22:56:27 +04:00
Mathias Fredriksson 43a867441a feat(cli): add template filter support to exp scaletest cleanup and traffic (#10558) 2023-11-07 16:41:55 +00:00
Jon Ayers 2dce4151ba feat: add cli support for workspace automatic updates (#10438) 2023-11-02 14:41:34 -05:00
Cian Johnston dd86100f33 fix(scaletest): fix flake in Test_Runner/Cleanup (#10252)
* fix(scaletest/createworkspaces): address flake in Test_Runner/CleanupPendingBuild

* fix(scaletest): pass io.Writer to Cleanup()

* add some extra logs to workspacebuild cleanup

* fixup! fix(scaletest): pass io.Writer to Cleanup()

* remove race

* fmt

* address PR comments
2023-10-16 12:37:12 +01:00
Cian Johnston 1e75762cb4 fix(cli): scaletest: create-worksapces: remove invalid character for kubernetes provider in implicit plan (#10228) 2023-10-12 09:21:40 +01:00
Cian Johnston e829cbf2db fix(scaletest/dashboard): fix early exit due to validate (#10212) 2023-10-11 11:51:06 +00:00
Cian Johnston b3471bd23a fix(scaletest/dashboard): increase viewport size and handle deadlines (#10197)
- Set viewport size to avoid responsive mode
- Added way more debug logging
- Added facility to write a screenshot on error in verbose mode.
- Added a deadline for each iteraction of clicking on and waiting for a thing.
2023-10-11 11:10:08 +01:00
Cian Johnston 5673aca408 feat(cli): add --parameter flag to exp scaletest command (#10132) 2023-10-09 14:08:24 +01:00
Cian Johnston 9aac15212b fix(cli): remove exp scaletest from slim binary (#9934)
- Removes the `exp scaletest` command from the slim binary 
- Updates scaletest-runner template to fetch the full binary from the running Coder instance
2023-10-03 15:13:04 +01:00