Commit Graph

22 Commits

Author SHA1 Message Date
Cian Johnston 06c50d13ad fix(cli): exorcise the DERP healthcheck demon from TestSupportBundle (#23337)
- Replace real healthcheck with mock `HealthcheckFunc` that returns a
canned report instantly
- Remove healthcheck cache-seeding goroutine/channel workaround
- Remove `HealthcheckTimeout: testutil.WaitSuperLong` (no longer needed)
- Reduce `setupCtx` from `WaitSuperLong` (60s) to `WaitLong` (25s)

The DERP healthcheck performs real network operations (portmapper
gateway probing, STUN) that hang for 60s+ on macOS CI runners. Since
`TestSupportBundle` validates bundle generation, not healthcheck
correctness, a canned report eliminates this entire class of flake.

Fixes coder/internal#272

> 🤖 This PR was created with the help of Coder Agents, and was reviewed
by my human. 🧑‍💻
2026-03-20 09:46:13 +00:00
Cian Johnston fe82d0aeb9 fix: allow member users to generate support bundles (#23040)
Fixes AIGOV-141

The `coder support bundle` command previously required admin permissions
(`Read DeploymentConfig`) and would abort entirely for non-admin
`member` users with:

```
failed authorization check: cannot Read DeploymentValues
```

This change makes the command **degrade gracefully** instead of failing
outright.

<details>
<summary>
Changes
</summary>

### `support/support.go`
- **`Run()`**: The authorization check for `Read DeploymentValues` is
now a soft warning instead of a hard gate. Unauthenticated users (401)
still fail, but authenticated users with insufficient permissions
proceed with reduced data.
- **`DeploymentInfo()`**: `DeploymentConfig` and `DebugHealth` fetches
now handle 403/401 responses gracefully, matching the existing pattern
used by `DeploymentStats`, `Entitlements`, and `HealthSettings`.
- **`NetworkInfo()`**: Coordinator debug and tailnet debug fetches now
check response status codes for 403/401 before reading the body.

### `cli/support.go`
- **`summarizeBundle()`**: No longer returns early when `Config` or
`HealthReport` is nil. Instead prints warnings and continues summarizing
available data (e.g., netcheck).

### Tests
- `MissingPrivilege` → `MemberNoWorkspace`: Asserts member users can
generate a bundle successfully with degraded admin-only data.
- `NoPrivilege` → `MemberCanGenerateBundle`: Asserts the CLI produces a
valid zip bundle for member users.
- All existing tests continue to pass (`NoAuth`, `OK`, `OK_NoWorkspace`,
`DontPanic`, etc.).

## Behavior matrix

| User type | Before | After |
|---|---|---|
| **Admin** | Full bundle | Full bundle (no change) |
| **Member** | Hard error | Bundle with degraded admin-only data |
| **Unauthenticated** | Hard error | Hard error (no change) |

Related to PRODUCT-182
2026-03-18 13:43:10 +00:00
Rowan Smith b163b4c950 feat: support bundle updates to enable pprof and telemetry collection (#21486)
- Adds pprof collection support now that we have the listeners
automatically starting (requires Coder server 2.28.0+, includes a
version check). Collects heap, allocs, profile (30s), block, mutex,
goroutine, threadcreate, trace (30s), cmdline, symbol. Performs capture
for 30 seconds and emits a log line stating as such. Enable capture by
supplying the `--pprof` flag or `CODER_SUPPORT_BUNDLE_PPROF` env var.
Collection of pprof data from both coderd and the Coder agent occurs.
- Adds collection of Prometheus metrics, also requires 2.28.0+
- Adds the ability to include a template in the bundle independently of
supplying the details of a running workspace by supplying the
`--template` flag or `CODER_SUPPORT_BUNDLE_TEMPLATE` env var
- Captures a list of workspaces the user has access to. Defaults to a
max of 10, configurable via `--workspaces-total-cap` /
`CODER_SUPPORT_BUNDLE_WORKSPACES_TOTAL_CAP`
- Collects additional stats from the coderd deployment (aggregated
workspace/session metrics), as well as entitlements via license and
dismissed health checks.

created with help from mux
2026-01-20 10:28:52 +11:00
Spike Curtis bddb808b25 chore: arrange imports in a standard way (#21452)
Fixes all our Go file imports to match the preferred spec that we've _mostly_ been using. For example:

```
import (
	"context"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"golang.org/x/xerrors"
	"gopkg.in/natefinch/lumberjack.v2"

	"cdr.dev/slog/v3"
	"github.com/coder/coder/v2/codersdk/agentsdk"
	"github.com/coder/serpent"
)
```

3 groups: standard library, 3rd partly libs, Coder libs.

This PR makes the change across the codebase. The PR in the stack above modifies our formatting to maintain this state of affairs, and is a separate PR so it's possible to review that one in detail.
2026-01-08 15:24:11 +04:00
Cian Johnston 6bd2d1c85f chore(cli): seed healthcheck cache in TestSupportBundle (#21436)
Fixes https://github.com/coder/internal/issues/272

This test periodically fails due to the healthcheck timing out.
The problem is compounded due to the fact that we stand up a new
coderdtest instance for each test.

This PR does the following:
* Updates the subtests to share a single `coderdtest` instance.
* Hits the `/debug/health` endpoint before completing the setup phase so
that the result is cached.

This will not completely remove the issue, as the healthcheck could
still fail due to test-infrastructure-related issues. In this case we
may decide to add a retry in this 'seed' function.
2026-01-07 08:47:31 +00:00
Kacper Sawicki 7c40f86a6a feat(cli): include license status in support bundle (#18472)
Closes #18207

This PR adds license status to support bundle to help with
troubleshooting license-related issues.

- `license-status.txt`, is added to the support bundle.
    - it contains the same output as the `coder license list` command.
- license output formatter logic has been extracted into a separate
function.
- this allows it to be reused both in the `coder license list` cmd and
in the support bundle generation.
2025-06-24 11:16:31 +02:00
Cian Johnston a8fbe71a22 chore(cli): increase healthcheck timeout in TestSupportbundle (#17291)
Fixes https://github.com/coder/internal/issues/272

* Increases healthcheck timeout in tests. This seems to be the most
usual cause of test failures.
* Adds a non-nilness check before caching a healthcheck report.
* Modifies the HTTP response code to 503 (was 404) when no healthcheck
report is available. 503 seems to be a better indicator of the server
state in this case, whereas 404 could be misinterpreted as a typo in the
healthcheck URL.
2025-04-09 09:21:17 +01:00
Mathias Fredriksson 860d17ad09 test(cli): fix context init in TestSupportBundle (#16174) 2025-01-17 14:51:38 +02:00
Steven Masley 343f8ec9ab chore: join owner, template, and org in new workspace view (#15116)
Joins in fields like `username`, `avatar_url`, `organization_name`,
`template_name` to `workspaces` via a **view**. 
The view must be maintained moving forward, but this prevents needing to
add RBAC permissions to fetch related workspace fields.
2024-10-22 09:20:54 -05:00
Cian Johnston 82e6070c7a fix(cli): ensure that the support bundle command does not panic on zero values (#14392)
We try to write a cute little summary at the end of the bundle, but that could panic if some of the fields of the bundle were nil. Adds a test that essentially ensures nil values in a bundle, and ensures that it can be handled without losing our towels.

Co-authored-by: Danny Kopping <danny@coder.com>
2024-08-22 11:15:02 +01:00
Spike Curtis 4b0b9b08d5 feat: add interfaces report to support bundle (#13563) 2024-06-13 13:09:54 +04:00
Cian Johnston b163bc7f01 fix(support): correctly rename existing agent connection info, add real netcheck (#12946) 2024-04-12 09:40:04 +01:00
Cian Johnston fad97a14f9 fix(cli): allow generating partial support bundles with no workspace or agent (#12933)
* fix(cli): allow generating partial support bundles with no workspace or agent

* nolint control flag
2024-04-11 10:09:10 +01:00
Colin Adler 4d5a7b2d56 chore(codersdk): move all tailscale imports out of codersdk (#12735)
Currently, importing `codersdk` just to interact with the API requires
importing tailscale, which causes builds to fail unless manually using
our fork.
2024-03-26 12:44:31 -05:00
Cian Johnston 01f9a9ab77 feat(cli): unhide support bundle cmd (#12745)
* chore(cli): add another test to ensure no secret leakage

* feat(cli): unhide support bundle cmd
2024-03-25 15:14:27 +00:00
Cian Johnston 28730ca3d8 fix(support): sanitize manifest (#12711) 2024-03-21 19:55:34 +00:00
Cian Johnston f2a9e515df feat(cli/support): confirm before creating bundle (#12684)
Forces user to confirm before creating a support bundle.
Also adds contextual information to the bundle under cli_logs.txt.
2024-03-21 17:06:28 +00:00
Cian Johnston b0c4e7504c feat(support): add client magicsock and agent prometheus metrics to support bundle (#12604)
* feat(codersdk): add ability to fetch prometheus metrics directly from agent
* feat(support): add client magicsock and agent prometheus metrics to support bundle
* refactor(support): simplify AgentInfo control flow

Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>
2024-03-15 15:33:49 +00:00
Cian Johnston 2abc1cd2b7 feat(support): fetch agent network info over tailnet (#12577)
Adds agent-related information to support bundle command.
2024-03-15 11:01:39 +00:00
Cian Johnston 135381bb4e chore(cli): skip support bundle test on windows (#12596) 2024-03-14 15:25:09 +00:00
Cian Johnston 17caf58b5e feat(support): add template info to support bundle (#12451)
Adds workspace build parameters, template, template version, and zipped template source to support bundle.
2024-03-07 14:43:46 +00:00
Cian Johnston b1c2fea78b feat(cli): add support cmd (#12328)
Part of #12163

- Adds a command coder support bundle <workspace> that generates a 
  support bundle and writes it to coder-support-$(date +%s).zip.
- Note: this is hidden currently until the rest of the functionality is fleshed out.
2024-03-01 17:13:50 +00:00