17 Commits

Author SHA1 Message Date
Kacper Sawicki df2360f56a feat(coderd): add consolidated /debug/profile endpoint for pprof collection (#22892)
## Summary

Adds a new `GET /api/v2/debug/profile` endpoint that collects multiple
pprof profiles in a single request and returns them as a tar.gz archive.
This allows collecting profiles (including block and mutex) without
requiring `CODER_PPROF_ENABLE` to be set, and without restarting
`coderd`.

Closes #21679

## What it does

The endpoint:
- Temporarily enables block and mutex profiling (normally disabled at
runtime)
- Runs CPU profile and/or trace for a configurable duration (default
10s, max 60s)
- Collects snapshot profiles (heap, allocs, block, mutex, goroutine,
threadcreate)
- Returns a tar.gz archive containing all requested `.prof` files
- Uses an atomic bool to prevent concurrent collections (returns 409
Conflict)
- Is protected by the existing debug endpoint RBAC (owner-only)

**Supported profile types:** cpu, heap, allocs, block, mutex, goroutine,
threadcreate, trace

**Query parameters:**
- `duration`: How long to run timed profiles (default: `10s`, max:
`60s`)
- `profiles`: Comma-separated list of profile types (default:
`cpu,heap,allocs,block,mutex,goroutine`)

## Additional changes

- **SDK client method** (`codersdk.Client.DebugCollectProfile`) for easy
programmatic access
- **`coder support bundle --pprof` integration**: tries the consolidated
endpoint first, falls back to individual `/debug/pprof/*` endpoints for
older servers
- **8 new tests** covering defaults, custom profiles, trace+CPU,
validation errors, authorization, and conflict detection
2026-03-13 14:09:39 +00:00
Cian Johnston 3a62a8e70e chore: improve healthcheck timeout message (#21520)
Relates to https://github.com/coder/internal/issues/272

This flake has been persisting for a while, and unfortunately there's no
detail on which healthcheck in particular is holding things up.

This PR adds a concurrency-safe `healthcheck.Progress` and wires it
through `healthcheck.Run`. If the healthcheck times out, it will provide
information on which healthchecks are completed / running, and how long
they took / are still taking.

🤖 Claude Opus 4.5 completed the first round of this implementation,
which I then refactored.
2026-01-15 16:37:05 +00:00
Spike Curtis bddb808b25 chore: arrange imports in a standard way (#21452)
Fixes all our Go file imports to match the preferred spec that we've _mostly_ been using. For example:

```
import (
	"context"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"golang.org/x/xerrors"
	"gopkg.in/natefinch/lumberjack.v2"

	"cdr.dev/slog/v3"
	"github.com/coder/coder/v2/codersdk/agentsdk"
	"github.com/coder/serpent"
)
```

3 groups: standard library, 3rd partly libs, Coder libs.

This PR makes the change across the codebase. The PR in the stack above modifies our formatting to maintain this state of affairs, and is a separate PR so it's possible to review that one in detail.
2026-01-08 15:24:11 +04:00
Spike Curtis 49b34a716a fix: fix slog to always use array of Fields (#21426)
Upgrades to slog v3 which includes a small, but backward incompatible API change to the acceptible call arguments when logging. This change allows us to verify via compile time type checking that arguments are correct and won't cause a panic, as was possible in slog v1, which this replaces (v2 was tagged but never used in coder/coder).

It also updates dependencies that also use slog and were updated.

I've left the `aibridge` dependency as a commit SHA, under the assumption that the team there (cc @pawbana @dannykopping ) will tag and update the dependency soon and on their own schedule.

Other dependencies, I pushed new tags.
2026-01-08 10:29:41 +04:00
Cian Johnston a8fbe71a22 chore(cli): increase healthcheck timeout in TestSupportbundle (#17291)
Fixes https://github.com/coder/internal/issues/272

* Increases healthcheck timeout in tests. This seems to be the most
usual cause of test failures.
* Adds a non-nilness check before caching a healthcheck report.
* Modifies the HTTP response code to 503 (was 404) when no healthcheck
report is available. 503 seems to be a better indicator of the server
state in this case, whereas 404 could be misinterpreted as a typo in the
healthcheck URL.
2025-04-09 09:21:17 +01:00
Colin Adler 4d5a7b2d56 chore(codersdk): move all tailscale imports out of codersdk (#12735)
Currently, importing `codersdk` just to interact with the API requires
importing tailscale, which causes builds to fail unless manually using
our fork.
2024-03-26 12:44:31 -05:00
Cian Johnston 2cb9bfd517 refactor(coderd): move healthcheck report structs to codersdk (#12279)
Moves healthcheck report-related structs from coderd/healthcheck to codersdk
This prevents an import cycle when adding a codersdk.Client method to hit /api/v2/debug/health.
2024-02-23 13:13:28 +00:00
Cian Johnston 38ed816207 fix(coderd/debug): fix caching issue with dismissed sections (#11051) 2023-12-06 08:38:03 +00:00
Cian Johnston feaa9894a4 fix(site/src/api/typesGenerated): generate HealthSection enums (#11049)
Relates to #8971

- Introduces a codersdk.HealthSection enum type
- Refactors existing references using strings to use new HealthSection type
2023-12-05 20:00:27 +00:00
Marcin Tojek 19b6d194fc feat: manage health settings using Coder API (#10861) 2023-11-28 18:15:17 +01:00
Cian Johnston 9d310388e5 feat(coderd): /debug/health: add parameter to force healthcheck (#10677) 2023-11-15 15:54:15 +00:00
Cian Johnston b69c237b8a feat(coderd/healthcheck): allow configuring database hc threshold (#10623)
* feat(coderd/healthcheck): allow configuring database hc threshold
* feat(coderd): add database hc latency, plumb through
* feat(coderd): allow configuring healthcheck refresh interval
2023-11-13 14:14:43 +00:00
Colin Adler 89292264be feat(coderd): add simple healthcheck formatting option (#9864) 2023-09-25 22:55:50 +00:00
Kyle Carberry 22e781eced chore: add /v2 to import module path (#9072)
* chore: add /v2 to import module path

go mod requires semantic versioning with versions greater than 1.x

This was a mechanical update by running:
```
go install github.com/marwan-at-work/mod/cmd/mod@latest
mod upgrade
```

Migrate generated files to import /v2

* Fix gen
2023-08-18 18:55:43 +00:00
Colin Adler 2b63492649 feat(healthcheck): add failing sections to report (#7789) 2023-06-01 19:21:24 -05:00
Colin Adler 022372dd73 feat(healthcheck): add websocket report (#7689) 2023-05-30 14:22:32 -05:00
Colin Adler 7738274b3e feat(coderd): add DERP healthcheck (#6936) 2023-04-03 06:28:42 +00:00