> [!WARNING]
> The investigation and solution in this PR were done with [Mux](https://mux.coder.com/). I've reviewed the investigation methodology, evidence and solution, and it all appears sound.

## Summary

PR #25570 (`refactor: move aibridged out of enterprise to AGPL`, merged 2026-05-22) added an in-memory aibridge DRPC server in `coderd/aibridged.go` that calls `api.WebsocketWaitGroup.Add(1)` and only calls `Done()` when its client session is closed. PR #25575 then flipped `CODER_AI_GATEWAY_ENABLED` to default to `true`, so every `cli.Server()` invocation now spins up that goroutine.

In `cli/server.go`, the only call to `aibridgeDaemon.Close()` was a `defer` scheduled at function return. During graceful shutdown the code first calls `coderAPICloser.Close()`, which waits on `api.WebsocketWaitGroup`. That wait sits for the full 10s timeout in `coderd/coderd.go` (`websocket shutdown timed out after 10 seconds`), then returns, then the function unwinds, and only then does the deferred `aibridgeDaemon.Close()` fire and let the goroutine call `Done()`.

The 10s tax was previously latent (aibridged was enterprise-only and opt-in). After the two May 22 PRs it hit every `cli.Server()` test. On Linux/macOS CI it just makes the suite slower. On the Depot Windows runner, the ramdisk reservation leaves only ~17 GiB of headroom, and the ~10s shutdown tails of multiple concurrent package binaries overlap into an OOM. This presents as `test-go-pg (windows-2022)` jobs that die silently at the ~600s watchdog with an empty `steps` array.

See Slack: https://codercom.slack.com/archives/C05AE94121Z/p1779807717764189

## Fix

Close `aibridgeDaemon` explicitly during graceful shutdown, **before** `coderAPICloser.Close()` waits on the WebSocket wait group. This matches the existing ordered-shutdown pattern used for `tunnel` and `notificationsManager`.

The deferred `aibridgeDaemon.Close()` is retained as a safety net for early-return paths, and is safe to double-call because `aibridged.Server.Close()` is already idempotent via `shutdownOnce` in `coderd/aibridged/aibridged.go`.

## Regression test

`TestServer_AIGatewayShutdownOrdering` boots a real `coder server` with `--ai-gateway-enabled=true`, cancels its context, and asserts graceful shutdown finishes in under 8s. With the fix the test runs in ~0.1s; without the fix it fails deterministically at ~10.0s. The flag is passed explicitly so the test continues to guard the ordering even if the deployment default is ever flipped back.

## Evidence this fixes the OOM

On Linux the patched `cli` test package drops from 114s back to its pre-regression 30s wall time at the same single-process peak RSS (~7.6 GiB), and the `websocket shutdown timed out after 10 seconds` log line disappears from every server-test run. Since the Windows OOM is the sum of multiple concurrent 10s shutdown tails overlapping past the runner's ~17 GiB headroom, removing those tails returns the concurrent-RSS budget to its pre-regression level.

The Windows OOM was intermittent (a handful of hits across many runs since May 22), so a single green `test-go-pg (windows-2022)` job on this PR is not by itself proof. Confirmation will come from watching Windows runs on `main` over the next several days and seeing the ~600s silent-kill fingerprint stop recurring.

Relates to ENG-2771
Coder is a self-hosted platform for cloud development environments and AI coding agents. Workspaces are defined with Terraform, connected through a secure WireGuard® tunnel, and automatically shut down when not used. Coder Agents runs a native AI coding agent whose loop executes in the control plane on your infrastructure, with no API keys in workspaces.
- Define cloud development environments in Terraform
- EC2 VMs, Kubernetes Pods, Docker Containers, etc.
- Automatically shut down idle resources to save on costs
- Onboard developers in seconds instead of days
- Delegate coding work to AI agents on your infrastructure
- Bring any model (Anthropic, OpenAI, Google, Bedrock, self-hosted)
- No LLM credentials in workspaces, user identity on every action
- Centralized model governance, cost tracking, and audit logging
Quickstart
The most convenient way to try Coder is to install it on your local machine and experiment with provisioning cloud development environments using Docker (works on Linux, macOS, and Windows).
# First, install Coder
curl -L https://coder.com/install.sh | sh
# Start the Coder server (caches data in ~/.cache/coder)
coder server
# Navigate to http://localhost:3000 to create your initial user,
# create a Docker template and provision a workspace
Install
The easiest way to install Coder is to use the install script for Linux and macOS. For Windows, use the latest ..._installer.exe file from GitHub Releases.
curl -L https://coder.com/install.sh | sh
You can run the install script with --dry-run to see the commands that will be used to install without executing them. Run the install script with --help for additional flags.
See install for additional methods.
Once installed, you can start a production deployment with a single command:
# Automatically sets up an external access URL on *.try.coder.app
coder server
# Requires a PostgreSQL instance (version 13 or higher) and external access URL
coder server --postgres-url <url> --access-url <url>
Use coder --help to get a list of flags and environment variables. See the install guides for a complete tutorial.
Documentation
Browse the documentation or visit a specific section below:
- Workspaces: Workspaces contain the IDEs, dependencies, and configuration information needed for software development
- Templates: Templates are written in Terraform and describe the infrastructure for workspaces
- Coder Agents: Delegate coding work to AI agents running on your self-hosted infrastructure
- Administration: Learn how to operate Coder
- Premium: Learn about paid features built for large teams
- IDEs: Connect your existing editor to a workspace
Support
Feel free to open an issue if you have questions, run into bugs, or have a feature request.
Join our Discord to provide feedback on in-progress features and chat with the community using Coder!
Integrations
New integrations are always in progress. Open an issue to request one. Contributions are welcome in any official or community repository.
Official
- Coder Registry: Templates, modules, and integrations for common development environments
- VS Code Extension: Open any Coder workspace in VS Code with a single click
- JetBrains Toolbox Plugin: Open any Coder workspace from JetBrains Toolbox with a single click
- JetBrains Gateway Plugin: Open any Coder workspace in JetBrains Gateway with a single click
- Dev Containers: Build development environments using devcontainer.json on Docker, Kubernetes, and OpenShift
- Kubernetes Log Stream: Stream Kubernetes Pod events to the Coder startup logs
- Self-Hosted VS Code Extension Marketplace: A private extension marketplace that works in restricted or airgapped networks and integrates with code-server
- GitHub Actions: An action to set up the Coder CLI in GitHub workflows
Community
- Community Templates: Community-contributed workspace templates in the Coder Registry
- Community Modules: Community-contributed modules to extend Coder templates
- Provision Coder with Terraform: Provision Coder on Google GKE, Azure AKS, AWS EKS, DigitalOcean DOKS, IBMCloud K8s, OVHCloud K8s, and Scaleway K8s Kapsule with Terraform
- Coder Template GitHub Action: A GitHub Action that updates Coder templates
- Discord: Chat with the community and provide feedback on in-progress features
Contributing
New contributors are always welcome. If you are new to the Coder codebase, see the contribution guide to get started.
Hiring
Apply on the careers page if you are interested in joining the team.
