Commit Graph

102 Commits

Author SHA1 Message Date
Spike Curtis 56eb57caf4 chore: enable agent socket by default (#22352)
relates to #21335

Enables the agent socket by default and updates docs to strike references to having to enable it.

The PRs in this stack change the MCP server that Tasks use to update their status to rely on the agent socket, rather than directly dialing Coderd with the agent token.

Default disable was a reasonable default when it was only used for the experimental script ordering features, but now that we want to use it for Tasks, it should be default on.
2026-03-03 21:23:59 +04:00
Jon Ayers 11e17b3de9 chore: log the OS signal prior to exiting in agent (#21941)
Adds additional logs for determining what signal the agent receives
prior to shut down. Also helps distinguish whether the signal originated
at the agent or reaper.
2026-02-05 12:32:07 -06:00
Jon Ayers 4f1fd82ed7 fix: propagate correct agent exit code (#21718)
The reaper (PID 1) now returns the child's exit code instead of always
exiting 0. Signal termination uses the standard Unix convention of 128 +
signal number.

fixes #21661
2026-01-28 15:56:04 -06:00
Spike Curtis bddb808b25 chore: arrange imports in a standard way (#21452)
Fixes all our Go file imports to match the preferred spec that we've _mostly_ been using. For example:

```
import (
	"context"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"golang.org/x/xerrors"
	"gopkg.in/natefinch/lumberjack.v2"

	"cdr.dev/slog/v3"
	"github.com/coder/coder/v2/codersdk/agentsdk"
	"github.com/coder/serpent"
)
```

3 groups: standard library, 3rd partly libs, Coder libs.

This PR makes the change across the codebase. The PR in the stack above modifies our formatting to maintain this state of affairs, and is a separate PR so it's possible to review that one in detail.
2026-01-08 15:24:11 +04:00
Spike Curtis 49b34a716a fix: fix slog to always use array of Fields (#21426)
Upgrades to slog v3 which includes a small, but backward incompatible API change to the acceptible call arguments when logging. This change allows us to verify via compile time type checking that arguments are correct and won't cause a panic, as was possible in slog v1, which this replaces (v2 was tagged but never used in coder/coder).

It also updates dependencies that also use slog and were updated.

I've left the `aibridge` dependency as a commit SHA, under the assumption that the team there (cc @pawbana @dannykopping ) will tag and update the dependency soon and on their own schedule.

Other dependencies, I pushed new tags.
2026-01-08 10:29:41 +04:00
Zach 07924037e7 feat: add boundary log forwarding from agent to coderd (#21345)
Add agent forwarding of boundary audit logs from workspaces to coderd
via agent API, and re-emission of boundary logs to coderd stderr. This
change adds a server to the workspace agent that always listens on a
unix socket for boundary to connect and send audit logs.

coderd log format example:
```
[API] 2025-12-23 18:31:46.755 [info] coderd.agentrpc: boundary_request owner=.. workspace_name=.. agent_name=.. decision=.. workspace_id=.. http_method=.. http_url=.. event_time=.. request_id=..
```

Corresponding boundary PR: https://github.com/coder/boundary/pull/124
RFC:
https://www.notion.so/coderhq/Agent-Boundary-Logs-2afd579be59280f29629fc9823ac41ba
https://github.com/coder/coder/issues/21280
2025-12-31 16:38:19 -07:00
Sas Swart ce627bf23f feat: implement agent socket api, client and cli (#20758)
closes: https://github.com/coder/coder/issues/10352
closes: https://github.com/coder/internal/issues/1094
closes: https://github.com/coder/internal/issues/1095

In this pull request, we enable a new set of experimental cli commands
grouped under `coder exp sync`.
These commands allow any process acting within a coder workspace to
inform the coder agent of its requirements and execution progress. The
coder agent will then relay this information to other processes that
have subscribed.

These commands are:
```
# Check if this feature is enabled in your environment 
coder exp sync ping

# express that your unit depends on another
coder exp sync want <unit> <dependency_unit> 

# express that your unit intends to start a portion of the script that requires 
# other units to have completed first. This command blocks until all dependencies have been met
coder exp sync start <unit> 

# express that your unit has completes its work, allowing dependent units to begin their execution
coder exp sync complete <unit>
```

Example:

In order to automatically run claude code in a new workspace, it must
first have a git repository cloned. The scripts responsible for cloning
the repository and for running claude code would coordinate in the
following way:

```bash
# Script A: Claude code

# Inform the agent that the claude script wants the git script.
# That is, the git script must have completed before the claude script can begin its execution
coder exp sync want claude git

# Inform the agent that we would now like to begin execution of claude.
# This command will block until the git script (and any other defined dependencies)
# have completed
coder exp sync start claude

# Now we run claude code and any other commands we need
claude ...

# Once our script has completed, we inform the agent, so that any scripts that depend on this one
# may begin their execution

coder exp sync complete claude
```

```bash
# Script B: Git

# Because the git script does not have any dependencies, we can simply inform the agent that we 
# intend to start
coder exp sync start git

git clone ssh://git@github.com/coder/coder

# Once the repository have been cloned, we inform the agent that this script is complete, so that
# scripts that depend on it may begin their execution.
coder exp sync complete git
```

Notes:
* Unit names (ie. `claude` and `git`) given as input to the sync
commands are arbitrary strings. You do not have to conform to specific
identifiers. We recommend naming your scripts descriptively, but
succinctly.
* Scripts unit names should be well documented. Other scripts will need
to know the names you've chosen in order to depend on yours. Therefore,
you

---------

Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>
2025-11-28 08:33:50 +02:00
Mathias Fredriksson 46b2f3df8e fix(cli): allow disabling debug listening ports for agent (#20671)
A customer reported unexpected port allocation in their workspace. When
looking into it I noticed we always hijack these ports and there is no 
way to disable them entirely.

This change allows the servers to be disabled by setting them to the
empty string. Previously they would still listen on ephemeral ports.

```console
❯ coder agent --help | grep -E '211[2-3]|6060'
      --debug-address string, $CODER_AGENT_DEBUG_ADDRESS (default: 127.0.0.1:2113)
      --pprof-address string, $CODER_AGENT_PPROF_ADDRESS (default: 127.0.0.1:6060)
      --prometheus-address string, $CODER_AGENT_PROMETHEUS_ADDRESS (default: 127.0.0.1:2112)
```

There are now two ways to disable, either via CLI or env variables:

```console
# Flags.
coder agent --debug-address= --pprof-address= --prometheus-address=

# Environment variables.
export CODER_AGENT_DEBUG_ADDRESS=
export CODER_AGENT_PPROF_ADDRESS=
export CODER_AGENT_PROMETHEUS_ADDRESS=
coder agent
```
2025-11-05 14:22:24 +02:00
Spike Curtis 18945a7949 chore: refactor CLI agent auth tests as unit tests (#19609)
Fixes https://github.com/coder/internal/issues/933

Refactors CLI tests that check the `--auth` flag parsing for various public clouds into a unit test that just creates the agent Client and asserts on the type.

Testing that the agent client actually authenticates correctly with these auth types is well covered by Coderd tests, so we don't need to retread that ground here, and the deleted tests were flaky on Windows.
2025-09-03 10:49:19 +04:00
Spike Curtis 1354d84eb4 chore: refactor instance identity to be a SessionTokenProvider (#19566)
Refactors Agent instance identity to be a SessionTokenProvider.

Refactors the CLI to create Agent clients via a centralized function, rather than add-hoc via individual command handlers and their flags.

This allows commands besides `coder agent`, but which still use the agent identity, to support instance identity authentication.

Fixes #19111 by unifying all API requests to go thru the SessionTokenProvider for auth credentials.
2025-09-03 10:38:42 +04:00
Danielle Maywood b8e2344ef5 chore(agent/agentcontainers): disable project autostart by default (#19114)
We disable the logic that allows autostarting discovered devcontainers
by default. We want this behavior to be opt-in rather than opt-out.

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-08-04 16:21:13 +01:00
Danielle Maywood f41275eb39 feat(agent/agentcontainers): auto detect dev containers (#18950)
Relates to https://github.com/coder/internal/issues/711

This PR implements a project discovery mechanism that searches for any
dev container projects and makes them visible in the UI so that they can
be started. To make the wording on the site more clear, "Rebuild" has
been changed to "Start" when there is no container associated with a
known dev container configuration. I've also made it so that site will
show the dev container config path when there is no other name
available.

### Design decisions

Just want to ensure my explanation for a few design decisions are noted
down:
- We only search for dev container configurations inside git
repositories
- We only search for these git repositories if they're at the top level
or a direct child of the agent directory.

This limited approach is to reduce the amount of files we ultimately
walk when trying to find these projects. It makes sense to limit it to
only the agent directory, although I'm open to expanding how deep we
search.
2025-07-22 19:02:43 +01:00
Mathias Fredriksson 99d124e276 feat(agent): enable devcontainers by default (#18533) 2025-06-24 21:17:04 +03:00
Mathias Fredriksson fca99174ad feat(agent/agentcontainers): implement sub agent injection (#18245)
This change adds support for sub agent creation and injection into dev
containers.

Updates coder/internal#621
2025-06-10 12:37:54 +03:00
Danielle Maywood 83df55700b revert(agent): remove CODER_AGENT_IS_SUB_AGENT cli flag (#17875)
The RFC has changed, this information will be passed through the
manifest instead.
2025-05-16 11:04:21 +00:00
Sas Swart 425ee6fa55 feat: reinitialize agents when a prebuilt workspace is claimed (#17475)
This pull request allows coder workspace agents to be reinitialized when
a prebuilt workspace is claimed by a user. This facilitates the transfer
of ownership between the anonymous prebuilds system user and the new
owner of the workspace.

Only a single agent per prebuilt workspace is supported for now, but
plumbing has already been done to facilitate the seamless transition to
multi-agent support.

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Danny Kopping <dannykopping@gmail.com>
2025-05-14 14:15:36 +02:00
Danielle Maywood 7f056da088 feat: add hidden CODER_AGENT_IS_SUB_AGENT flag to coder agent (#17783)
Closes https://github.com/coder/internal/issues/620

Adds a new, hidden, flag `CODER_AGENT_IS_SUB_AGENT` to the `coder agent`
command.
2025-05-13 10:57:50 +01:00
Mathias Fredriksson 1fc74f629e refactor(agent): update agentcontainers api initialization (#17600)
There were too many ways to configure the agentcontainers API resulting
in inconsistent behavior or features not being enabled. This refactor
introduces a control flag for enabling or disabling the containers API.
When disabled, all implementations are no-op and explicit endpoint
behaviors are defined. When enabled, concrete implementations are used
by default but can be overridden by passing options.
2025-04-29 17:53:10 +03:00
Cian Johnston 8d122aa4ab chore(cli): avoid use of testutil.RandomPort() in prometheus test (#17297)
Should hopefully fix https://github.com/coder/internal/issues/282

Instead of picking a random port for the prometheus server, listen on
`:0` and read the port from the CLI stdout.
2025-04-09 09:20:47 +01:00
Jon Ayers 17ddee05e5 chore: update golang to 1.24.1 (#17035)
- Update go.mod to use Go 1.24.1
- Update GitHub Actions setup-go action to use Go 1.24.1
- Fix linting issues with golangci-lint by:
  - Updating to golangci-lint v1.57.1 (more compatible with Go 1.24.1)

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <claude@anthropic.com>
2025-03-26 01:56:39 -05:00
Mathias Fredriksson dfcd93b26e feat: enable agent connection reports by default, remove flag (#16778)
This change enables agent connection reports by default and removes the
experimental flag for enabling them.

Updates #15139
2025-03-03 18:37:28 +02:00
Cian Johnston ec44f06f5c feat(cli): allow SSH command to connect to running container (#16726)
Fixes https://github.com/coder/coder/issues/16709 and
https://github.com/coder/coder/issues/16420

Adds the capability to`coder ssh` into a running container if `CODER_AGENT_DEVCONTAINERS_ENABLE=true`.

Notes:
* SFTP is currently not supported
* Haven't tested X11 container forwarding
* Haven't tested agent forwarding
2025-02-28 09:38:45 +00:00
Mathias Fredriksson 4ba5a8a2ba feat(agent): add connection reporting for SSH and reconnecting PTY (#16652)
Updates #15139
2025-02-27 10:45:45 +00:00
Cian Johnston 172e52317c feat(agent): wire up agentssh server to allow exec into container (#16638)
Builds on top of https://github.com/coder/coder/pull/16623/ and wires up
the ReconnectingPTY server. This does nothing to wire up the web
terminal yet but the added test demonstrates the functionality working.

Other changes:
* Refactors and moves the `SystemEnvInfo` interface to the
`agent/usershell` package to address follow-up from
https://github.com/coder/coder/pull/16623#discussion_r1967580249
* Marks `usershellinfo.Get` as deprecated. Consumers should use the
`EnvInfoer` interface instead.

---------

Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>
Co-authored-by: Danny Kopping <danny@coder.com>
2025-02-26 09:03:27 +00:00
Cian Johnston ec50a35c08 chore(cli): disable agent devcontainer integration by default (#16531)
Until we have more of the building blocks in place, disable the agent
devcontainer integration by default. We'll enable it by default at a
later date.
2025-02-12 10:47:25 +00:00
Cian Johnston 35901028d2 feat(agent): add CODER_AGENT_DEVCONTAINERS_ENABLE option (#16525) 2025-02-11 15:29:59 +00:00
Jon Ayers ce573b9faa fix: add agent exec abstraction (#15717) 2024-12-04 23:30:25 +02:00
Jon Ayers 1f238fed59 feat: integrate new agentexec pkg (#15609)
- Integrates the `agentexec` pkg into the agent and removes the
legacy system of iterating over the process tree. It adds some linting
rules to hopefully catch future improper uses of `exec.Command` in the package.
2024-11-27 20:12:15 +02:00
Ethan e72d58b4f6 fix: guard server log lumberjack with mutex (#15582)
(Hopefully) closes https://github.com/coder/internal/issues/213.
2024-11-19 19:47:35 +11:00
Ethan e65eb0321c fix: support additional http headers on agent (#14464) 2024-08-29 14:15:15 +10:00
Marcin Tojek e96652ebbc feat: block file transfers for security (#13501) 2024-06-10 12:12:23 +00:00
Jon Ayers 426e9f2b96 feat: support adjusting child proc oom scores (#12655) 2024-04-03 09:42:03 -05:00
Ammar Bandukwala b4c0fa80d8 chore(cli): rename Cmd to Command (#12616)
I think Command is cleaner and my original decision to use "Cmd"
a mistake.

Plus this creates better parity with cobra.
2024-03-17 09:45:26 -05:00
Ammar Bandukwala 496232446d chore(cli): replace clibase with external coder/serpent (#12252) 2024-03-15 11:24:38 -05:00
Cian Johnston b0c4e7504c feat(support): add client magicsock and agent prometheus metrics to support bundle (#12604)
* feat(codersdk): add ability to fetch prometheus metrics directly from agent
* feat(support): add client magicsock and agent prometheus metrics to support bundle
* refactor(support): simplify AgentInfo control flow

Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>
2024-03-15 15:33:49 +00:00
Kyle Carberry 895df54051 fix: separate signals for passive, active, and forced shutdown (#12358)
* fix: separate signals for passive, active, and forced shutdown

`SIGTERM`: Passive shutdown stopping provisioner daemons from accepting new
jobs but waiting for existing jobs to successfully complete.

`SIGINT` (old existing behavior): Notify provisioner daemons to cancel in-flight jobs, wait 5s for jobs to be exited, then force quit.

`SIGKILL`: Untouched from before, will force-quit.

* Revert dramatic signal changes

* Rename

* Fix shutdown behavior for provisioner daemons

* Add test for graceful shutdown
2024-03-15 13:16:36 +00:00
Mathias Fredriksson 4ce1448bbe fix(cli): generate correctly named file in DumpHandler (#12409) 2024-03-04 18:35:33 +02:00
Mathias Fredriksson b1c0b39d88 feat(agent): add script data dir for binaries and files (#12205)
The agent is extended with a `--script-data-dir` flag, defaulting to the
OS temp dir. This dir is used for storing `coder-script-data/bin` and
`coder-script/[script uuid]`. The former is a place for all scripts to
place executable binaries that will be available by other scripts, SSH
sessions, etc. The latter is a place for the script to store files.

Since we default to OS temp dir, files are ephemeral by default. In the
future, we may consider adding new env vars or changing the default
storage location. Workspace startup speed could potentially benefit from
scripts being able to skip steps that require downloading software. We
may also extend this with more env variables (e.g. persistent storage in
HOME).

Fixes #11131
2024-02-20 13:26:18 +02:00
Mathias Fredriksson c63f569174 refactor(agent/agentssh): move envs to agent and add agentssh config struct (#12204)
This commit refactors where custom environment variables are set in the
workspace and decouples agent specific configs from the `agentssh.Server`.
To reproduce all functionality, `agentssh.Config` is introduced.

The custom environment variables are now configured in `agent/agent.go`
and the agent retains control of the final state. This will allow for
easier extension in the future and keep other modules decoupled.
2024-02-19 16:30:00 +02:00
Ammar Bandukwala cfe35f54b4 feat(cli/agent): preserve old logs (#10776)
See https://github.com/coder/coder/pull/7815 for background.
2023-11-18 10:53:56 -06:00
Spike Curtis 4894eda711 feat: capture cli logs in tests (#10669)
Adds a Logger to cli Invocation and standardizes CLI commands to use it.  clitest creates a test logger by default so that CLI command logs are captured in the test logs.

CLI commands that do their own log configuration are modified to add sinks to the existing logger, rather than create a new one.  This ensures we still capture logs in CLI tests.
2023-11-14 22:56:27 +04:00
Spike Curtis f400d8a0c5 fix: handle SIGHUP from OpenSSH (#10638)
Fixes an issue where remote forwards are not correctly torn down when using OpenSSH with `coder ssh --stdio`.  OpenSSH sends a disconnect signal, but then also sends SIGHUP to `coder`.  Previously, we just exited when we got SIGHUP, and this raced against properly disconnecting.

Fixes https://github.com/coder/customers/issues/327
2023-11-13 15:14:42 +04:00
Kyle Carberry ad47ef17e8 feat: allow reading the agent token from a file (#10080)
Adds `CODER_AGENT_TOKEN_FILE` which will read the agent token from
a file if `CODER_AGENT_TOKEN` is not provided. Using a Kubernetes
Secret with a volume-mounted file is a more secure way to provide
the agent token instead of an environment variable.
2023-10-05 15:41:05 -05:00
Jon Ayers 7311ffbd9d feat: implement agent process management (#9461)
- An opt-in feature has been added to the agent to allow
   deprioritizing non coder-related processes for CPU by setting their
   niceness level to 10.
- Opting in to the feature requires setting CODER_PROC_PRIO_MGMT to a non-empty value.
2023-09-14 19:45:05 -05:00
Kyle Carberry 22e781eced chore: add /v2 to import module path (#9072)
* chore: add /v2 to import module path

go mod requires semantic versioning with versions greater than 1.x

This was a mechanical update by running:
```
go install github.com/marwan-at-work/mod/cmd/mod@latest
mod upgrade
```

Migrate generated files to import /v2

* Fix gen
2023-08-18 18:55:43 +00:00
Dean Sheather 07fd73c4a0 chore: allow multiple agent subsystems, add exectrace (#8933) 2023-08-08 22:10:28 -07:00
Cian Johnston 7fcf319e01 fix(cli)!: protect client Logger and refactor cli scaletest tests (#8317)
- (breaking) Protects Logger and LogBodies fields of codersdk.Client with its mutex. This addresses a data race in cli/scaletest.
- Fillets the existing cli/createworkspaces unit test and moves the testing logic there into the tests under scaletest/createworkspaces.
- Adds testutil.RaceEnabled bool const and conditionaly skips previously-skipped tests under scaletest/ if the race detector is enabled. This is unfortunate and sad, but I would prefer to have these tests at least running without the race detector than not running at all.
- Adds IgnoreErrors option to fake in-memory agent loggers; having the agents fail the test immediately when they encounter any sort of error isn't really helpful.
2023-07-06 09:43:39 +01:00
Mathias Fredriksson b4751c72d8 fix(cli/agent): wrap lumberjack logger to prevent re-open (#8229) 2023-06-27 12:49:44 +00:00
Marcin Tojek b1d1b63113 chore: ensure logs consistency across Coder (#8083) 2023-06-20 12:30:45 +02:00
Marcin Tojek 247f8a973f feat: replace ssh maxTimeout with keep-alive mechanism (#8062)
* Bump up coder/ssh

* feat: Set default agent timeout to ~72h

* Address PR comments

* Fix
2023-06-16 15:22:18 +02:00