Commit Graph

85 Commits

Author SHA1 Message Date
Steven Masley c339aa99ee chore: use header auth over cookies for agents (#22226) (cherry-pick/v2.31) (#22313)
All non-browser connections should not use cookies

(cherry picked from commit 3353e687e7)
2026-02-26 11:11:00 -06:00
Kacper Sawicki f016d9e505 fix(coderd): add role param to agent RPC to prevent false connectivity (#22052)
## Summary

coder-logstream-kube and other tools that use the agent token to connect
to the RPC endpoint were incorrectly triggering connection monitoring,
causing false connected/disconnected timestamps on the agent. This led
to VSCode/JetBrains disconnections and incorrect dashboard status.

## Changes

Add a `role` query parameter to `/api/v2/workspaceagents/me/rpc`:
- `role=agent`: triggers connection monitoring (default for the agent
SDK)
- any other value (e.g. `logstream-kube`): skips connection monitoring
- omitted: triggers monitoring for backward compatibility with older
agents

The agent SDK now sends `role=agent` by default. A new `Role` field on
the `agentsdk.Client` allows non-agent callers to specify a different
role.

## Required follow-up

coder-logstream-kube needs to set `client.Role = "logstream-kube"`
before calling `ConnectRPC20()`. Without that change, it will still send
`role=agent` and trigger monitoring.

Fixes #21625
2026-02-18 09:44:06 +01:00
Danielle Maywood 2de8cdf160 feat(agent): add subagent ID fields to devcontainers in manifest (#21848)
Update the agent protobuf schema (agent/proto/agent.proto) to include:
- subagent_id field in WorkspaceAgentDevcontainer message
- id field in CreateSubAgentRequest message

Bump the Agent API version from v2.7 to v2.8 and update all client
references throughout the codebase (ConnectRPC27 -> ConnectRPC28,
DRPCAgentClient27 -> DRPCAgentClient28).
2026-02-03 12:37:30 +00:00
Spike Curtis bddb808b25 chore: arrange imports in a standard way (#21452)
Fixes all our Go file imports to match the preferred spec that we've _mostly_ been using. For example:

```
import (
	"context"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"golang.org/x/xerrors"
	"gopkg.in/natefinch/lumberjack.v2"

	"cdr.dev/slog/v3"
	"github.com/coder/coder/v2/codersdk/agentsdk"
	"github.com/coder/serpent"
)
```

3 groups: standard library, 3rd partly libs, Coder libs.

This PR makes the change across the codebase. The PR in the stack above modifies our formatting to maintain this state of affairs, and is a separate PR so it's possible to review that one in detail.
2026-01-08 15:24:11 +04:00
Spike Curtis 49b34a716a fix: fix slog to always use array of Fields (#21426)
Upgrades to slog v3 which includes a small, but backward incompatible API change to the acceptible call arguments when logging. This change allows us to verify via compile time type checking that arguments are correct and won't cause a panic, as was possible in slog v1, which this replaces (v2 was tagged but never used in coder/coder).

It also updates dependencies that also use slog and were updated.

I've left the `aibridge` dependency as a commit SHA, under the assumption that the team there (cc @pawbana @dannykopping ) will tag and update the dependency soon and on their own schedule.

Other dependencies, I pushed new tags.
2026-01-08 10:29:41 +04:00
Zach 9d1493a13a feat: add initial API for boundary log forwarding to coderd (#21293)
Add the AgentAPI changes to support the feature that transmits boundary
logs from workspaces to coderd via the agent API for eventual re-emission to
stderr. The API handlers are stubs for now because I'm trying to land
this feature from multiple smaller PRs.

High level architecture:
- Boundary records resource access in batches and sends proto message to
agent
- Agent proxies messages to coderd **(captured by the API changes in
this PR)**
- coderd re-emits logs to stderr

RFC:
https://www.notion.so/coderhq/Agent-Boundary-Logs-2afd579be59280f29629fc9823ac41ba
2025-12-19 10:41:39 -07:00
Spike Curtis dc21699e8c fix: use correct slog arguments (#20721)
Fixes a bad slog.Error() command that didn't wrap the error in `slog.Error`
2025-11-11 22:44:48 +04:00
Spike Curtis 5c2b9a5b82 chore: refactor codersdk.Client creation with functional args (#19759)
Adds ClientBuilder to build a codersdk.Client. This is a safer pattern than the current usage which modifies properties of the Client after creating it, opening us up to race conditions.

Refactors agentsdk to use the builder.
2025-09-22 15:59:41 +04:00
Spike Curtis 18945a7949 chore: refactor CLI agent auth tests as unit tests (#19609)
Fixes https://github.com/coder/internal/issues/933

Refactors CLI tests that check the `--auth` flag parsing for various public clouds into a unit test that just creates the agent Client and asserts on the type.

Testing that the agent client actually authenticates correctly with these auth types is well covered by Coderd tests, so we don't need to retread that ground here, and the deleted tests were flaky on Windows.
2025-09-03 10:49:19 +04:00
Spike Curtis 1354d84eb4 chore: refactor instance identity to be a SessionTokenProvider (#19566)
Refactors Agent instance identity to be a SessionTokenProvider.

Refactors the CLI to create Agent clients via a centralized function, rather than add-hoc via individual command handlers and their flags.

This allows commands besides `coder agent`, but which still use the agent identity, to support instance identity authentication.

Fixes #19111 by unifying all API requests to go thru the SessionTokenProvider for auth credentials.
2025-09-03 10:38:42 +04:00
Dean Sheather a1b87a67c6 fix: use client preferred URL for the default DERP (#18911)
The agentsdk currently does a remap of the DERP map to change the
EmbeddedRelay node's URL to match the agent's access URL.

This PR makes changes to the `workspacesdk` (used by clients like the
CLI) and `vpn` (used by Coder Desktop) to match this behavior.

This enables us the ability to try Coder clients in dogfood over a VPN
without changing the global access URL.
2025-07-17 20:17:44 +10:00
Ethan 7a339a1ffe feat: add connectionlogs API (#18628)
This is the second PR for moving connection events out of the audit log.

This PR:
- Adds the `/api/v2/connectionlog` endpoint
- Adds filtering for `GetAuthorizedConnectionLogsOffset` and thus the endpoint. 
There's quite a few, but I was aiming for feature parity with the audit log.
  1. `organization:<id|name>`
  2. `workspace_owner:<username>`
  3. `workspace_owner_email:<email>`
  4. `type:<ssh|vscode|jetbrains|reconnecting_pty|workspace_app|port_forwarding>`
  5. `username:<username>` 
     - Only includes web-based connection events (workspace apps, web port forwarding) as only those include user metadata.
  6. `user_email:<email>`
  7. `connected_after:<time>`
  8. `connected_before:<time>`
  9. `workspace_id:<id>`
  10. `connection_id:<id>`
      - If you have one snapshot of the connection log, and some sessions are ongoing in that snapshot, you could use this filter to check if they've been closed since.
  11. `status:<connected|disconnected>`
       - If `connected` only sessions with a null `close_time` are returned, if `disconnected`, only those with a non-null `close_time`. If filter is omitted, both are returned.
       
Future PRs:
- Populate `count` on `ConnectionLogResponse` using a seperate query (to preemptively mitigate the issue described in #17689)
- Implement a table in the Web UI for viewing connection logs.
- Write a query to delete old events from the audit log, call it from dbpurge.
- Write documentation for the endpoint / feature (including these filters)
2025-07-15 14:55:34 +10:00
Mathias Fredriksson ae0c8701bb feat(agent): disable devcontainers for sub agents (#18303)
Updates coder/internal#621
Refs #18245
2025-06-10 10:47:02 +00:00
Bruno Quaresma d779126ee3 chore: rollback PR #18081 (#18104)
Rollback https://github.com/coder/coder/pull/18081
2025-05-29 13:12:13 -03:00
Danielle Maywood b712d0b23f feat(coderd/agentapi): implement sub agent api (#17823)
Closes https://github.com/coder/internal/issues/619

Implement the `coderd` side of the AgentAPI for the upcoming
dev-container agents work.

`agent/agenttest/client.go` is left unimplemented for a future PR
working to implement the agent side of this feature.
2025-05-29 12:15:47 +01:00
Bruno Quaresma 2ec7404197 chore: make owner_name and owner_username consistent (#18081)
We've been using owner_name inconsistently as username. So this PR fixes
it by making the attribute naming more consistent.
2025-05-28 17:25:32 -03:00
Danielle Maywood 61f22a59ba feat(agent): add ParentId to agent manifest (#17888)
Closes https://github.com/coder/internal/issues/648

This change introduces a new `ParentId` field to the agent's manifest.
This will allow an agent to know if it is a child or not, as well as
knowing who the owner is.

This is part of the Dev Container Agents work
2025-05-19 16:09:56 +01:00
Sas Swart 425ee6fa55 feat: reinitialize agents when a prebuilt workspace is claimed (#17475)
This pull request allows coder workspace agents to be reinitialized when
a prebuilt workspace is claimed by a user. This facilitates the transfer
of ownership between the anonymous prebuilds system user and the new
owner of the workspace.

Only a single agent per prebuilt workspace is supported for now, but
plumbing has already been done to facilitate the seamless transition to
multi-agent support.

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Danny Kopping <dannykopping@gmail.com>
2025-05-14 14:15:36 +02:00
Steven Masley 37832413ba chore: resolve internal drpc package conflict (#17770)
Our internal drpc package name conflicts with the external one in usage. 
`drpc.*` == external
`drpcsdk.*` == internal
2025-05-12 10:31:38 -05:00
Cian Johnston 979687c37f chore(codersdk): deprecate WorkspaceAppStatus.{NeedsUserAttention,Icon} (#17358)
https://github.com/coder/coder/pull/17163 introduced the
`workspace_app_statuses` table. Two of these fields
(`needs_user_attention`, `icon`) turned out to be surplus to
requirements.

- Removes columns `needs_user_attention` and `icon` from
`workspace_app_statuses`
- Marks the corresponding fields of `codersdk.WorkspaceAppStatus` as
deprecated.
2025-04-15 10:47:42 +01:00
Kyle Carberry 8ea956fc11 feat: add app status tracking to the backend (#17163)
This does ~95% of the backend work required to integrate the AI work.

Most left to integrate from the tasks branch is just frontend, which
will be a lot smaller I believe.

The real difference between this branch and that one is the abstraction
-- this now attaches statuses to apps, and returns the latest status
reported as part of a workspace.

This change enables us to have a similar UX to in the tasks branch, but
for agents other than Claude Code as well. Any app can report status
now.
2025-03-31 10:55:44 -04:00
Mathias Fredriksson 69ba27e347 feat: allow specifying devcontainer on agent in terraform (#16997)
This change allows specifying devcontainers in terraform and plumbs it
through to the agent via agent manifest.

This will be used for autostarting devcontainers in a workspace.

Depends on coder/terraform-provider-coder#368
Updates #16423
2025-03-20 19:09:39 +02:00
Mathias Fredriksson b07b33ec9d feat: add agentapi endpoint to report connections for audit (#16507)
This change adds a new `ReportConnection` endpoint to the `agentapi`.

The protocol version was bumped previously, so it has been omitted here.

This allows the agent to report connection events, for example when the
user connects to the workspace via SSH or VS Code.

Updates #15139
2025-02-20 14:52:01 +02:00
Vincent Vielle bc609d0056 feat: integrate agentAPI with resources monitoring logic (#16438)
As part of the new resources monitoring logic - more specifically for
OOM & OOD Notifications , we need to update the AgentAPI , and the
agents logic.

This PR aims to do it, and more specifically :  
We are updating the AgentAPI & TailnetAPI to version 24 to add two new
methods in the AgentAPI :
- One method to fetch the resources monitoring configuration
- One method to push the datapoints for the resources monitoring.

Also, this PR adds a new logic on the agent side, with a routine running
and ticking - fetching the resources usage each time , but also storing
it in a FIFO like queue.

Finally, this PR fixes a problem we had with RBAC logic on the resources
monitoring model, applying the same logic than we have for similar
entities.
2025-02-14 10:28:15 +01:00
Spike Curtis 2c7f8ac65f chore: migrate to coder/websocket 1.8.12 (#15898)
Migrates us to `coder/websocket` v1.8.12 rather than `nhooyr/websocket` on an older version.

Works around https://github.com/coder/websocket/issues/504 by adding an explicit test for `xerrors.Is(err, io.EOF)` where we were previously getting `io.EOF` from the netConn.
2024-12-19 00:51:30 +04:00
Spike Curtis 40802958e9 fix: use explicit api versions for agent and tailnet (#15508)
Bumps the Tailnet and Agent API version 2.3, and creates some extra controls and machinery around these versions.

What happened is that we accidentally shipped two new API features without bumping the version.  `ScriptCompleted` on the Agent API in Coder v2.16 and `RefreshResumeToken` on the Tailnet API in Coder v2.15.

Since we can't easily retroactively bump the versions, we'll roll these changes into API version 2.3 along with the new WorkspaceUpdates RPC, which hasn't been released yet.  That means there is some ambiguity in Coder v2.15-v2.17 about exactly what methods are supported on the Tailnet and Agent APIs.  This isn't great, but hasn't caused us major issues because 

1. RefreshResumeToken is considered optional, and clients just log and move on if the RPC isn't supported. 
2. Agents basically never get started talking to a Coderd that is older than they are, since the agent binary is normally downloaded from Coderd at workspace start.

Still it's good to get things squared away in terms of versions for SDK users and possible edge cases around client and server versions.

To mitigate against this thing happening again, this PR also:

1. adds a CODEOWNERS for the API proto packages, so I'll review changes
2. defines interface types for different API versions, and has the agent explicitly use a specific version.  That way, if you add a new method, and try to use it in the agent without thinking explicitly about versions, it won't compile.

With the protocol controllers stuff, we've sort of already abstracted the Tailnet API such that the interface type strategy won't work, but I'll work on getting the Controller to be version aware, such that it can check the API version it's getting against the controllers it has -- in a later PR.
2024-11-15 11:16:28 +04:00
Spike Curtis f6639b788f feat: add ConnectRPC variants for older Agent API versions (#13778) 2024-07-03 16:11:05 +04:00
Ethan dd243686e4 chore!: remove deprecated agent v1 routes (#13486) 2024-06-11 12:22:59 +10:00
Colin Adler 9d00a26a90 fix: add missing route for codersdk.PostLogSource (#13421) 2024-06-03 12:29:50 -05:00
Spike Curtis e76b595052 fix: use a native websocket.NetConn for agent RPC client (#13142)
One cause of #13139 is a peculiar failure mode of `WebsocketNetConn` which causes it to return `context.Canceled` in some circumstances when the underlying websocket fails.  We have special processing for that error in the `agent.run()` routine, which is erroneously being triggered.

Since we don't actually need the returned context from `WebsocketNetConn`, we can simplify and just use the netConn from the `websocket` library directly.
2024-05-06 15:00:34 +04:00
Spike Curtis b0afffbafb feat: use v2 API for agent metadata updates (#12281)
Switches the agent to report metadata over the v2 API.

Fixes #10534
2024-02-26 09:50:19 +04:00
Spike Curtis aa7a9f5cc4 feat: use v2 API for agent lifecycle updates (#12278)
Agent uses the v2 API to post lifecycle updates.

Part of #10534
2024-02-23 15:24:28 +04:00
Spike Curtis 4cc132cea0 feat: switch agent to use v2 API for sending logs (#12068)
Changes the agent to use the new v2 API for sending logs, via the logSender component.

We keep the PatchLogs function around, but deprecate it so that we can test the v1 endpoint.
2024-02-23 11:27:15 +04:00
Dean Sheather 2fc3064653 chore: add tests for app ID copy in app healths (#12088) 2024-02-12 05:49:48 +00:00
Spike Curtis 1f5a6d59ba chore: consolidate websocketNetConn implementations (#12065)
Consolidates websocketNetConn from multiple packages in favor of a central one in codersdk
2024-02-09 11:39:08 +04:00
Spike Curtis 151aaadc23 fix: allow startup scripts larger than 32k (#12060)
Fixes #12057 and adds a regression test.
2024-02-07 22:26:42 +04:00
Spike Curtis 1cf4b62867 feat: change agent to use v2 API for reporting stats (#12024)
Modifies the agent to use the v2 API to report its statistics, using the `statsReporter` subcomponent.
2024-02-07 15:26:41 +04:00
Spike Curtis 1aa117b9ec chore: rename client Listen to ConnectRPC (#11916)
ConnectRPC seems more appropriate for this function
2024-02-01 14:44:11 +04:00
Spike Curtis 073d1f7078 chore: remove pingWebSocket since yamux runs keepalives (#11914)
Since we run yamux over the websocket, we don't need to ping at the websocket layer because yamux has a 30 second keepalive mechanism enabled in the default config.
2024-02-01 09:48:58 +04:00
Spike Curtis 13e214f7f1 feat: add logging to agent yamux session (#11912)
Log yamux errors and warnings in the agent.
2024-02-01 08:18:13 +04:00
Spike Curtis 0fc177203e feat: use agent v2 API to update app health (#11889)
Use the Agent v2 API to update App Health
2024-01-30 11:35:12 +04:00
Spike Curtis 2599850e54 feat: use agent v2 API to post startup (#11877)
Uses the v2 Agent API to post startup information.
2024-01-30 11:23:28 +04:00
Spike Curtis da8bb1c198 feat: use agent v2 API to fetch manifest (#11832)
Agent uses the v2 API to obtain the manifest, instead of the HTTP API.
2024-01-30 10:11:28 +04:00
Spike Curtis 13e24f21e4 feat: use Agent v2 API for Service Banner (#11806)
Agent uses the v2 API for the service banner, rather than the v1 HTTP API.

One of several for #10534
2024-01-30 07:44:47 +04:00
Spike Curtis 059e533544 feat: agent uses Tailnet v2 API for DERPMap updates (#11698)
Switches the Agent to use Tailnet v2 API to get DERPMap updates.

Subsequent PRs will do the same for the CLI (`codersdk`) and `wsproxy`.
2024-01-23 14:42:07 +04:00
Spike Curtis 3e0e7f8739 feat: check agent API version on connection (#11696)
fixes #10531

Adds a check for `version` on connection to the Agent API websocket endpoint.  This is primarily for future-proofing, so that up-level agents get a sensible error if they connect to a back-level Coderd.

It also refactors the location of the `CurrentVersion` variables, to be part of the `proto` packages, since the versions refer to the APIs defined therein.
2024-01-23 14:27:49 +04:00
Spike Curtis f01cab9894 feat: use tailnet v2 API for coordination (#11638)
This one is huge, and I'm sorry.

The problem is that once I change `tailnet.Conn` to start doing v2 behavior, I kind of have to change it everywhere, including in CoderSDK (CLI), the agent, wsproxy, and ServerTailnet.

There is still a bit more cleanup to do, and I need to add code so that when we lose connection to the Coordinator, we mark all peers as LOST, but that will be in a separate PR since this is big enough!
2024-01-22 11:07:50 +04:00
Mathias Fredriksson df3c310379 feat(cli): add coder open vscode (#11191)
Fixes #7667
2024-01-02 20:46:18 +02:00
Szabolcs Fruhwald baf3bf6b9c feat: add workspace_id, owner_name to agent manifest (#10199)
Co-authored-by: Kyle Carberry <kyle@carberry.com>
Co-authored-by: Atif Ali <atif@coder.com>
2023-12-04 00:41:54 +03:00
Jon Ayers 51b58cfc98 fix: only update last_used_at when connection count > 0 (#10808) 2023-11-21 18:10:41 -06:00