feat: implement agent socket api, client and cli (#20758)

closes: https://github.com/coder/coder/issues/10352
closes: https://github.com/coder/internal/issues/1094
closes: https://github.com/coder/internal/issues/1095

In this pull request, we enable a new set of experimental cli commands
grouped under `coder exp sync`.
These commands allow any process acting within a coder workspace to
inform the coder agent of its requirements and execution progress. The
coder agent will then relay this information to other processes that
have subscribed.

These commands are:
```
# Check if this feature is enabled in your environment 
coder exp sync ping

# express that your unit depends on another
coder exp sync want <unit> <dependency_unit> 

# express that your unit intends to start a portion of the script that requires 
# other units to have completed first. This command blocks until all dependencies have been met
coder exp sync start <unit> 

# express that your unit has completes its work, allowing dependent units to begin their execution
coder exp sync complete <unit>
```

Example:

In order to automatically run claude code in a new workspace, it must
first have a git repository cloned. The scripts responsible for cloning
the repository and for running claude code would coordinate in the
following way:

```bash
# Script A: Claude code

# Inform the agent that the claude script wants the git script.
# That is, the git script must have completed before the claude script can begin its execution
coder exp sync want claude git

# Inform the agent that we would now like to begin execution of claude.
# This command will block until the git script (and any other defined dependencies)
# have completed
coder exp sync start claude

# Now we run claude code and any other commands we need
claude ...

# Once our script has completed, we inform the agent, so that any scripts that depend on this one
# may begin their execution

coder exp sync complete claude
```

```bash
# Script B: Git

# Because the git script does not have any dependencies, we can simply inform the agent that we 
# intend to start
coder exp sync start git

git clone ssh://git@github.com/coder/coder

# Once the repository have been cloned, we inform the agent that this script is complete, so that
# scripts that depend on it may begin their execution.
coder exp sync complete git
```

Notes:
* Unit names (ie. `claude` and `git`) given as input to the sync
commands are arbitrary strings. You do not have to conform to specific
identifiers. We recommend naming your scripts descriptively, but
succinctly.
* Scripts unit names should be well documented. Other scripts will need
to know the names you've chosen in order to depend on yours. Therefore,
you

---------

Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>
This commit is contained in:
Sas Swart
2025-11-28 08:33:50 +02:00
committed by GitHub
parent ee58f40cad
commit ce627bf23f
37 changed files with 1314 additions and 277 deletions
+39
View File
@@ -41,6 +41,7 @@ import (
"github.com/coder/coder/v2/agent/agentcontainers"
"github.com/coder/coder/v2/agent/agentexec"
"github.com/coder/coder/v2/agent/agentscripts"
"github.com/coder/coder/v2/agent/agentsocket"
"github.com/coder/coder/v2/agent/agentssh"
"github.com/coder/coder/v2/agent/proto"
"github.com/coder/coder/v2/agent/proto/resourcesmonitor"
@@ -97,6 +98,8 @@ type Options struct {
Devcontainers bool
DevcontainerAPIOptions []agentcontainers.Option // Enable Devcontainers for these to be effective.
Clock quartz.Clock
SocketServerEnabled bool
SocketPath string // Path for the agent socket server socket
}
type Client interface {
@@ -202,6 +205,8 @@ func New(options Options) Agent {
devcontainers: options.Devcontainers,
containerAPIOptions: options.DevcontainerAPIOptions,
socketPath: options.SocketPath,
socketServerEnabled: options.SocketServerEnabled,
}
// Initially, we have a closed channel, reflecting the fact that we are not initially connected.
// Each time we connect we replace the channel (while holding the closeMutex) with a new one
@@ -279,6 +284,10 @@ type agent struct {
devcontainers bool
containerAPIOptions []agentcontainers.Option
containerAPI *agentcontainers.API
socketServerEnabled bool
socketPath string
socketServer *agentsocket.Server
}
func (a *agent) TailnetConn() *tailnet.Conn {
@@ -358,9 +367,32 @@ func (a *agent) init() {
s.ExperimentalContainers = a.devcontainers
},
)
a.initSocketServer()
go a.runLoop()
}
// initSocketServer initializes server that allows direct communication with a workspace agent using IPC.
func (a *agent) initSocketServer() {
if !a.socketServerEnabled {
a.logger.Info(a.hardCtx, "socket server is disabled")
return
}
server, err := agentsocket.NewServer(
a.logger.Named("socket"),
agentsocket.WithPath(a.socketPath),
)
if err != nil {
a.logger.Warn(a.hardCtx, "failed to create socket server", slog.Error(err), slog.F("path", a.socketPath))
return
}
a.socketServer = server
a.logger.Debug(a.hardCtx, "socket server started", slog.F("path", a.socketPath))
}
// runLoop attempts to start the agent in a retry loop.
// Coder may be offline temporarily, a connection issue
// may be happening, but regardless after the intermittent
@@ -1928,6 +1960,7 @@ func (a *agent) Close() error {
lifecycleState = codersdk.WorkspaceAgentLifecycleShutdownError
}
}
a.setLifecycle(lifecycleState)
err = a.scriptRunner.Close()
@@ -1935,6 +1968,12 @@ func (a *agent) Close() error {
a.logger.Error(a.hardCtx, "script runner close", slog.Error(err))
}
if a.socketServer != nil {
if err := a.socketServer.Close(); err != nil {
a.logger.Error(a.hardCtx, "socket server close", slog.Error(err))
}
}
if err := a.containerAPI.Close(); err != nil {
a.logger.Error(a.hardCtx, "container API close", slog.Error(err))
}
+146
View File
@@ -0,0 +1,146 @@
package agentsocket
import (
"context"
"golang.org/x/xerrors"
"storj.io/drpc"
"storj.io/drpc/drpcconn"
"github.com/coder/coder/v2/agent/agentsocket/proto"
"github.com/coder/coder/v2/agent/unit"
)
// Option represents a configuration option for NewClient.
type Option func(*options)
type options struct {
path string
}
// WithPath sets the socket path. If not provided or empty, the client will
// auto-discover the default socket path.
func WithPath(path string) Option {
return func(opts *options) {
if path == "" {
return
}
opts.path = path
}
}
// Client provides a client for communicating with the workspace agentsocket API.
type Client struct {
client proto.DRPCAgentSocketClient
conn drpc.Conn
}
// NewClient creates a new socket client and opens a connection to the socket.
// If path is not provided via WithPath or is empty, it will auto-discover the
// default socket path.
func NewClient(ctx context.Context, opts ...Option) (*Client, error) {
options := &options{}
for _, opt := range opts {
opt(options)
}
conn, err := dialSocket(ctx, options.path)
if err != nil {
return nil, xerrors.Errorf("connect to socket: %w", err)
}
drpcConn := drpcconn.New(conn)
client := proto.NewDRPCAgentSocketClient(drpcConn)
return &Client{
client: client,
conn: drpcConn,
}, nil
}
// Close closes the socket connection.
func (c *Client) Close() error {
return c.conn.Close()
}
// Ping sends a ping request to the agent.
func (c *Client) Ping(ctx context.Context) error {
_, err := c.client.Ping(ctx, &proto.PingRequest{})
return err
}
// SyncStart starts a unit in the dependency graph.
func (c *Client) SyncStart(ctx context.Context, unitName unit.ID) error {
_, err := c.client.SyncStart(ctx, &proto.SyncStartRequest{
Unit: string(unitName),
})
return err
}
// SyncWant declares a dependency between units.
func (c *Client) SyncWant(ctx context.Context, unitName, dependsOn unit.ID) error {
_, err := c.client.SyncWant(ctx, &proto.SyncWantRequest{
Unit: string(unitName),
DependsOn: string(dependsOn),
})
return err
}
// SyncComplete marks a unit as complete in the dependency graph.
func (c *Client) SyncComplete(ctx context.Context, unitName unit.ID) error {
_, err := c.client.SyncComplete(ctx, &proto.SyncCompleteRequest{
Unit: string(unitName),
})
return err
}
// SyncReady requests whether a unit is ready to be started. That is, all dependencies are satisfied.
func (c *Client) SyncReady(ctx context.Context, unitName unit.ID) (bool, error) {
resp, err := c.client.SyncReady(ctx, &proto.SyncReadyRequest{
Unit: string(unitName),
})
return resp.Ready, err
}
// SyncStatus gets the status of a unit and its dependencies.
func (c *Client) SyncStatus(ctx context.Context, unitName unit.ID) (SyncStatusResponse, error) {
resp, err := c.client.SyncStatus(ctx, &proto.SyncStatusRequest{
Unit: string(unitName),
})
if err != nil {
return SyncStatusResponse{}, err
}
var dependencies []DependencyInfo
for _, dep := range resp.Dependencies {
dependencies = append(dependencies, DependencyInfo{
DependsOn: unit.ID(dep.DependsOn),
RequiredStatus: unit.Status(dep.RequiredStatus),
CurrentStatus: unit.Status(dep.CurrentStatus),
IsSatisfied: dep.IsSatisfied,
})
}
return SyncStatusResponse{
UnitName: unitName,
Status: unit.Status(resp.Status),
IsReady: resp.IsReady,
Dependencies: dependencies,
}, nil
}
// SyncStatusResponse contains the status information for a unit.
type SyncStatusResponse struct {
UnitName unit.ID `table:"unit,default_sort" json:"unit_name"`
Status unit.Status `table:"status" json:"status"`
IsReady bool `table:"ready" json:"is_ready"`
Dependencies []DependencyInfo `table:"dependencies" json:"dependencies"`
}
// DependencyInfo contains information about a unit dependency.
type DependencyInfo struct {
DependsOn unit.ID `table:"depends on,default_sort" json:"depends_on"`
RequiredStatus unit.Status `table:"required status" json:"required_status"`
CurrentStatus unit.Status `table:"current status" json:"current_status"`
IsSatisfied bool `table:"satisfied" json:"is_satisfied"`
}
+11 -58
View File
@@ -7,8 +7,6 @@ import (
"sync"
"golang.org/x/xerrors"
"github.com/hashicorp/yamux"
"storj.io/drpc/drpcmux"
"storj.io/drpc/drpcserver"
@@ -33,11 +31,17 @@ type Server struct {
wg sync.WaitGroup
}
func NewServer(path string, logger slog.Logger) (*Server, error) {
// NewServer creates a new agent socket server.
func NewServer(logger slog.Logger, opts ...Option) (*Server, error) {
options := &options{}
for _, opt := range opts {
opt(options)
}
logger = logger.Named("agentsocket-server")
server := &Server{
logger: logger,
path: path,
path: options.path,
service: &DRPCAgentSocketService{
logger: logger,
unitManager: unit.NewManager(),
@@ -61,14 +65,6 @@ func NewServer(path string, logger slog.Logger) (*Server, error) {
},
})
if server.path == "" {
var err error
server.path, err = getDefaultSocketPath()
if err != nil {
return nil, xerrors.Errorf("get default socket path: %w", err)
}
}
listener, err := createSocket(server.path)
if err != nil {
return nil, xerrors.Errorf("create socket: %w", err)
@@ -91,6 +87,7 @@ func NewServer(path string, logger slog.Logger) (*Server, error) {
return server, nil
}
// Close stops the server and cleans up resources.
func (s *Server) Close() error {
s.mu.Lock()
@@ -134,52 +131,8 @@ func (s *Server) acceptConnections() {
return
}
for {
select {
case <-s.ctx.Done():
return
default:
}
conn, err := listener.Accept()
err := s.drpcServer.Serve(s.ctx, listener)
if err != nil {
s.logger.Warn(s.ctx, "error accepting connection", slog.Error(err))
continue
}
s.mu.Lock()
if s.listener == nil {
s.mu.Unlock()
_ = conn.Close()
return
}
s.wg.Add(1)
s.mu.Unlock()
go func() {
defer s.wg.Done()
s.handleConnection(conn)
}()
}
}
func (s *Server) handleConnection(conn net.Conn) {
defer conn.Close()
s.logger.Debug(s.ctx, "new connection accepted", slog.F("remote_addr", conn.RemoteAddr()))
config := yamux.DefaultConfig()
config.LogOutput = nil
config.Logger = slog.Stdlib(s.ctx, s.logger.Named("agentsocket-yamux"), slog.LevelInfo)
session, err := yamux.Server(conn, config)
if err != nil {
s.logger.Warn(s.ctx, "failed to create yamux session", slog.Error(err))
return
}
defer session.Close()
err = s.drpcServer.Serve(s.ctx, session)
if err != nil {
s.logger.Debug(s.ctx, "drpc server finished", slog.Error(err))
s.logger.Warn(s.ctx, "error serving drpc server", slog.Error(err))
}
}
+90 -4
View File
@@ -1,14 +1,24 @@
package agentsocket_test
import (
"context"
"path/filepath"
"runtime"
"testing"
"github.com/google/uuid"
"github.com/spf13/afero"
"github.com/stretchr/testify/require"
"cdr.dev/slog"
"github.com/coder/coder/v2/agent"
"github.com/coder/coder/v2/agent/agentsocket"
"github.com/coder/coder/v2/agent/agenttest"
agentproto "github.com/coder/coder/v2/agent/proto"
"github.com/coder/coder/v2/codersdk/agentsdk"
"github.com/coder/coder/v2/tailnet"
"github.com/coder/coder/v2/tailnet/tailnettest"
"github.com/coder/coder/v2/testutil"
)
func TestServer(t *testing.T) {
@@ -23,7 +33,7 @@ func TestServer(t *testing.T) {
socketPath := filepath.Join(t.TempDir(), "test.sock")
logger := slog.Make().Leveled(slog.LevelDebug)
server, err := agentsocket.NewServer(socketPath, logger)
server, err := agentsocket.NewServer(logger, agentsocket.WithPath(socketPath))
require.NoError(t, err)
require.NoError(t, server.Close())
})
@@ -33,10 +43,10 @@ func TestServer(t *testing.T) {
socketPath := filepath.Join(t.TempDir(), "test.sock")
logger := slog.Make().Leveled(slog.LevelDebug)
server1, err := agentsocket.NewServer(socketPath, logger)
server1, err := agentsocket.NewServer(logger, agentsocket.WithPath(socketPath))
require.NoError(t, err)
defer server1.Close()
_, err = agentsocket.NewServer(socketPath, logger)
_, err = agentsocket.NewServer(logger, agentsocket.WithPath(socketPath))
require.ErrorContains(t, err, "create socket")
})
@@ -45,8 +55,84 @@ func TestServer(t *testing.T) {
socketPath := filepath.Join(t.TempDir(), "test.sock")
logger := slog.Make().Leveled(slog.LevelDebug)
server, err := agentsocket.NewServer(socketPath, logger)
server, err := agentsocket.NewServer(logger, agentsocket.WithPath(socketPath))
require.NoError(t, err)
require.NoError(t, server.Close())
})
}
func TestServerWindowsNotSupported(t *testing.T) {
t.Parallel()
if runtime.GOOS != "windows" {
t.Skip("this test only runs on Windows")
}
t.Run("NewServer", func(t *testing.T) {
t.Parallel()
socketPath := filepath.Join(t.TempDir(), "test.sock")
logger := slog.Make().Leveled(slog.LevelDebug)
_, err := agentsocket.NewServer(logger, agentsocket.WithPath(socketPath))
require.ErrorContains(t, err, "agentsocket is not supported on Windows")
})
t.Run("NewClient", func(t *testing.T) {
t.Parallel()
_, err := agentsocket.NewClient(context.Background(), agentsocket.WithPath("test.sock"))
require.ErrorContains(t, err, "agentsocket is not supported on Windows")
})
}
func TestAgentInitializesOnWindowsWithoutSocketServer(t *testing.T) {
t.Parallel()
if runtime.GOOS != "windows" {
t.Skip("this test only runs on Windows")
}
ctx := testutil.Context(t, testutil.WaitShort)
logger := testutil.Logger(t).Named("agent")
derpMap, _ := tailnettest.RunDERPAndSTUN(t)
coordinator := tailnet.NewCoordinator(logger)
t.Cleanup(func() {
_ = coordinator.Close()
})
statsCh := make(chan *agentproto.Stats, 50)
agentID := uuid.New()
manifest := agentsdk.Manifest{
AgentID: agentID,
AgentName: "test-agent",
WorkspaceName: "test-workspace",
OwnerName: "test-user",
WorkspaceID: uuid.New(),
DERPMap: derpMap,
}
client := agenttest.NewClient(t, logger.Named("agenttest"), agentID, manifest, statsCh, coordinator)
t.Cleanup(client.Close)
options := agent.Options{
Client: client,
Filesystem: afero.NewMemMapFs(),
Logger: logger.Named("agent"),
ReconnectingPTYTimeout: testutil.WaitShort,
EnvironmentVariables: map[string]string{},
SocketPath: "",
}
agnt := agent.New(options)
t.Cleanup(func() {
_ = agnt.Close()
})
startup := testutil.TryReceive(ctx, t, client.GetStartup())
require.NotNil(t, startup, "agent should send startup message")
err := agnt.Close()
require.NoError(t, err, "agent should close cleanly")
}
+12 -2
View File
@@ -15,15 +15,18 @@ var _ proto.DRPCAgentSocketServer = (*DRPCAgentSocketService)(nil)
var ErrUnitManagerNotAvailable = xerrors.New("unit manager not available")
// DRPCAgentSocketService implements the DRPC agent socket service.
type DRPCAgentSocketService struct {
unitManager *unit.Manager
logger slog.Logger
}
// Ping responds to a ping request to check if the service is alive.
func (*DRPCAgentSocketService) Ping(_ context.Context, _ *proto.PingRequest) (*proto.PingResponse, error) {
return &proto.PingResponse{}, nil
}
// SyncStart starts a unit in the dependency graph.
func (s *DRPCAgentSocketService) SyncStart(_ context.Context, req *proto.SyncStartRequest) (*proto.SyncStartResponse, error) {
if s.unitManager == nil {
return nil, xerrors.Errorf("SyncStart: %w", ErrUnitManagerNotAvailable)
@@ -53,6 +56,7 @@ func (s *DRPCAgentSocketService) SyncStart(_ context.Context, req *proto.SyncSta
return &proto.SyncStartResponse{}, nil
}
// SyncWant declares a dependency between units.
func (s *DRPCAgentSocketService) SyncWant(_ context.Context, req *proto.SyncWantRequest) (*proto.SyncWantResponse, error) {
if s.unitManager == nil {
return nil, xerrors.Errorf("cannot add dependency: %w", ErrUnitManagerNotAvailable)
@@ -72,6 +76,7 @@ func (s *DRPCAgentSocketService) SyncWant(_ context.Context, req *proto.SyncWant
return &proto.SyncWantResponse{}, nil
}
// SyncComplete marks a unit as complete in the dependency graph.
func (s *DRPCAgentSocketService) SyncComplete(_ context.Context, req *proto.SyncCompleteRequest) (*proto.SyncCompleteResponse, error) {
if s.unitManager == nil {
return nil, xerrors.Errorf("cannot complete unit: %w", ErrUnitManagerNotAvailable)
@@ -86,6 +91,7 @@ func (s *DRPCAgentSocketService) SyncComplete(_ context.Context, req *proto.Sync
return &proto.SyncCompleteResponse{}, nil
}
// SyncReady checks whether a unit is ready to be started. That is, all dependencies are satisfied.
func (s *DRPCAgentSocketService) SyncReady(_ context.Context, req *proto.SyncReadyRequest) (*proto.SyncReadyResponse, error) {
if s.unitManager == nil {
return nil, xerrors.Errorf("cannot check readiness: %w", ErrUnitManagerNotAvailable)
@@ -102,6 +108,7 @@ func (s *DRPCAgentSocketService) SyncReady(_ context.Context, req *proto.SyncRea
}, nil
}
// SyncStatus gets the status of a unit and lists its dependencies.
func (s *DRPCAgentSocketService) SyncStatus(_ context.Context, req *proto.SyncStatusRequest) (*proto.SyncStatusResponse, error) {
if s.unitManager == nil {
return nil, xerrors.Errorf("cannot get status for unit %q: %w", req.Unit, ErrUnitManagerNotAvailable)
@@ -115,8 +122,11 @@ func (s *DRPCAgentSocketService) SyncStatus(_ context.Context, req *proto.SyncSt
}
dependencies, err := s.unitManager.GetAllDependencies(unitID)
if err != nil {
return nil, xerrors.Errorf("failed to get dependencies: %w", err)
switch {
case errors.Is(err, unit.ErrUnitNotFound):
dependencies = []unit.Dependency{}
case err != nil:
return nil, xerrors.Errorf("cannot get dependencies: %w", err)
}
var depInfos []*proto.DependencyInfo
+92 -173
View File
@@ -5,21 +5,18 @@ import (
"crypto/sha256"
"encoding/hex"
"fmt"
"net"
"os"
"path/filepath"
"runtime"
"testing"
"github.com/hashicorp/yamux"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
"cdr.dev/slog"
"github.com/coder/coder/v2/agent/agentsocket"
"github.com/coder/coder/v2/agent/agentsocket/proto"
"github.com/coder/coder/v2/agent/unit"
"github.com/coder/coder/v2/codersdk/drpcsdk"
"github.com/coder/coder/v2/testutil"
)
// tempDirUnixSocket returns a temporary directory that can safely hold unix
@@ -47,23 +44,15 @@ func tempDirUnixSocket(t *testing.T) string {
}
// newSocketClient creates a DRPC client connected to the Unix socket at the given path.
func newSocketClient(t *testing.T, socketPath string) proto.DRPCAgentSocketClient {
func newSocketClient(ctx context.Context, t *testing.T, socketPath string) *agentsocket.Client {
t.Helper()
conn, err := net.Dial("unix", socketPath)
require.NoError(t, err)
config := yamux.DefaultConfig()
config.Logger = nil
session, err := yamux.Client(conn, config)
require.NoError(t, err)
client := proto.NewDRPCAgentSocketClient(drpcsdk.MultiplexedConn(session))
client, err := agentsocket.NewClient(ctx, agentsocket.WithPath(socketPath))
t.Cleanup(func() {
_ = session.Close()
_ = conn.Close()
_ = client.Close()
})
require.NoError(t, err)
return client
}
@@ -78,17 +67,17 @@ func TestDRPCAgentSocketService(t *testing.T) {
t.Parallel()
socketPath := filepath.Join(tempDirUnixSocket(t), "test.sock")
ctx := testutil.Context(t, testutil.WaitShort)
server, err := agentsocket.NewServer(
socketPath,
slog.Make().Leveled(slog.LevelDebug),
agentsocket.WithPath(socketPath),
)
require.NoError(t, err)
defer server.Close()
client := newSocketClient(t, socketPath)
client := newSocketClient(ctx, t, socketPath)
_, err = client.Ping(context.Background(), &proto.PingRequest{})
err = client.Ping(ctx)
require.NoError(t, err)
})
@@ -98,147 +87,116 @@ func TestDRPCAgentSocketService(t *testing.T) {
t.Run("NewUnit", func(t *testing.T) {
t.Parallel()
socketPath := filepath.Join(tempDirUnixSocket(t), "test.sock")
ctx := testutil.Context(t, testutil.WaitShort)
server, err := agentsocket.NewServer(
socketPath,
slog.Make().Leveled(slog.LevelDebug),
agentsocket.WithPath(socketPath),
)
require.NoError(t, err)
defer server.Close()
client := newSocketClient(t, socketPath)
client := newSocketClient(ctx, t, socketPath)
_, err = client.SyncStart(context.Background(), &proto.SyncStartRequest{
Unit: "test-unit",
})
err = client.SyncStart(ctx, "test-unit")
require.NoError(t, err)
status, err := client.SyncStatus(context.Background(), &proto.SyncStatusRequest{
Unit: "test-unit",
})
status, err := client.SyncStatus(ctx, "test-unit")
require.NoError(t, err)
require.Equal(t, "started", status.Status)
require.Equal(t, unit.StatusStarted, status.Status)
})
t.Run("UnitAlreadyStarted", func(t *testing.T) {
t.Parallel()
socketPath := filepath.Join(tempDirUnixSocket(t), "test.sock")
ctx := testutil.Context(t, testutil.WaitShort)
server, err := agentsocket.NewServer(
socketPath,
slog.Make().Leveled(slog.LevelDebug),
agentsocket.WithPath(socketPath),
)
require.NoError(t, err)
defer server.Close()
client := newSocketClient(t, socketPath)
client := newSocketClient(ctx, t, socketPath)
// First Start
_, err = client.SyncStart(context.Background(), &proto.SyncStartRequest{
Unit: "test-unit",
})
err = client.SyncStart(ctx, "test-unit")
require.NoError(t, err)
status, err := client.SyncStatus(context.Background(), &proto.SyncStatusRequest{
Unit: "test-unit",
})
status, err := client.SyncStatus(ctx, "test-unit")
require.NoError(t, err)
require.Equal(t, "started", status.Status)
require.Equal(t, unit.StatusStarted, status.Status)
// Second Start
_, err = client.SyncStart(context.Background(), &proto.SyncStartRequest{
Unit: "test-unit",
})
err = client.SyncStart(ctx, "test-unit")
require.ErrorContains(t, err, unit.ErrSameStatusAlreadySet.Error())
status, err = client.SyncStatus(context.Background(), &proto.SyncStatusRequest{
Unit: "test-unit",
})
status, err = client.SyncStatus(ctx, "test-unit")
require.NoError(t, err)
require.Equal(t, "started", status.Status)
require.Equal(t, unit.StatusStarted, status.Status)
})
t.Run("UnitAlreadyCompleted", func(t *testing.T) {
t.Parallel()
socketPath := filepath.Join(tempDirUnixSocket(t), "test.sock")
ctx := testutil.Context(t, testutil.WaitShort)
server, err := agentsocket.NewServer(
socketPath,
slog.Make().Leveled(slog.LevelDebug),
agentsocket.WithPath(socketPath),
)
require.NoError(t, err)
defer server.Close()
client := newSocketClient(t, socketPath)
client := newSocketClient(ctx, t, socketPath)
// First start
_, err = client.SyncStart(context.Background(), &proto.SyncStartRequest{
Unit: "test-unit",
})
err = client.SyncStart(ctx, "test-unit")
require.NoError(t, err)
status, err := client.SyncStatus(context.Background(), &proto.SyncStatusRequest{
Unit: "test-unit",
})
status, err := client.SyncStatus(ctx, "test-unit")
require.NoError(t, err)
require.Equal(t, "started", status.Status)
require.Equal(t, unit.StatusStarted, status.Status)
// Complete the unit
_, err = client.SyncComplete(context.Background(), &proto.SyncCompleteRequest{
Unit: "test-unit",
})
err = client.SyncComplete(ctx, "test-unit")
require.NoError(t, err)
status, err = client.SyncStatus(context.Background(), &proto.SyncStatusRequest{
Unit: "test-unit",
})
status, err = client.SyncStatus(ctx, "test-unit")
require.NoError(t, err)
require.Equal(t, "completed", status.Status)
require.Equal(t, unit.StatusComplete, status.Status)
// Second start
_, err = client.SyncStart(context.Background(), &proto.SyncStartRequest{
Unit: "test-unit",
})
err = client.SyncStart(ctx, "test-unit")
require.NoError(t, err)
status, err = client.SyncStatus(context.Background(), &proto.SyncStatusRequest{
Unit: "test-unit",
})
status, err = client.SyncStatus(ctx, "test-unit")
require.NoError(t, err)
require.Equal(t, "started", status.Status)
require.Equal(t, unit.StatusStarted, status.Status)
})
t.Run("UnitNotReady", func(t *testing.T) {
t.Parallel()
socketPath := filepath.Join(tempDirUnixSocket(t), "test.sock")
ctx := testutil.Context(t, testutil.WaitShort)
server, err := agentsocket.NewServer(
socketPath,
slog.Make().Leveled(slog.LevelDebug),
agentsocket.WithPath(socketPath),
)
require.NoError(t, err)
defer server.Close()
client := newSocketClient(t, socketPath)
client := newSocketClient(ctx, t, socketPath)
_, err = client.SyncWant(context.Background(), &proto.SyncWantRequest{
Unit: "test-unit",
DependsOn: "dependency-unit",
})
err = client.SyncWant(ctx, "test-unit", "dependency-unit")
require.NoError(t, err)
_, err = client.SyncStart(context.Background(), &proto.SyncStartRequest{
Unit: "test-unit",
})
err = client.SyncStart(ctx, "test-unit")
require.ErrorContains(t, err, "unit not ready")
status, err := client.SyncStatus(context.Background(), &proto.SyncStatusRequest{
Unit: "test-unit",
})
status, err := client.SyncStatus(ctx, "test-unit")
require.NoError(t, err)
require.Equal(t, string(unit.StatusPending), status.Status)
require.Equal(t, unit.StatusPending, status.Status)
require.False(t, status.IsReady)
})
})
@@ -250,107 +208,86 @@ func TestDRPCAgentSocketService(t *testing.T) {
t.Parallel()
socketPath := filepath.Join(tempDirUnixSocket(t), "test.sock")
ctx := testutil.Context(t, testutil.WaitShort)
server, err := agentsocket.NewServer(
socketPath,
slog.Make().Leveled(slog.LevelDebug),
agentsocket.WithPath(socketPath),
)
require.NoError(t, err)
defer server.Close()
client := newSocketClient(t, socketPath)
client := newSocketClient(ctx, t, socketPath)
// If dependency units are not registered, they are registered automatically
_, err = client.SyncWant(context.Background(), &proto.SyncWantRequest{
Unit: "test-unit",
DependsOn: "dependency-unit",
})
err = client.SyncWant(ctx, "test-unit", "dependency-unit")
require.NoError(t, err)
status, err := client.SyncStatus(context.Background(), &proto.SyncStatusRequest{
Unit: "test-unit",
})
status, err := client.SyncStatus(ctx, "test-unit")
require.NoError(t, err)
require.Len(t, status.Dependencies, 1)
require.Equal(t, "dependency-unit", status.Dependencies[0].DependsOn)
require.Equal(t, "completed", status.Dependencies[0].RequiredStatus)
require.Equal(t, unit.ID("dependency-unit"), status.Dependencies[0].DependsOn)
require.Equal(t, unit.StatusComplete, status.Dependencies[0].RequiredStatus)
})
t.Run("DependencyAlreadyRegistered", func(t *testing.T) {
t.Parallel()
socketPath := filepath.Join(tempDirUnixSocket(t), "test.sock")
ctx := testutil.Context(t, testutil.WaitShort)
server, err := agentsocket.NewServer(
socketPath,
slog.Make().Leveled(slog.LevelDebug),
agentsocket.WithPath(socketPath),
)
require.NoError(t, err)
defer server.Close()
client := newSocketClient(t, socketPath)
client := newSocketClient(ctx, t, socketPath)
// Start the dependency unit
_, err = client.SyncStart(context.Background(), &proto.SyncStartRequest{
Unit: "dependency-unit",
})
err = client.SyncStart(ctx, "dependency-unit")
require.NoError(t, err)
status, err := client.SyncStatus(context.Background(), &proto.SyncStatusRequest{
Unit: "dependency-unit",
})
status, err := client.SyncStatus(ctx, "dependency-unit")
require.NoError(t, err)
require.Equal(t, "started", status.Status)
require.Equal(t, unit.StatusStarted, status.Status)
// Add the dependency after the dependency unit has already started
_, err = client.SyncWant(context.Background(), &proto.SyncWantRequest{
Unit: "test-unit",
DependsOn: "dependency-unit",
})
err = client.SyncWant(ctx, "test-unit", "dependency-unit")
// Dependencies can be added even if the dependency unit has already started
require.NoError(t, err)
// The dependency is now reflected in the test unit's status
status, err = client.SyncStatus(context.Background(), &proto.SyncStatusRequest{
Unit: "test-unit",
})
status, err = client.SyncStatus(ctx, "test-unit")
require.NoError(t, err)
require.Equal(t, "dependency-unit", status.Dependencies[0].DependsOn)
require.Equal(t, "completed", status.Dependencies[0].RequiredStatus)
require.Equal(t, unit.ID("dependency-unit"), status.Dependencies[0].DependsOn)
require.Equal(t, unit.StatusComplete, status.Dependencies[0].RequiredStatus)
})
t.Run("DependencyAddedAfterDependentStarted", func(t *testing.T) {
t.Parallel()
socketPath := filepath.Join(tempDirUnixSocket(t), "test.sock")
ctx := testutil.Context(t, testutil.WaitShort)
server, err := agentsocket.NewServer(
socketPath,
slog.Make().Leveled(slog.LevelDebug),
agentsocket.WithPath(socketPath),
)
require.NoError(t, err)
defer server.Close()
client := newSocketClient(t, socketPath)
client := newSocketClient(ctx, t, socketPath)
// Start the dependent unit
_, err = client.SyncStart(context.Background(), &proto.SyncStartRequest{
Unit: "test-unit",
})
err = client.SyncStart(ctx, "test-unit")
require.NoError(t, err)
status, err := client.SyncStatus(context.Background(), &proto.SyncStatusRequest{
Unit: "test-unit",
})
status, err := client.SyncStatus(ctx, "test-unit")
require.NoError(t, err)
require.Equal(t, "started", status.Status)
require.Equal(t, unit.StatusStarted, status.Status)
// Add the dependency after the dependency unit has already started
_, err = client.SyncWant(context.Background(), &proto.SyncWantRequest{
Unit: "test-unit",
DependsOn: "dependency-unit",
})
err = client.SyncWant(ctx, "test-unit", "dependency-unit")
// Dependencies can be added even if the dependent unit has already started.
// The dependency applies the next time a unit is started. The current status is not updated.
@@ -359,12 +296,10 @@ func TestDRPCAgentSocketService(t *testing.T) {
require.NoError(t, err)
// The dependency is now reflected in the test unit's status
status, err = client.SyncStatus(context.Background(), &proto.SyncStatusRequest{
Unit: "test-unit",
})
status, err = client.SyncStatus(ctx, "test-unit")
require.NoError(t, err)
require.Equal(t, "dependency-unit", status.Dependencies[0].DependsOn)
require.Equal(t, "completed", status.Dependencies[0].RequiredStatus)
require.Equal(t, unit.ID("dependency-unit"), status.Dependencies[0].DependsOn)
require.Equal(t, unit.StatusComplete, status.Dependencies[0].RequiredStatus)
})
})
@@ -375,96 +310,80 @@ func TestDRPCAgentSocketService(t *testing.T) {
t.Parallel()
socketPath := filepath.Join(tempDirUnixSocket(t), "test.sock")
ctx := testutil.Context(t, testutil.WaitShort)
server, err := agentsocket.NewServer(
socketPath,
slog.Make().Leveled(slog.LevelDebug),
agentsocket.WithPath(socketPath),
)
require.NoError(t, err)
defer server.Close()
client := newSocketClient(t, socketPath)
client := newSocketClient(ctx, t, socketPath)
response, err := client.SyncReady(context.Background(), &proto.SyncReadyRequest{
Unit: "unregistered-unit",
})
ready, err := client.SyncReady(ctx, "unregistered-unit")
require.NoError(t, err)
require.False(t, response.Ready)
require.True(t, ready)
})
t.Run("UnitNotReady", func(t *testing.T) {
t.Parallel()
socketPath := filepath.Join(tempDirUnixSocket(t), "test.sock")
ctx := testutil.Context(t, testutil.WaitShort)
server, err := agentsocket.NewServer(
socketPath,
slog.Make().Leveled(slog.LevelDebug),
agentsocket.WithPath(socketPath),
)
require.NoError(t, err)
defer server.Close()
client := newSocketClient(t, socketPath)
client := newSocketClient(ctx, t, socketPath)
// Register a unit with an unsatisfied dependency
_, err = client.SyncWant(context.Background(), &proto.SyncWantRequest{
Unit: "test-unit",
DependsOn: "dependency-unit",
})
err = client.SyncWant(ctx, "test-unit", "dependency-unit")
require.NoError(t, err)
// Check readiness - should be false because dependency is not satisfied
response, err := client.SyncReady(context.Background(), &proto.SyncReadyRequest{
Unit: "test-unit",
})
ready, err := client.SyncReady(ctx, "test-unit")
require.NoError(t, err)
require.False(t, response.Ready)
require.False(t, ready)
})
t.Run("UnitReady", func(t *testing.T) {
t.Parallel()
socketPath := filepath.Join(tempDirUnixSocket(t), "test.sock")
ctx := testutil.Context(t, testutil.WaitShort)
server, err := agentsocket.NewServer(
socketPath,
slog.Make().Leveled(slog.LevelDebug),
agentsocket.WithPath(socketPath),
)
require.NoError(t, err)
defer server.Close()
client := newSocketClient(t, socketPath)
client := newSocketClient(ctx, t, socketPath)
// Register a unit with no dependencies - should be ready immediately
_, err = client.SyncStart(context.Background(), &proto.SyncStartRequest{
Unit: "test-unit",
})
err = client.SyncStart(ctx, "test-unit")
require.NoError(t, err)
// Check readiness - should be true
_, err = client.SyncReady(context.Background(), &proto.SyncReadyRequest{
Unit: "test-unit",
})
ready, err := client.SyncReady(ctx, "test-unit")
require.NoError(t, err)
require.True(t, ready)
// Also test a unit with satisfied dependencies
_, err = client.SyncWant(context.Background(), &proto.SyncWantRequest{
Unit: "dependent-unit",
DependsOn: "test-unit",
})
err = client.SyncWant(ctx, "dependent-unit", "test-unit")
require.NoError(t, err)
// Complete the dependency
_, err = client.SyncComplete(context.Background(), &proto.SyncCompleteRequest{
Unit: "test-unit",
})
err = client.SyncComplete(ctx, "test-unit")
require.NoError(t, err)
// Now dependent-unit should be ready
_, err = client.SyncReady(context.Background(), &proto.SyncReadyRequest{
Unit: "dependent-unit",
})
ready, err = client.SyncReady(ctx, "dependent-unit")
require.NoError(t, err)
require.True(t, ready)
})
})
}
+17 -27
View File
@@ -3,8 +3,7 @@
package agentsocket
import (
"crypto/rand"
"encoding/hex"
"context"
"net"
"os"
"path/filepath"
@@ -13,8 +12,13 @@ import (
"golang.org/x/xerrors"
)
// createSocket creates a Unix domain socket listener
const defaultSocketPath = "/tmp/coder-agent.sock"
func createSocket(path string) (net.Listener, error) {
if path == "" {
path = defaultSocketPath
}
if !isSocketAvailable(path) {
return nil, xerrors.Errorf("socket path %s is not available", path)
}
@@ -23,7 +27,6 @@ func createSocket(path string) (net.Listener, error) {
return nil, xerrors.Errorf("remove existing socket: %w", err)
}
// Create parent directory if it doesn't exist
parentDir := filepath.Dir(path)
if err := os.MkdirAll(parentDir, 0o700); err != nil {
return nil, xerrors.Errorf("create socket directory: %w", err)
@@ -41,43 +44,30 @@ func createSocket(path string) (net.Listener, error) {
return listener, nil
}
// getDefaultSocketPath returns the default socket path for Unix-like systems
func getDefaultSocketPath() (string, error) {
randomBytes := make([]byte, 4)
if _, err := rand.Read(randomBytes); err != nil {
return "", xerrors.Errorf("generate random socket name: %w", err)
}
randomSuffix := hex.EncodeToString(randomBytes)
// Try XDG_RUNTIME_DIR first
if runtimeDir := os.Getenv("XDG_RUNTIME_DIR"); runtimeDir != "" {
return filepath.Join(runtimeDir, "coder-agent-"+randomSuffix+".sock"), nil
}
return filepath.Join("/tmp", "coder-agent-"+randomSuffix+".sock"), nil
}
// CleanupSocket removes the socket file
func cleanupSocket(path string) error {
return os.Remove(path)
}
// isSocketAvailable checks if a socket path is available for use
func isSocketAvailable(path string) bool {
// Check if file exists
if _, err := os.Stat(path); os.IsNotExist(err) {
return true
}
// Try to connect to see if it's actually listening
// Try to connect to see if it's actually listening.
dialer := net.Dialer{Timeout: 10 * time.Second}
conn, err := dialer.Dial("unix", path)
if err != nil {
// If we can't connect, the socket is not in use
// Socket is available for use
return true
}
_ = conn.Close()
// Socket is in use
return false
}
func dialSocket(ctx context.Context, path string) (net.Conn, error) {
if path == "" {
path = defaultSocketPath
}
dialer := net.Dialer{}
return dialer.DialContext(ctx, "unix", path)
}
+5 -10
View File
@@ -3,25 +3,20 @@
package agentsocket
import (
"context"
"net"
"golang.org/x/xerrors"
)
// createSocket returns an error indicating that agentsocket is not supported on Windows.
// This feature is unix-only in its current experimental state.
func createSocket(_ string) (net.Listener, error) {
return nil, xerrors.New("agentsocket is not supported on Windows")
}
// getDefaultSocketPath returns an error indicating that agentsocket is not supported on Windows.
// This feature is unix-only in its current experimental state.
func getDefaultSocketPath() (string, error) {
return "", xerrors.New("agentsocket is not supported on Windows")
}
// cleanupSocket is a no-op on Windows since agentsocket is not supported.
func cleanupSocket(_ string) error {
// No-op since agentsocket is not supported on Windows
return nil
}
func dialSocket(_ context.Context, _ string) (net.Conn, error) {
return nil, xerrors.New("agentsocket is not supported on Windows")
}
+11 -1
View File
@@ -2,6 +2,7 @@ package unit
import (
"errors"
"fmt"
"sync"
"golang.org/x/xerrors"
@@ -23,6 +24,15 @@ var (
// Status represents the status of a unit.
type Status string
var _ fmt.Stringer = Status("")
func (s Status) String() string {
if s == StatusNotRegistered {
return "not registered"
}
return string(s)
}
// Status constants for dependency tracking.
const (
StatusNotRegistered Status = ""
@@ -137,7 +147,7 @@ func (m *Manager) IsReady(id ID) (bool, error) {
defer m.mu.RUnlock()
if !m.registered(id) {
return false, nil
return true, nil
}
return m.units[id].ready, nil
+1 -1
View File
@@ -684,7 +684,7 @@ func TestManager_IsReady(t *testing.T) {
// Then: the unit is not ready
isReady, err := manager.IsReady(unitA)
require.NoError(t, err)
assert.False(t, isReady)
assert.True(t, isReady)
})
}
+17
View File
@@ -57,6 +57,8 @@ func workspaceAgent() *serpent.Command {
devcontainers bool
devcontainerProjectDiscovery bool
devcontainerDiscoveryAutostart bool
socketServerEnabled bool
socketPath string
)
agentAuth := &AgentAuth{}
cmd := &serpent.Command{
@@ -317,6 +319,8 @@ func workspaceAgent() *serpent.Command {
agentcontainers.WithProjectDiscovery(devcontainerProjectDiscovery),
agentcontainers.WithDiscoveryAutostart(devcontainerDiscoveryAutostart),
},
SocketPath: socketPath,
SocketServerEnabled: socketServerEnabled,
})
if debugAddress != "" {
@@ -477,6 +481,19 @@ func workspaceAgent() *serpent.Command {
Description: "Allow the agent to autostart devcontainer projects it discovers based on their configuration.",
Value: serpent.BoolOf(&devcontainerDiscoveryAutostart),
},
{
Flag: "socket-server-enabled",
Default: "false",
Env: "CODER_AGENT_SOCKET_SERVER_ENABLED",
Description: "Enable the agent socket server.",
Value: serpent.BoolOf(&socketServerEnabled),
},
{
Flag: "socket-path",
Env: "CODER_AGENT_SOCKET_PATH",
Description: "Specify the path for the agent socket.",
Value: serpent.StringOf(&socketPath),
},
}
agentAuth.AttachOptions(cmd, false)
return cmd
+1
View File
@@ -150,6 +150,7 @@ func (r *RootCmd) AGPLExperimental() []*serpent.Command {
r.mcpCommand(),
r.promptExample(),
r.rptyCommand(),
r.syncCommand(),
r.boundary(),
}
}
+25
View File
@@ -72,6 +72,31 @@ func TestCommandHelp(t *testing.T) {
Name: "coder provisioner jobs list --output json",
Cmd: []string{"provisioner", "jobs", "list", "--output", "json"},
},
// TODO (SasSwart): Remove these once the sync commands are promoted out of experimental.
clitest.CommandHelpCase{
Name: "coder exp sync --help",
Cmd: []string{"exp", "sync", "--help"},
},
clitest.CommandHelpCase{
Name: "coder exp sync ping --help",
Cmd: []string{"exp", "sync", "ping", "--help"},
},
clitest.CommandHelpCase{
Name: "coder exp sync start --help",
Cmd: []string{"exp", "sync", "start", "--help"},
},
clitest.CommandHelpCase{
Name: "coder exp sync want --help",
Cmd: []string{"exp", "sync", "want", "--help"},
},
clitest.CommandHelpCase{
Name: "coder exp sync complete --help",
Cmd: []string{"exp", "sync", "complete", "--help"},
},
clitest.CommandHelpCase{
Name: "coder exp sync status --help",
Cmd: []string{"exp", "sync", "status", "--help"},
},
))
}
+35
View File
@@ -0,0 +1,35 @@
package cli
import (
"github.com/coder/serpent"
)
func (r *RootCmd) syncCommand() *serpent.Command {
var socketPath string
cmd := &serpent.Command{
Use: "sync",
Short: "Manage unit dependencies for coordinated startup",
Long: "Commands for orchestrating unit startup order in workspaces. Units are most commonly coder scripts. Use these commands to declare dependencies between units, coordinate their startup sequence, and ensure units start only after their dependencies are ready. This helps prevent race conditions and startup failures.",
Handler: func(i *serpent.Invocation) error {
return i.Command.HelpHandler(i)
},
Children: []*serpent.Command{
r.syncPing(&socketPath),
r.syncStart(&socketPath),
r.syncWant(&socketPath),
r.syncComplete(&socketPath),
r.syncStatus(&socketPath),
},
Options: serpent.OptionSet{
{
Flag: "socket-path",
Env: "CODER_AGENT_SOCKET_PATH",
Description: "Specify the path for the agent socket.",
Value: serpent.StringOf(&socketPath),
},
},
}
return cmd
}
+47
View File
@@ -0,0 +1,47 @@
package cli
import (
"golang.org/x/xerrors"
"github.com/coder/coder/v2/agent/agentsocket"
"github.com/coder/coder/v2/agent/unit"
"github.com/coder/coder/v2/cli/cliui"
"github.com/coder/serpent"
)
func (*RootCmd) syncComplete(socketPath *string) *serpent.Command {
cmd := &serpent.Command{
Use: "complete <unit>",
Short: "Mark a unit as complete",
Long: "Mark a unit as complete. Indicating to other units that it has completed its work. This allows units that depend on it to proceed with their startup.",
Handler: func(i *serpent.Invocation) error {
ctx := i.Context()
if len(i.Args) != 1 {
return xerrors.New("exactly one unit name is required")
}
unit := unit.ID(i.Args[0])
opts := []agentsocket.Option{}
if *socketPath != "" {
opts = append(opts, agentsocket.WithPath(*socketPath))
}
client, err := agentsocket.NewClient(ctx, opts...)
if err != nil {
return xerrors.Errorf("connect to agent socket: %w", err)
}
defer client.Close()
if err := client.SyncComplete(ctx, unit); err != nil {
return xerrors.Errorf("complete unit failed: %w", err)
}
cliui.Info(i.Stdout, "Success")
return nil
},
}
return cmd
}
+42
View File
@@ -0,0 +1,42 @@
package cli
import (
"golang.org/x/xerrors"
"github.com/coder/coder/v2/agent/agentsocket"
"github.com/coder/coder/v2/cli/cliui"
"github.com/coder/serpent"
)
func (*RootCmd) syncPing(socketPath *string) *serpent.Command {
cmd := &serpent.Command{
Use: "ping",
Short: "Test agent socket connectivity and health",
Long: "Test connectivity to the local Coder agent socket to verify the agent is running and responsive. Useful for troubleshooting startup issues or verifying the agent is accessible before running other sync commands.",
Handler: func(i *serpent.Invocation) error {
ctx := i.Context()
opts := []agentsocket.Option{}
if *socketPath != "" {
opts = append(opts, agentsocket.WithPath(*socketPath))
}
client, err := agentsocket.NewClient(ctx, opts...)
if err != nil {
return xerrors.Errorf("connect to agent socket: %w", err)
}
defer client.Close()
err = client.Ping(ctx)
if err != nil {
return xerrors.Errorf("ping failed: %w", err)
}
cliui.Info(i.Stdout, "Success")
return nil
},
}
return cmd
}
+101
View File
@@ -0,0 +1,101 @@
package cli
import (
"context"
"time"
"golang.org/x/xerrors"
"github.com/coder/serpent"
"github.com/coder/coder/v2/agent/agentsocket"
"github.com/coder/coder/v2/agent/unit"
"github.com/coder/coder/v2/cli/cliui"
)
const (
syncPollInterval = 1 * time.Second
)
func (*RootCmd) syncStart(socketPath *string) *serpent.Command {
var timeout time.Duration
cmd := &serpent.Command{
Use: "start <unit>",
Short: "Wait until all unit dependencies are satisfied",
Long: "Wait until all dependencies are satisfied, consider the unit to have started, then allow it to proceed. This command polls until dependencies are ready, then marks the unit as started.",
Handler: func(i *serpent.Invocation) error {
ctx := i.Context()
if len(i.Args) != 1 {
return xerrors.New("exactly one unit name is required")
}
unitName := unit.ID(i.Args[0])
if timeout > 0 {
var cancel context.CancelFunc
ctx, cancel = context.WithTimeout(ctx, timeout)
defer cancel()
}
opts := []agentsocket.Option{}
if *socketPath != "" {
opts = append(opts, agentsocket.WithPath(*socketPath))
}
client, err := agentsocket.NewClient(ctx, opts...)
if err != nil {
return xerrors.Errorf("connect to agent socket: %w", err)
}
defer client.Close()
ready, err := client.SyncReady(ctx, unitName)
if err != nil {
return xerrors.Errorf("error checking dependencies: %w", err)
}
if !ready {
cliui.Infof(i.Stdout, "Waiting for dependencies of unit '%s' to be satisfied...", unitName)
ticker := time.NewTicker(syncPollInterval)
defer ticker.Stop()
pollLoop:
for {
select {
case <-ctx.Done():
if ctx.Err() == context.DeadlineExceeded {
return xerrors.Errorf("timeout waiting for dependencies of unit '%s'", unitName)
}
return ctx.Err()
case <-ticker.C:
ready, err := client.SyncReady(ctx, unitName)
if err != nil {
return xerrors.Errorf("error checking dependencies: %w", err)
}
if ready {
break pollLoop
}
}
}
}
if err := client.SyncStart(ctx, unitName); err != nil {
return xerrors.Errorf("start unit failed: %w", err)
}
cliui.Info(i.Stdout, "Success")
return nil
},
}
cmd.Options = append(cmd.Options, serpent.Option{
Flag: "timeout",
Description: "Maximum time to wait for dependencies (e.g., 30s, 5m). 5m by default.",
Value: serpent.DurationOf(&timeout),
Default: "5m",
})
return cmd
}
+88
View File
@@ -0,0 +1,88 @@
package cli
import (
"fmt"
"golang.org/x/xerrors"
"github.com/coder/serpent"
"github.com/coder/coder/v2/agent/agentsocket"
"github.com/coder/coder/v2/agent/unit"
"github.com/coder/coder/v2/cli/cliui"
)
func (*RootCmd) syncStatus(socketPath *string) *serpent.Command {
formatter := cliui.NewOutputFormatter(
cliui.ChangeFormatterData(
cliui.TableFormat(
[]agentsocket.DependencyInfo{},
[]string{
"depends on",
"required status",
"current status",
"satisfied",
},
),
func(data any) (any, error) {
resp, ok := data.(agentsocket.SyncStatusResponse)
if !ok {
return nil, xerrors.Errorf("expected agentsocket.SyncStatusResponse, got %T", data)
}
return resp.Dependencies, nil
}),
cliui.JSONFormat(),
)
cmd := &serpent.Command{
Use: "status <unit>",
Short: "Show unit status and dependency state",
Long: "Show the current status of a unit, whether it is ready to start, and lists its dependencies. Shows which dependencies are satisfied and which are still pending. Supports multiple output formats.",
Handler: func(i *serpent.Invocation) error {
ctx := i.Context()
if len(i.Args) != 1 {
return xerrors.New("exactly one unit name is required")
}
unit := unit.ID(i.Args[0])
opts := []agentsocket.Option{}
if *socketPath != "" {
opts = append(opts, agentsocket.WithPath(*socketPath))
}
client, err := agentsocket.NewClient(ctx, opts...)
if err != nil {
return xerrors.Errorf("connect to agent socket: %w", err)
}
defer client.Close()
statusResp, err := client.SyncStatus(ctx, unit)
if err != nil {
return xerrors.Errorf("get status failed: %w", err)
}
var out string
header := fmt.Sprintf("Unit: %s\nStatus: %s\nReady: %t\n\nDependencies:\n", unit, statusResp.Status, statusResp.IsReady)
if formatter.FormatID() == "table" && len(statusResp.Dependencies) == 0 {
out = header + "No dependencies found"
} else {
out, err = formatter.Format(ctx, statusResp)
if err != nil {
return xerrors.Errorf("format status: %w", err)
}
if formatter.FormatID() == "table" {
out = header + out
}
}
_, _ = fmt.Fprintln(i.Stdout, out)
return nil
},
}
formatter.AttachOptions(&cmd.Options)
return cmd
}
+330
View File
@@ -0,0 +1,330 @@
//go:build !windows
package cli_test
import (
"bytes"
"context"
"os"
"path/filepath"
"testing"
"time"
"github.com/stretchr/testify/require"
"cdr.dev/slog"
"github.com/coder/coder/v2/agent/agentsocket"
"github.com/coder/coder/v2/cli/clitest"
"github.com/coder/coder/v2/testutil"
)
// setupSocketServer creates an agentsocket server at a temporary path for testing.
// Returns the socket path and a cleanup function. The path should be passed to
// sync commands via the --socket-path flag.
func setupSocketServer(t *testing.T) (path string, cleanup func()) {
t.Helper()
// Use a temporary socket path for each test
socketPath := filepath.Join(tempDirUnixSocket(t), "test.sock")
// Create parent directory if needed
parentDir := filepath.Dir(socketPath)
err := os.MkdirAll(parentDir, 0o700)
require.NoError(t, err, "create socket directory")
server, err := agentsocket.NewServer(
slog.Make().Leveled(slog.LevelDebug),
agentsocket.WithPath(socketPath),
)
require.NoError(t, err, "create socket server")
// Return cleanup function
return socketPath, func() {
err := server.Close()
require.NoError(t, err, "close socket server")
_ = os.Remove(socketPath)
}
}
func TestSyncCommands_Golden(t *testing.T) {
t.Parallel()
t.Run("ping", func(t *testing.T) {
t.Parallel()
path, cleanup := setupSocketServer(t)
defer cleanup()
ctx := testutil.Context(t, testutil.WaitShort)
var outBuf bytes.Buffer
inv, _ := clitest.New(t, "exp", "sync", "ping", "--socket-path", path)
inv.Stdout = &outBuf
inv.Stderr = &outBuf
err := inv.WithContext(ctx).Run()
require.NoError(t, err)
clitest.TestGoldenFile(t, "TestSyncCommands_Golden/ping_success", outBuf.Bytes(), nil)
})
t.Run("start_no_dependencies", func(t *testing.T) {
t.Parallel()
path, cleanup := setupSocketServer(t)
defer cleanup()
ctx := testutil.Context(t, testutil.WaitShort)
var outBuf bytes.Buffer
inv, _ := clitest.New(t, "exp", "sync", "start", "test-unit", "--socket-path", path)
inv.Stdout = &outBuf
inv.Stderr = &outBuf
err := inv.WithContext(ctx).Run()
require.NoError(t, err)
clitest.TestGoldenFile(t, "TestSyncCommands_Golden/start_no_dependencies", outBuf.Bytes(), nil)
})
t.Run("start_with_dependencies", func(t *testing.T) {
t.Parallel()
path, cleanup := setupSocketServer(t)
defer cleanup()
ctx := testutil.Context(t, testutil.WaitShort)
// Set up dependency: test-unit depends on dep-unit
client, err := agentsocket.NewClient(ctx, agentsocket.WithPath(path))
require.NoError(t, err)
// Declare dependency
err = client.SyncWant(ctx, "test-unit", "dep-unit")
require.NoError(t, err)
client.Close()
// Start a goroutine to complete the dependency after a short delay
// This simulates the dependency being satisfied while start is waiting
// The delay ensures the "Waiting..." message appears in the output
done := make(chan error, 1)
go func() {
// Wait a moment to let the start command begin waiting and print the message
time.Sleep(100 * time.Millisecond)
compCtx := context.Background()
compClient, err := agentsocket.NewClient(compCtx, agentsocket.WithPath(path))
if err != nil {
done <- err
return
}
defer compClient.Close()
// Start and complete the dependency unit
err = compClient.SyncStart(compCtx, "dep-unit")
if err != nil {
done <- err
return
}
err = compClient.SyncComplete(compCtx, "dep-unit")
done <- err
}()
var outBuf bytes.Buffer
inv, _ := clitest.New(t, "exp", "sync", "start", "test-unit", "--socket-path", path)
inv.Stdout = &outBuf
inv.Stderr = &outBuf
// Run the start command - it should wait for the dependency
err = inv.WithContext(ctx).Run()
require.NoError(t, err)
// Ensure the completion goroutine finished
select {
case err := <-done:
require.NoError(t, err, "complete dependency")
case <-time.After(time.Second):
// Goroutine should have finished by now
}
clitest.TestGoldenFile(t, "TestSyncCommands_Golden/start_with_dependencies", outBuf.Bytes(), nil)
})
t.Run("want", func(t *testing.T) {
t.Parallel()
path, cleanup := setupSocketServer(t)
defer cleanup()
ctx := testutil.Context(t, testutil.WaitShort)
var outBuf bytes.Buffer
inv, _ := clitest.New(t, "exp", "sync", "want", "test-unit", "dep-unit", "--socket-path", path)
inv.Stdout = &outBuf
inv.Stderr = &outBuf
err := inv.WithContext(ctx).Run()
require.NoError(t, err)
clitest.TestGoldenFile(t, "TestSyncCommands_Golden/want_success", outBuf.Bytes(), nil)
})
t.Run("complete", func(t *testing.T) {
t.Parallel()
path, cleanup := setupSocketServer(t)
defer cleanup()
ctx := testutil.Context(t, testutil.WaitShort)
// First start the unit
client, err := agentsocket.NewClient(ctx, agentsocket.WithPath(path))
require.NoError(t, err)
err = client.SyncStart(ctx, "test-unit")
require.NoError(t, err)
client.Close()
var outBuf bytes.Buffer
inv, _ := clitest.New(t, "exp", "sync", "complete", "test-unit", "--socket-path", path)
inv.Stdout = &outBuf
inv.Stderr = &outBuf
err = inv.WithContext(ctx).Run()
require.NoError(t, err)
clitest.TestGoldenFile(t, "TestSyncCommands_Golden/complete_success", outBuf.Bytes(), nil)
})
t.Run("status_pending", func(t *testing.T) {
t.Parallel()
path, cleanup := setupSocketServer(t)
defer cleanup()
ctx := testutil.Context(t, testutil.WaitShort)
// Set up a unit with unsatisfied dependency
client, err := agentsocket.NewClient(ctx, agentsocket.WithPath(path))
require.NoError(t, err)
err = client.SyncWant(ctx, "test-unit", "dep-unit")
require.NoError(t, err)
client.Close()
var outBuf bytes.Buffer
inv, _ := clitest.New(t, "exp", "sync", "status", "test-unit", "--socket-path", path)
inv.Stdout = &outBuf
inv.Stderr = &outBuf
err = inv.WithContext(ctx).Run()
require.NoError(t, err)
clitest.TestGoldenFile(t, "TestSyncCommands_Golden/status_pending", outBuf.Bytes(), nil)
})
t.Run("status_started", func(t *testing.T) {
t.Parallel()
path, cleanup := setupSocketServer(t)
defer cleanup()
ctx := testutil.Context(t, testutil.WaitShort)
// Start a unit
client, err := agentsocket.NewClient(ctx, agentsocket.WithPath(path))
require.NoError(t, err)
err = client.SyncStart(ctx, "test-unit")
require.NoError(t, err)
client.Close()
var outBuf bytes.Buffer
inv, _ := clitest.New(t, "exp", "sync", "status", "test-unit", "--socket-path", path)
inv.Stdout = &outBuf
inv.Stderr = &outBuf
err = inv.WithContext(ctx).Run()
require.NoError(t, err)
clitest.TestGoldenFile(t, "TestSyncCommands_Golden/status_started", outBuf.Bytes(), nil)
})
t.Run("status_completed", func(t *testing.T) {
t.Parallel()
path, cleanup := setupSocketServer(t)
defer cleanup()
ctx := testutil.Context(t, testutil.WaitShort)
// Start and complete a unit
client, err := agentsocket.NewClient(ctx, agentsocket.WithPath(path))
require.NoError(t, err)
err = client.SyncStart(ctx, "test-unit")
require.NoError(t, err)
err = client.SyncComplete(ctx, "test-unit")
require.NoError(t, err)
client.Close()
var outBuf bytes.Buffer
inv, _ := clitest.New(t, "exp", "sync", "status", "test-unit", "--socket-path", path)
inv.Stdout = &outBuf
inv.Stderr = &outBuf
err = inv.WithContext(ctx).Run()
require.NoError(t, err)
clitest.TestGoldenFile(t, "TestSyncCommands_Golden/status_completed", outBuf.Bytes(), nil)
})
t.Run("status_with_dependencies", func(t *testing.T) {
t.Parallel()
path, cleanup := setupSocketServer(t)
defer cleanup()
ctx := testutil.Context(t, testutil.WaitShort)
// Set up a unit with dependencies, some satisfied, some not
client, err := agentsocket.NewClient(ctx, agentsocket.WithPath(path))
require.NoError(t, err)
err = client.SyncWant(ctx, "test-unit", "dep-1")
require.NoError(t, err)
err = client.SyncWant(ctx, "test-unit", "dep-2")
require.NoError(t, err)
// Complete dep-1, leave dep-2 incomplete
err = client.SyncStart(ctx, "dep-1")
require.NoError(t, err)
err = client.SyncComplete(ctx, "dep-1")
require.NoError(t, err)
client.Close()
var outBuf bytes.Buffer
inv, _ := clitest.New(t, "exp", "sync", "status", "test-unit", "--socket-path", path)
inv.Stdout = &outBuf
inv.Stderr = &outBuf
err = inv.WithContext(ctx).Run()
require.NoError(t, err)
clitest.TestGoldenFile(t, "TestSyncCommands_Golden/status_with_dependencies", outBuf.Bytes(), nil)
})
t.Run("status_json_format", func(t *testing.T) {
t.Parallel()
path, cleanup := setupSocketServer(t)
defer cleanup()
ctx := testutil.Context(t, testutil.WaitShort)
// Set up a unit with dependencies
client, err := agentsocket.NewClient(ctx, agentsocket.WithPath(path))
require.NoError(t, err)
err = client.SyncWant(ctx, "test-unit", "dep-unit")
require.NoError(t, err)
err = client.SyncStart(ctx, "dep-unit")
require.NoError(t, err)
err = client.SyncComplete(ctx, "dep-unit")
require.NoError(t, err)
client.Close()
var outBuf bytes.Buffer
inv, _ := clitest.New(t, "exp", "sync", "status", "test-unit", "--output", "json", "--socket-path", path)
inv.Stdout = &outBuf
inv.Stderr = &outBuf
err = inv.WithContext(ctx).Run()
require.NoError(t, err)
clitest.TestGoldenFile(t, "TestSyncCommands_Golden/status_json_format", outBuf.Bytes(), nil)
})
}
+49
View File
@@ -0,0 +1,49 @@
package cli
import (
"golang.org/x/xerrors"
"github.com/coder/serpent"
"github.com/coder/coder/v2/agent/agentsocket"
"github.com/coder/coder/v2/agent/unit"
"github.com/coder/coder/v2/cli/cliui"
)
func (*RootCmd) syncWant(socketPath *string) *serpent.Command {
cmd := &serpent.Command{
Use: "want <unit> <depends-on>",
Short: "Declare that a unit depends on another unit completing before it can start",
Long: "Declare that a unit depends on another unit completing before it can start. The unit specified first will not start until the second has signaled that it has completed.",
Handler: func(i *serpent.Invocation) error {
ctx := i.Context()
if len(i.Args) != 2 {
return xerrors.New("exactly two arguments are required: unit and depends-on")
}
dependentUnit := unit.ID(i.Args[0])
dependsOn := unit.ID(i.Args[1])
opts := []agentsocket.Option{}
if *socketPath != "" {
opts = append(opts, agentsocket.WithPath(*socketPath))
}
client, err := agentsocket.NewClient(ctx, opts...)
if err != nil {
return xerrors.Errorf("connect to agent socket: %w", err)
}
defer client.Close()
if err := client.SyncWant(ctx, dependentUnit, dependsOn); err != nil {
return xerrors.Errorf("declare dependency failed: %w", err)
}
cliui.Info(i.Stdout, "Success")
return nil
},
}
return cmd
}
@@ -0,0 +1 @@
Success
@@ -0,0 +1 @@
Success
@@ -0,0 +1 @@
Success
@@ -0,0 +1,2 @@
Waiting for dependencies of unit 'test-unit' to be satisfied...
Success
@@ -0,0 +1,6 @@
Unit: test-unit
Status: completed
Ready: true
Dependencies:
No dependencies found
@@ -0,0 +1,13 @@
{
"unit_name": "test-unit",
"status": "pending",
"is_ready": true,
"dependencies": [
{
"depends_on": "dep-unit",
"required_status": "completed",
"current_status": "completed",
"is_satisfied": true
}
]
}
@@ -0,0 +1,7 @@
Unit: test-unit
Status: pending
Ready: false
Dependencies:
DEPENDS ON REQUIRED STATUS CURRENT STATUS SATISFIED
dep-unit completed not registered false
@@ -0,0 +1,6 @@
Unit: test-unit
Status: started
Ready: true
Dependencies:
No dependencies found
@@ -0,0 +1,8 @@
Unit: test-unit
Status: pending
Ready: false
Dependencies:
DEPENDS ON REQUIRED STATUS CURRENT STATUS SATISFIED
dep-1 completed completed true
dep-2 completed not registered false
@@ -0,0 +1 @@
Success
+6
View File
@@ -67,6 +67,12 @@ OPTIONS:
--script-data-dir string, $CODER_AGENT_SCRIPT_DATA_DIR (default: /tmp)
Specify the location for storing script data.
--socket-path string, $CODER_AGENT_SOCKET_PATH
Specify the path for the agent socket.
--socket-server-enabled bool, $CODER_AGENT_SOCKET_SERVER_ENABLED (default: false)
Enable the agent socket server.
--ssh-max-timeout duration, $CODER_AGENT_SSH_MAX_TIMEOUT (default: 72h)
Specify the max timeout for a SSH connection, it is advisable to set
it to a minimum of 60s, but no more than 72h.
+27
View File
@@ -0,0 +1,27 @@
coder v0.0.0-devel
USAGE:
coder exp sync [flags]
Manage unit dependencies for coordinated startup
Commands for orchestrating unit startup order in workspaces. Units are most
commonly coder scripts. Use these commands to declare dependencies between
units, coordinate their startup sequence, and ensure units start only after
their dependencies are ready. This helps prevent race conditions and startup
failures.
SUBCOMMANDS:
complete Mark a unit as complete
ping Test agent socket connectivity and health
start Wait until all unit dependencies are satisfied
status Show unit status and dependency state
want Declare that a unit depends on another unit completing before it
can start
OPTIONS:
--socket-path string, $CODER_AGENT_SOCKET_PATH
Specify the path for the agent socket.
———
Run `coder --help` for a list of global options.
+12
View File
@@ -0,0 +1,12 @@
coder v0.0.0-devel
USAGE:
coder exp sync complete <unit>
Mark a unit as complete
Mark a unit as complete. Indicating to other units that it has completed its
work. This allows units that depend on it to proceed with their startup.
———
Run `coder --help` for a list of global options.
+13
View File
@@ -0,0 +1,13 @@
coder v0.0.0-devel
USAGE:
coder exp sync ping
Test agent socket connectivity and health
Test connectivity to the local Coder agent socket to verify the agent is
running and responsive. Useful for troubleshooting startup issues or verifying
the agent is accessible before running other sync commands.
———
Run `coder --help` for a list of global options.
+17
View File
@@ -0,0 +1,17 @@
coder v0.0.0-devel
USAGE:
coder exp sync start [flags] <unit>
Wait until all unit dependencies are satisfied
Wait until all dependencies are satisfied, consider the unit to have started,
then allow it to proceed. This command polls until dependencies are ready,
then marks the unit as started.
OPTIONS:
--timeout duration (default: 5m)
Maximum time to wait for dependencies (e.g., 30s, 5m). 5m by default.
———
Run `coder --help` for a list of global options.
+20
View File
@@ -0,0 +1,20 @@
coder v0.0.0-devel
USAGE:
coder exp sync status [flags] <unit>
Show unit status and dependency state
Show the current status of a unit, whether it is ready to start, and lists its
dependencies. Shows which dependencies are satisfied and which are still
pending. Supports multiple output formats.
OPTIONS:
-c, --column [depends on|required status|current status|satisfied] (default: depends on,required status,current status,satisfied)
Columns to display in table output.
-o, --output table|json (default: table)
Output format.
———
Run `coder --help` for a list of global options.
+13
View File
@@ -0,0 +1,13 @@
coder v0.0.0-devel
USAGE:
coder exp sync want <unit> <depends-on>
Declare that a unit depends on another unit completing before it can start
Declare that a unit depends on another unit completing before it can start.
The unit specified first will not start until the second has signaled that it
has completed.
———
Run `coder --help` for a list of global options.