mirror of
https://github.com/coder/coder.git
synced 2026-06-02 20:48:20 +00:00
de9cdca77e
## Summary

Make Coder's chat agent honest about workspaces that use `coder_external_agent`. Three behaviors change so the chat stops pretending it can drive an external workspace through to a usable state on its own.

<img width="859" height="537" alt="image" src="https://github.com/user-attachments/assets/0561442b-95f1-4a2d-853c-7e3776114680" />

## Problem

External agents are not started by Coder. The user has to run `coder agent` on their own host with a token Coder generates. Before this change, the chat agent treated those workspaces like any other:

- `create_workspace` would enqueue a build for an external-agent template and then wait minutes (~22 worst case) for an agent that was never going to come up.
- When mid-turn tool calls dialed an external agent that was not connected, the chat burned the full 30-second dial timeout and returned generic "the workspace may need to be restarted from the Coder dashboard" guidance, which is not the action the user can take.
- Nothing told the chat (or the user, through the chat) that the next action lives outside Coder.

## Fix

Three changes scoped to `coderd/x/chatd/`:

1. **`create_workspace` blocks templates with external agents.** The tool reads `template_versions.has_external_agent` for the template's active version and refuses external-agent templates with a message instructing the chat to pick a different template, or to have the user create and start the workspace themselves and then attach it.
2. **Attaching an existing external workspace stays open.** No selection-time gate on attachment; users can still bind a working external workspace to a chat.
3. **External-agent-aware error handling on connection.** Two complementary changes, both predicated on proven connectivity failures rather than every dial error:
   - **`getWorkspaceConn` preflight and timeout handling.** Before opening a connection, the cache-miss path reads the agent's status from the already-loaded row. If the selected agent is external and clearly offline according to the existing `isAgentUnreachable` helper (`Disconnected` or `Timeout`, never `Connecting`), it returns an external-agent-specific error immediately instead of waiting out the 30-second dial timeout. `Connecting` external agents fall through to the dial so a user who just started the agent on their host can still succeed in the same turn. The preflight only fires when the agent is still the latest selected agent for the workspace, so stale-binding recovery via `dialWithLazyValidation` is unaffected. The post-dial rewrite is limited to the dial timeout sentinel; stale/no-agent bindings and non-timeout dial failures preserve their original errors.
   - **`waitForAgentReady` timeout-branch rewrite.** The 2-minute retry loop used by `create_workspace` and `start_workspace` runs unchanged for all agents. When the loop's outer deadline elapses, the timeout branch substitutes the external-agent message in place of the raw dial error if the agent belongs to an external resource.

This applies the same pattern that the cache-hit path of `getWorkspaceConn` already used (`isAgentUnreachable` returning `errChatAgentDisconnected`), extended to the cache-miss path and to the readiness helper, with the external-agent-aware error rewrite layered only on confirmed offline or timeout paths.

Closes CODAGT-314
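The preflight rule described for the cache-miss path of `getWorkspaceConn` can be sketched as a small decision function. This is an illustration only, not the code in `coderd/x/chatd/`: the `AgentStatus` type, the `unreachable` helper (a stand-in for the rule attributed to `isAgentUnreachable`), the `errExternalAgentOffline` sentinel, and the `preflight` signature are all hypothetical names. The sketch models only the external-agent branch and assumes every other case proceeds to the dial.

```go
package main

import (
	"errors"
	"fmt"
)

// AgentStatus is a hypothetical stand-in for the agent status read from the
// already-loaded row.
type AgentStatus string

const (
	StatusConnecting   AgentStatus = "connecting"
	StatusConnected    AgentStatus = "connected"
	StatusDisconnected AgentStatus = "disconnected"
	StatusTimeout      AgentStatus = "timeout"
)

// errExternalAgentOffline is a hypothetical sentinel for the
// external-agent-specific error.
var errExternalAgentOffline = errors.New("external agent is offline")

// unreachable encodes the rule from the commit message: only Disconnected and
// Timeout count as clearly offline, never Connecting.
func unreachable(s AgentStatus) bool {
	return s == StatusDisconnected || s == StatusTimeout
}

// preflight returns an error without dialing only when the agent is external,
// is still the latest selected agent for the workspace, and is clearly
// offline. A nil result means "go ahead and dial".
func preflight(external, latest bool, status AgentStatus) error {
	if external && latest && unreachable(status) {
		return errExternalAgentOffline
	}
	return nil
}

func main() {
	statuses := []AgentStatus{StatusConnecting, StatusConnected, StatusDisconnected, StatusTimeout}
	for _, s := range statuses {
		fmt.Printf("external, latest, %-12s -> skip dial: %v\n", s, preflight(true, true, s) != nil)
	}
}
```

Keeping `Connecting` out of the fail-fast set is what lets a user who has just started the agent on their host succeed within the same turn, and the `latest` guard leaves stale bindings to the existing recovery path.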
48 lines
1.8 KiB
Go
package chattool

import (
	"context"

	"github.com/google/uuid"

	"github.com/coder/coder/v2/coderd/database"
)

// ExternalAgentResourceType is the Terraform resource type for externally
// managed agents.
const ExternalAgentResourceType = "coder_external_agent"

const createWorkspaceExternalAgentMessage = "create_workspace cannot create workspaces from templates with externally managed agents. " +
	"Use list_templates to choose a different template, or if the user wants " +
	"to use an external workspace, they should create it and start it up fully " +
	"themselves first, then attach it to this chat"

const externalAgentNotConnectedMessage = "workspace uses an externally managed agent that has not connected yet. " +
	"The user needs to start the workspace externally and make sure the " +
	"external agent is connected, then try again"

const externalAgentDisconnectedMessage = "workspace uses an externally managed agent that is currently offline. " +
	"The user needs to reconnect the external agent on its host, then try again"

// ExternalAgentUnavailableMessage explains how to make an externally managed
// agent usable based on its connection history.
func ExternalAgentUnavailableMessage(agent database.WorkspaceAgent) string {
	if agent.FirstConnectedAt.Valid {
		return externalAgentDisconnectedMessage
	}
	return externalAgentNotConnectedMessage
}

// IsExternalWorkspaceAgent reports whether agent belongs to an external
// resource.
func IsExternalWorkspaceAgent(ctx context.Context, db database.Store, agent database.WorkspaceAgent) (bool, error) {
	if db == nil || agent.ResourceID == uuid.Nil {
		return false, nil
	}
	resource, err := db.GetWorkspaceResourceByID(ctx, agent.ResourceID)
	if err != nil {
		return false, err
	}
	return resource.Type == ExternalAgentResourceType, nil
}
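The two helpers above can be exercised without a database. The following is a minimal, self-contained sketch that mirrors their logic with stand-in types; it is not the package's own test code. `resourceID` (standing in for `uuid.UUID`), `workspaceAgent`, `workspaceResource`, `resourceStore`, `memStore`, `agent`, and `describe` are hypothetical names introduced for illustration, and the `docker_container` resource type is just an example of a non-external resource.

```go
package main

import (
	"context"
	"database/sql"
	"errors"
	"fmt"
	"time"
)

// resourceID stands in for uuid.UUID; its zero value plays the role of uuid.Nil.
type resourceID [16]byte

// workspaceAgent and workspaceResource carry only the fields the helpers read.
type workspaceAgent struct {
	ResourceID       resourceID
	FirstConnectedAt sql.NullTime
}

type workspaceResource struct{ Type string }

// resourceStore is a hypothetical narrow view of the store used by the lookup.
type resourceStore interface {
	GetWorkspaceResourceByID(ctx context.Context, id resourceID) (workspaceResource, error)
}

// memStore is an in-memory resourceStore for the demonstration.
type memStore map[resourceID]workspaceResource

func (m memStore) GetWorkspaceResourceByID(_ context.Context, id resourceID) (workspaceResource, error) {
	r, ok := m[id]
	if !ok {
		return workspaceResource{}, errors.New("resource not found")
	}
	return r, nil
}

var (
	extID     = resourceID{1}
	vmID      = resourceID{2}
	demoStore = memStore{
		extID: {Type: "coder_external_agent"},
		vmID:  {Type: "docker_container"},
	}
)

// agent builds a demo agent that either has or has not connected before.
func agent(id resourceID, connectedBefore bool) workspaceAgent {
	a := workspaceAgent{ResourceID: id}
	if connectedBefore {
		a.FirstConnectedAt = sql.NullTime{Time: time.Now(), Valid: true}
	}
	return a
}

// isExternal mirrors IsExternalWorkspaceAgent: a nil store or zero resource ID
// is treated as "not external" rather than as an error.
func isExternal(ctx context.Context, db resourceStore, a workspaceAgent) (bool, error) {
	if db == nil || a.ResourceID == (resourceID{}) {
		return false, nil
	}
	res, err := db.GetWorkspaceResourceByID(ctx, a.ResourceID)
	if err != nil {
		return false, err
	}
	return res.Type == "coder_external_agent", nil
}

// describe combines the lookup with the connection-history check that
// ExternalAgentUnavailableMessage uses to pick its message.
func describe(db resourceStore, a workspaceAgent) string {
	external, err := isExternal(context.Background(), db, a)
	switch {
	case err != nil:
		return "lookup failed"
	case !external:
		return "not external"
	case a.FirstConnectedAt.Valid:
		return "external: offline"
	default:
		return "external: never connected"
	}
}

func main() {
	fmt.Println(describe(demoStore, agent(extID, false)))
	fmt.Println(describe(demoStore, agent(extID, true)))
	fmt.Println(describe(demoStore, agent(vmID, true)))
}
```

The split on `FirstConnectedAt.Valid` is what lets the caller give different guidance: an agent that has never connected needs to be started, while one that connected before needs to be reconnected on its host.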