PRD: Coder Agents side questions with `/btw`

Problem Statement

Coder Agents users sometimes need to ask a quick contextual question about the current work without changing the main agent conversation, interrupting a long-running task, approving or modifying a plan, or adding noise to future model context. This is especially important while reviewing plans, answering agent questions, or monitoring a running agent. Today, the only way to ask is to send a normal chat message, which persists in the transcript, affects future prompt context, may queue or interrupt work, and can derail the agent.

Users want a Claude Code-style /btw command that answers a one-off side question from current chat context while keeping the main agent's work and conversation history untouched.

Solution

Add a Coder Agents side question feature. /btw is the user-facing slash command alias. A side question is a one-shot, user-facing, no-tools answer generated by chatd from the selected chat's effective persisted context plus a narrow, capped transient context for currently visible streaming assistant text. It never becomes a normal chat message, never affects future model context, and never mutates chat lifecycle state.

The backend exposes a dedicated side-question API. Web and CLI clients detect /btw at the beginning of the composer, call the side-question API, and render the result in a dismissible overlay. The answer disappears after dismissal or refresh. Side questions are metered through metadata-only auxiliary run records so they count toward usage limits and cost analytics without storing question text, answer text, or full prompts.

User Stories

As an agent user, I want to ask /btw questions about the current work, so that I can get quick clarification without altering the main conversation.
As an agent user reviewing a plan, I want to ask a side question about the plan, so that I can understand it before approving, rejecting, or responding.
As an agent user, I want side questions to be user-facing, so that the answer speaks directly to me rather than advising the agent.
As an agent user, I want side questions to avoid entering chat history, so that future turns are not polluted by my temporary question.
As an agent user, I want side-question answers to avoid entering chat history, so that the main transcript remains focused on durable work.
As an agent user, I want a side question to run while the agent is running, so that I do not need to stop or wait for long-running work.
As an agent user, I want a side question to run while the agent is pending, so that I can ask about queued or in-progress work without changing chat state.
As an agent user, I want a side question to run while the chat requires action, so that I can gather context before answering the agent.
As an agent user, I want a side question to be available on failed chats, so that I can ask what the visible context suggests happened.
As an agent user, I want side questions disabled on archived chats, so that archived chat behavior remains consistent with normal message sending.
As an agent user, I want side questions disabled before a chat exists, so that /btw does not create chats as a side effect.
As an agent user, I want side questions available on root chats, so that I can ask about the main agent's work.
As an agent user, I want side questions available on child chats, so that I can ask about a subagent's selected context.
As an agent user, I want side questions scoped to the selected chat, so that answers do not unexpectedly include parent or sibling chat context.
As an agent user, I want /btw to use the same effective model configuration as the chat, so that quality and cost are predictable.
As an agent user, I want /btw to avoid using tools, so that it cannot modify files, execute commands, call MCP tools, or affect workspace state.
As an agent user, I want /btw to avoid provider-native tools, so that a side question cannot browse, use computer-use, or perform external actions.
As an agent user, I want side questions to answer only from available context, so that I can trust they are not performing hidden investigation.
As an agent user, I want side questions to say when they do not know, so that they do not speculate beyond current context.
As an agent user, I want side questions to avoid revealing hidden instructions, so that internal system and developer instructions remain protected.
As an agent user, I want a side-question overlay to show loading, success, and error states, so that the interaction feels separate from the transcript.
As an agent user, I want to dismiss the side-question overlay, so that the temporary answer leaves my workspace when I am done with it.
As an agent user, I want dismissing a loading side question to cancel the request when possible, so that I can stop unnecessary model work.
As an agent user, I want side-question errors to appear in the overlay, so that failures do not add chat messages or disappear as unrelated toasts.
As an agent user, I want side questions to be one-shot with no overlay follow-up thread, so that the feature remains lightweight.
As an agent user, I want /btw questions excluded from normal prompt history, so that temporary side questions are not treated like durable chat prompts.
As an agent user, I want /btw slash detection to happen only at the start of the composer, so that normal messages mentioning /btw are not misrouted.
As an agent user, I want a literal escape for messages starting with /btw, so that I can discuss the command itself in normal chat.
As an agent user, I want currently visible streaming assistant text to be available to side questions, so that I can ask about text that has not been persisted yet.
As an agent user, I do not want queued messages included automatically in side-question context, so that answers reflect current work rather than future queued input.
As an agent user, I do not want unsent draft text included automatically, so that private or half-written draft content is not silently sent.
As an admin, I want side-question inference to count toward spend limits, so that users cannot bypass cost controls with /btw.
As an admin, I want side-question inference represented in cost analytics, so that model spend remains explainable.
As an admin, I want side-question records to store metadata only, so that ephemeral content is not retained by default.
As an admin, I want a category for side-question usage, so that normal assistant turns and side questions can be analyzed separately.
As an operator, I want a kill switch for side questions, so that the feature can be disabled if cost, prompt, or provider issues appear.
As an operator, I want one active side question per chat and user, so that side-channel inference cannot be spammed from multiple tabs or clients.
As an operator, I want stale side-question runs to unblock automatically, so that a crashed server does not permanently disable side questions for a chat.
As a security reviewer, I want side questions restricted to chat owners, so that readers or admins do not trigger inference using another user's context or credentials.
As a security reviewer, I want side questions to avoid storing prompt and answer content in debug logs by default, so that the ephemeral promise remains true.
As a support engineer, I want side-question responses to include a run identifier, model, and usage metadata, so that support can correlate issues without content retention.
As a CLI user, I want /btw support in the terminal TUI, so that the feature works where Claude Code users expect it.
As a web user, I want /btw support in the Agents page composer, so that the feature works in the browser experience.
As a reviewer, I want dogfooding evidence for web and CLI, so that I can verify the feature behaves as claimed.

Implementation Decisions

Use side question as the canonical domain term. /btw is the user-facing slash command alias only.
Add a dedicated side-question API under the experimental chat API. Do not overload normal message creation.
Add SDK request and response types for side questions. The request includes the question and optional capped transient context. The response includes answer, run identifier, model information, and usage information. The response does not need to expose per-run cost to end users unless existing product patterns require it.
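The request and response shapes described above could look roughly like the following. This is a language-neutral sketch written in Python, not the SDK's actual types: every class and field name here (SideQuestionRequest, transient_context, SideQuestionUsage, and so on) is an assumption made for illustration, and the token fields stand in for whatever usage information the existing SDK already reports.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SideQuestionRequest:
    """Hypothetical request shape: the question plus optional transient context."""
    question: str
    # Currently visible streaming assistant text, if any. The backend caps it.
    transient_context: Optional[str] = None

    def validate(self) -> None:
        # A side question with no text is a validation error.
        if not self.question.strip():
            raise ValueError("question must not be empty")


@dataclass(frozen=True)
class SideQuestionUsage:
    """Hypothetical usage metadata; field names are illustrative."""
    input_tokens: int
    output_tokens: int


@dataclass(frozen=True)
class SideQuestionResponse:
    """Hypothetical response shape: answer plus metadata for support correlation."""
    answer: str
    run_id: str
    model: str
    usage: SideQuestionUsage
```

Per-run cost is deliberately left out of the response in this sketch, matching the decision that it need not be exposed to end users.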
Implement a chatd side-question runtime that resolves the selected chat, enforces owner-only access, enforces archived-chat rejection, resolves the same effective model configuration as the chat, builds the side-question prompt, runs a single no-tools model step, records auxiliary run metadata, and returns the answer.
Build side-question prompt context from persisted and effective context with strict parity where feasible. Include persisted model-visible history, existing compacted summaries, resolved chat files through the same assumptions as normal chat prompt building, system and user prompt behavior, plan-mode instructions, persisted context files, and persisted skills.
Do not refresh, discover, persist, or mutate context as part of a side-question request. No workspace instruction refresh, no workspace MCP discovery, no plan file writes, and no compaction side effects.
Include narrow transient context in v1 for currently visible streaming assistant text only. The backend caps this text and labels it clearly in the prompt. Do not include arbitrary draft content, selected hidden state, queued messages, or server in-memory stream buffers.
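A minimal sketch of the cap-and-label step, assuming a character-based cap and a fixed label string. The cap value, the label text, the function name, and the choice to keep the tail of the text rather than the head are all assumptions; the PRD only requires that the context be capped and clearly labeled.

```python
from typing import Optional

# Hypothetical values: the PRD does not specify a cap size or label wording.
MAX_TRANSIENT_CHARS = 2000
TRANSIENT_LABEL = "[Unpersisted streaming assistant text currently visible to the user]"


def build_transient_block(
    visible_text: Optional[str], cap: int = MAX_TRANSIENT_CHARS
) -> Optional[str]:
    """Return a labeled, capped block, or None when there is nothing to include."""
    if cap <= 0:
        raise ValueError("cap must be positive")
    if visible_text is None or not visible_text.strip():
        return None
    # Assumption: keep the tail, since the most recently streamed text is
    # most likely what the user is asking about.
    capped = visible_text[-cap:]
    return f"{TRANSIENT_LABEL}\n{capped}"
```

Applying the cap on the backend, regardless of what the client sends, keeps a misbehaving client from inflating prompt size.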
Run no tools in side questions. This includes built-in tools, MCP tools, dynamic tools, provider-native tools, workspace tools, subagent tools, and web or computer-use provider tools.
Make the side-question prompt plan-aware. In plan mode, it can explain the current plan, risks, or meaning without approving, rejecting, editing, or producing hidden plan changes.
Instruct the model not to reveal hidden or internal instructions. Internal/control context can guide behavior, but the answer must not quote or summarize hidden prompts.
If the answer is not available from side-question context, instruct the model to say so briefly and not speculate.
Use synchronous API behavior in v1. Streaming side-question responses are a future enhancement.
Reset provider-side chain state and disable provider-side storage for nested side-question calls. Side questions must not become part of provider-side conversation state that can affect future normal turns.
Preserve a cache-friendly prompt shape but do not add explicit provider-specific prompt-cache behavior in v1.
Add a generic chat_auxiliary_runs storage concept for non-message chat-adjacent inference. Use kind = side_question for this feature.
Store auxiliary run metadata only. Do not store question text, answer text, full prompts, or rendered context by default.
Auxiliary run statuses are running, succeeded, failed, and canceled.
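The four statuses can be sketched as a small state machine. The status names come from this PRD; treating running as the only non-terminal status, so that a finished run never changes status again, is an assumption of this sketch.

```python
from enum import Enum


class RunStatus(str, Enum):
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"
    CANCELED = "canceled"


# Assumption: every status other than running is terminal.
TERMINAL_STATUSES = {RunStatus.SUCCEEDED, RunStatus.FAILED, RunStatus.CANCELED}


def can_transition(current: RunStatus, target: RunStatus) -> bool:
    """Only a running run may move, and only to a terminal status."""
    return current is RunStatus.RUNNING and target in TERMINAL_STATUSES
```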
Use database-backed concurrency. Enforce one active side-question run per chat and owner. Stale running rows expire after 5 minutes.
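One way to get database-backed concurrency is a partial unique index over running rows, sketched below with SQLite so that it runs standalone. The table name chat_auxiliary_runs, the side_question kind, and the 5-minute expiry come from this PRD; the column names, the index name, the helper functions, and the choice to mark stale rows as failed are assumptions.

```python
import sqlite3
import uuid

STALE_AFTER_SECONDS = 5 * 60  # from the PRD: stale running rows expire after 5 minutes

SCHEMA = """
CREATE TABLE chat_auxiliary_runs (
    id         TEXT PRIMARY KEY,
    chat_id    TEXT NOT NULL,
    owner_id   TEXT NOT NULL,
    kind       TEXT NOT NULL,
    status     TEXT NOT NULL,
    started_at REAL NOT NULL
);
-- At most one running row per chat, owner, and kind.
CREATE UNIQUE INDEX one_active_run
    ON chat_auxiliary_runs (chat_id, owner_id, kind)
    WHERE status = 'running';
"""


class ActiveRunConflict(Exception):
    """Raised when a run is already active for this chat and owner."""


def start_run(db, chat_id, owner_id, now, kind="side_question"):
    """Claim the single active slot, expiring a stale run first if needed."""
    with db:  # one transaction: expire stale rows, then try to insert
        # Assumption: a stale running row is recorded as failed.
        db.execute(
            "UPDATE chat_auxiliary_runs SET status = 'failed' "
            "WHERE chat_id = ? AND owner_id = ? AND kind = ? "
            "AND status = 'running' AND started_at < ?",
            (chat_id, owner_id, kind, now - STALE_AFTER_SECONDS),
        )
        run_id = str(uuid.uuid4())
        try:
            db.execute(
                "INSERT INTO chat_auxiliary_runs VALUES (?, ?, ?, ?, 'running', ?)",
                (run_id, chat_id, owner_id, kind, now),
            )
        except sqlite3.IntegrityError as exc:
            raise ActiveRunConflict(chat_id) from exc
    return run_id


def finish_run(db, run_id, status):
    """Move a running row to a terminal status; finished rows are left alone."""
    with db:
        db.execute(
            "UPDATE chat_auxiliary_runs SET status = ? WHERE id = ? AND status = 'running'",
            (status, run_id),
        )
```

Because the uniqueness rule lives in the database rather than in server memory, it holds across multiple tabs, clients, and server replicas, and a crashed server only blocks the chat until the stale row ages out. Note that the table holds metadata only: no question, answer, or prompt columns.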
Side-question inference counts toward usage limits and cost analytics. Analytics should preserve the side_question kind for future breakdowns.
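The metering rule can be illustrated with a small aggregation sketch: side-question spend is added to the same per-user total that limits are checked against, while the kind is kept on each record so it can be broken out later. The record shape, the assistant_turn kind name, the micro-unit cost field, and the function names are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRecord:
    owner_id: str
    kind: str         # "side_question" per the PRD; "assistant_turn" is a hypothetical name
    cost_micros: int  # hypothetical unit: millionths of a currency unit


def total_spend(records, owner_id):
    """Spend across all kinds; this is what a usage limit is compared against."""
    return sum(r.cost_micros for r in records if r.owner_id == owner_id)


def spend_by_kind(records, owner_id):
    """Per-kind breakdown for analytics."""
    totals = defaultdict(int)
    for r in records:
        if r.owner_id == owner_id:
            totals[r.kind] += r.cost_micros
    return dict(totals)


def is_over_limit(records, owner_id, limit_micros):
    return total_spend(records, owner_id) >= limit_micros
```

For example, a user with 700 micros of normal turns and 300 micros of side questions is at a 1000-micro limit, even though neither category alone reaches it.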
Add a metadata-only audit entry if the audit model has a clean action/resource fit. Do not force side questions into a misleading chat update action.
Do not include full prompt or answer content in debug logging for v1. Any full content capture requires a future explicit opt-in.
Side questions must not mutate chat messages, queued messages, chat status, chat title, chat recency, read cursor, notifications, unread state, diff state, workspace state, files, or provider chain state.
Side questions are allowed during all non-archived chat statuses, including running, pending, waiting, requires-action, and error.
Side questions are disabled for draft or new chats that do not yet have a server chat identifier.
Side questions are available on both root and child chats, scoped to the selected chat's context.
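The availability rules in this section, together with the owner-only and kill-switch decisions, reduce to a short guard. The sketch below assumes a simplified chat view; the ChatView type, its fields, the function name, and the reason strings are hypothetical. The point it illustrates is that chat status is never consulted: only the feature flag, the existence of a server chat identifier, ownership, and the archived flag matter.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChatView:
    """Hypothetical, simplified view of a chat for the availability check."""
    chat_id: Optional[str]  # None for draft or new chats without a server identifier
    owner_id: str
    archived: bool
    status: str             # running, pending, waiting, requires-action, error, ...
    is_child: bool = False  # root and child chats are treated the same


def side_question_unavailable_reason(
    chat: ChatView, requester_id: str, feature_enabled: bool = True
) -> Optional[str]:
    """Return None when a side question is allowed, else a short reason."""
    if not feature_enabled:
        return "side questions are disabled"
    if chat.chat_id is None:
        return "chat has not been created yet"
    if requester_id != chat.owner_id:
        return "only the chat owner may ask side questions"
    if chat.archived:
        return "chat is archived"
    # chat.status and chat.is_child are intentionally not checked.
    return None
```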
Slash detection happens client-side only. Detection triggers only when the trimmed composer text starts with /btw and is followed by a non-empty question. Messages that mention /btw elsewhere are normal chat messages.
Provide a literal escape for messages that should begin with /btw but be sent as normal chat messages.
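The detection and escape rules can be sketched as a single routing function. The PRD requires a literal escape but does not define its syntax, so the leading backslash used here is purely a placeholder. Requiring whitespace after /btw, so that input such as /btwfoo is not misrouted, and sending a bare /btw as a normal message are also assumptions of this sketch, as are the function and type names.

```python
from typing import NamedTuple

COMMAND = "/btw"
# Hypothetical escape syntax: the PRD does not specify one.
ESCAPE_PREFIX = "\\"


class ComposerRoute(NamedTuple):
    kind: str  # "side_question" or "message"
    text: str


def route_composer_input(raw: str) -> ComposerRoute:
    """Decide whether composer text is a side question or a normal message."""
    trimmed = raw.strip()
    if trimmed.startswith(ESCAPE_PREFIX + COMMAND):
        # Drop only the escape character and send the rest as a normal message.
        return ComposerRoute("message", trimmed[len(ESCAPE_PREFIX):])
    if trimmed.startswith(COMMAND):
        rest = trimmed[len(COMMAND):]
        # Assumption: the command must be followed by whitespace and a
        # non-empty question; otherwise fall through to a normal message.
        if rest[:1].isspace() and rest.strip():
            return ComposerRoute("side_question", rest.strip())
    return ComposerRoute("message", trimmed)
```

With these rules, "/btw what is step 2?" routes to the side-question API, while "use /btw later" and the escaped form both go through the normal send path.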
Web and CLI clients render side-question results in dismissible overlays. In v1 the overlay is the only place an answer lives; nothing else in the UI retains it. Answers disappear on dismiss or refresh.
Dismissing the overlay while loading aborts the request when possible. Dismissing after completion only hides the answer.
Questions submitted as side questions are not inserted into normal prompt history.
Add a dedicated rollout or kill-switch configuration so operators can disable the feature.
Major modules to build or modify include the SDK chat API types, chat HTTP handlers, chatd side-question runtime, reusable effective prompt snapshot builder, auxiliary run storage and queries, usage-limit and cost-analytics aggregation, web composer and overlay UI, CLI slash command routing and overlay UI, and rollout configuration.
Deep modules worth extracting include a side-question runtime with a small options/result interface, an auxiliary run store that owns concurrency and metadata transitions, and an effective prompt snapshot builder that can be tested independently from the full chat loop.

Testing Decisions

Tests should focus on externally observable behavior and invariants rather than implementation details. A good test proves side questions do not create chat messages, do not affect future prompt context, do not mutate chat state, are metered, enforce permissions, and render correctly in clients.
Backend API tests should cover owner-only access, non-owner rejection, archived-chat rejection, draft/no-chat unavailability at the client boundary, validation errors, context caps, successful answer response shape, usage-limit behavior, and provider failure behavior.
Chatd tests should cover no chat message insertion, no status or recency mutation, no queue mutation, no tool exposure, provider chain reset, no provider-side storage, prompt construction boundaries, no queued-message inclusion, no automatic draft inclusion, transient-context inclusion, and clear context-overflow failure.
Auxiliary run tests should cover running/succeeded/failed/canceled transitions, metadata-only persistence, no content persistence, one-active-run conflict, stale running timeout, cancellation update, and cost/usage writes.
Cost and usage-limit tests should cover side-question spend included in limits and analytics, with the side-question kind preserved for breakdown.
Security tests should cover hidden-instruction non-disclosure at the prompt contract level where feasible, owner-only execution, and absence of full content in v1 debug storage.
SDK tests should cover request/response serialization and error handling for the new side-question API.
Web tests should cover slash detection only at composer start, literal escape behavior, draft chat rejection, overlay loading/success/error states, cancellation on dismiss, no transcript mutation, no prompt-history insertion, and capped transient context from visible streaming text.
CLI model/render tests should cover slash detection, draft rejection, overlay loading/success/error states, cancellation on dismiss, no normal message send, no transcript mutation, and no prompt-history insertion.
Prior art exists in the codebase for chat message send API tests, chat stream tests, tool result submission tests, prompt conversion tests, chatadvisor nested no-tools runtime tests, web chat input and chat store tests, and CLI agents TUI render/model tests.
Dogfooding should include backend API verification that messages remain unchanged, web screenshots and video showing overlay loading/success/error/dismiss behavior, and a terminal recording showing CLI /btw behavior while a chat is active.

Out of Scope

Follow-up turns inside the side-question overlay.
Side-question threads or server-side side-question answer recall.
Persisting question text, answer text, full prompt text, or rendered context by default.
Tool use of any kind, including read-only tools, MCP tools, dynamic tools, provider-native tools, web search, and computer use.
Provider-specific explicit prompt-cache controls in v1.
Streaming side-question responses in v1.
Model override controls for side questions in v1.
Side questions on draft or new chats without an existing chat identifier.
Including queued messages automatically in side-question context in v1.
Including unsent draft text automatically in side-question context.
Server-side notifications, unread state changes, read cursor changes, or chat recency changes.
Side-effect compaction, context refresh, workspace discovery, or workspace file access.
Full debug capture of side-question content without a future explicit opt-in.
Publishing the PRD to an issue tracker as part of this request.

Further Notes

The current chat architecture already has several useful building blocks: chat message visibility separates UI-visible and model-visible messages, chatd owns the normal send/message/run loop, and the existing advisor runtime demonstrates nested no-tools single-step model calls with provider chain reset. The side-question feature should reuse patterns from those areas while keeping a separate domain contract: a side question is not a chat message, not a subagent, not an advisor, and not a hidden transcript.

Implementation should be especially careful around prompt parity. The side-question answer needs enough of the selected chat's effective persisted context to feel reliable, but it must not perform refresh or discovery work that changes chat state or workspace state. If those two goals conflict, prefer non-mutation and fail clearly rather than silently changing context.

Manual dogfooding is part of the done bar. For the web app, run the development server, use the Agents page, trigger /btw while the chat is active, and capture screenshots and video. For the CLI, run the agents TUI, trigger /btw, and capture a terminal recording. Verify that the transcript, chat messages API, chat status, recency, and future normal turns do not include side-question content.
