mirror of https://github.com/coder/coder.git synced 2026-06-02 20:48:20 +00:00

Michael Suchacz cb37047dce feat: dedicated /prompts endpoint for chat history cycle (#25083)

Follow-up to #25004. The merged change cycles only through messages
already loaded in the in-memory chat store (page size 50). Long chats
and chats whose oldest turns have rolled out of the page lose access to
their earlier prompts in the composer's up/down arrow cycle. This PR
adds a dedicated server endpoint that returns the full prompt history,
newest first, and rewires the composer to use it.

## What changed

### Endpoint

`GET /api/experimental/chats/{chat}/prompts?limit=N`

```go
type ChatPrompt struct { ID int64; Text string }
type ChatPromptsResponse struct { Prompts []ChatPrompt }
```

- `limit`: `0..2000`. `0` (the default) is treated as the server-side
default of 500; out-of-range values return `400`. Negative values are
rejected by the SDK's `PositiveInt32` parser before reaching the
handler.
- Auth: parent-chat read in `dbauthz`, mirroring
`GetChatMessagesByChatID`.
- The SQL filters `role='user'`, `deleted=false`, `visibility IN
('user','both')`, guards the lateral with `jsonb_typeof(content) =
'array'` so legacy V0 scalar-string rows are silently skipped, then
unrolls `content` JSONB with `WITH ORDINALITY` and concatenates only
`type='text'` parts in original order via `string_agg(... ORDER BY
ordinality)`. Messages whose joined text is whitespace-only are dropped
via `HAVING ... ~ '\S'` so cycling never lands on a blank entry.
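
The `limit` rules above can be sketched in Go. This is a minimal illustration, not the handler's actual code: `resolvePromptLimit` and the constant names are hypothetical, and the sketch folds the default into Go for readability even though the PR encodes it in SQL.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	defaultPromptLimit = 500  // server-side default stated in the PR
	maxPromptLimit     = 2000 // upper bound stated in the PR
)

var errLimitOutOfRange = errors.New("limit must be between 0 and 2000")

// resolvePromptLimit is a hypothetical helper. Negative values are assumed
// to have been rejected earlier by the query-parameter parser, so only the
// upper bound and the zero default are handled here.
func resolvePromptLimit(limit int32) (int32, error) {
	if limit > maxPromptLimit {
		return 0, errLimitOutOfRange // a handler would map this to 400
	}
	if limit == 0 {
		return defaultPromptLimit, nil
	}
	return limit, nil
}

func main() {
	for _, l := range []int32{0, 10, 2000, 5000} {
		got, err := resolvePromptLimit(l)
		fmt.Println(l, got, err)
	}
}
```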

### Partial index (migration `000494`)

```sql
CREATE INDEX idx_chat_messages_user_prompts
ON chat_messages (chat_id, id DESC)
WHERE deleted = false
  AND role = 'user'
  AND visibility IN ('user', 'both');
```

The partial WHERE matches the query's filter exactly and the key order
matches `ORDER BY id DESC`, so the planner gets both the filter and the
ordering from the index without a sort step.

`EXPLAIN ANALYZE` on a synthetic 51-chat × 5,000-message dataset (≈260k
rows, 10k user prompts in the target chat, `random_page_cost=1.1`):

| | Plan | Buffers hit | Time |
|---|---|---|---|
| Without index | `Index Scan Backward using chat_messages_pkey`, **250,848 rows removed by filter** | 6,683 | 32.4 ms |
| With index | `Index Scan using idx_chat_messages_user_prompts`, no filter | 38 | 1.3 ms |

≈25× faster, 175× fewer buffer hits.
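
The two ratios follow directly from the table. A quick check in Go, using only the figures reported above (`speedup` is a hypothetical helper name):

```go
package main

import "fmt"

// speedup returns how many times larger before is than after.
func speedup(before, after float64) float64 {
	return before / after
}

func main() {
	fmt.Printf("time:    %.1fx\n", speedup(32.4, 1.3)) // 24.9x
	fmt.Printf("buffers: %.1fx\n", speedup(6683, 38))  // 175.9x
}
```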

### Frontend

- `chatPromptsKey` / `chatPromptsQuery` factories in
`site/src/api/queries/chats.ts` (`staleTime: 30s`, `enabled: chatId !==
""`, asks the server for 500 prompts).
- `ChatPageContent.tsx` replaces the in-memory derivation with
`useQuery(chatPromptsQuery(chatId ?? ""))`. The composer's existing
`cycleHistorySnapshotRef` anchors the in-flight cycle so a refetch
arriving mid-cycle cannot shift the indexed prompt out from under the
user.
- `getEditableUserMessagePayload` now concatenates user-message text
parts verbatim, mirroring the server's `string_agg(part->>'text', ''
ORDER BY ordinality)`, instead of routing through the streaming-oriented
`parseMessageContent` / `appendText` pipeline (which drops
whitespace-only chunks — correct for assistant streams, wrong for a
user's persisted message). This keeps the cycle and the edit path in
agreement on the same message. File blocks are still pulled separately
via
`parseMessageContent(...).blocks.filter(isEditableUserMessageFileBlock)`.
- Cache invalidation in `createChatMessage.onSuccess`,
`editChatMessage.onSettled`, and `useChatStore.upsertCacheMessages`
(only when an upserted message has `role === "user"`).
- Page-level stories pre-seed `chatPromptsKey(CHAT_ID)` from the same
`messagesData` to keep them offline.
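
The snapshot anchoring described above can be modelled in a few lines. The composer itself is TypeScript; the sketch below uses Go only to stay consistent with the other examples here, and `cycler` with its methods is a hypothetical model rather than the real `cycleHistorySnapshotRef` logic.

```go
package main

import "fmt"

// cycler is a hypothetical model of the composer's up/down cycle.
type cycler struct {
	snapshot []string // newest first, frozen when the cycle starts
	index    int      // -1 means the empty draft
}

func newCycler() *cycler { return &cycler{index: -1} }

// up moves to an older prompt. latest is the current query result; it is
// copied only when the cycle starts, so later refetches are ignored.
func (c *cycler) up(latest []string) string {
	if c.index == -1 {
		c.snapshot = append([]string(nil), latest...)
	}
	if c.index < len(c.snapshot)-1 {
		c.index++
	}
	return c.current()
}

// down moves to a newer prompt and ends at the empty draft.
func (c *cycler) down() string {
	if c.index >= 0 {
		c.index--
	}
	return c.current()
}

func (c *cycler) current() string {
	if c.index < 0 {
		return ""
	}
	return c.snapshot[c.index]
}

func main() {
	c := newCycler()
	fmt.Println(c.up([]string{"third", "second", "first"})) // third
	// A refetch lands mid-cycle with a new prompt at the front.
	fmt.Println(c.up([]string{"fourth", "third", "second", "first"})) // second
	fmt.Println(c.down())                                             // third
	fmt.Println(c.down() == "")                                       // true
}
```

Because the second `up` call reuses the frozen snapshot, the refetched `"fourth"` entry does not shift the position the user is on.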

## Tests

- New `TestGetChatUserPrompts` in `coderd/exp_chats_test.go` with five
subtests:
  - `NewestFirstFiltering`: multi-part concatenation, non-text parts
    skipped, whitespace-only filtered, soft-deleted excluded, `model`-only
    visibility excluded, assistant-role excluded by `cm.role = 'user'`,
    legacy V0 scalar row silently excluded by the `jsonb_typeof` guard,
    ordering newest first.
  - `LimitClampsResults`: explicit `limit=2` returns the two newest
    prompts.
  - `InvalidLimitRejected`: `limit=5000` is `400 Bad Request`.
  - `NotFoundForOtherUsers`: a separate user in the same org gets `404`,
    not the prompts.
  - `EmptyResultIsJSONArray`: zero-message chat and assistant-only chat
    both return `Prompts: []` (non-nil, empty).
- New unit test in `messageParsing.test.ts` asserting that
`getEditableUserMessagePayload(["hello", " ", "world"])` returns `"hello
world"`, locking in the agreement with the SQL `string_agg`.
- `dbauthz_test.go` adds the
`MethodTestSuite.TestChats/GetChatUserPromptsByChatID` entry, asserting
parent-chat `policy.ActionRead`.
- `pnpm test src/pages/AgentsPage` — 1159 passed, 2 skipped.
- `make gen` produces no diff.
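
The `EmptyResultIsJSONArray` subtest matters because Go's `encoding/json` marshals a nil slice as `null` and an empty slice as `[]`. A minimal sketch of the difference, using the structs as printed in this description (the real types may carry JSON tags, so the key name here is an assumption; `mustJSON` is a hypothetical helper):

```go
package main

import (
	"encoding/json"
	"fmt"
)

type ChatPrompt struct {
	ID   int64
	Text string
}

type ChatPromptsResponse struct {
	Prompts []ChatPrompt
}

// mustJSON marshals v and panics on error; fine for a demo.
func mustJSON(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// Nil slice: clients would see null instead of an array.
	fmt.Println(mustJSON(ChatPromptsResponse{})) // {"Prompts":null}
	// Explicit empty slice: clients can always iterate the result.
	fmt.Println(mustJSON(ChatPromptsResponse{Prompts: []ChatPrompt{}})) // {"Prompts":[]}
}
```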

## Manual verification

Seeded a dev chat with Claude Sonnet 4.6 via the aibridge Anthropic
provider and posted 20 user prompts end-to-end. Verified that the
`/prompts` endpoint returns 20 rows newest-first, that `limit=10` clamps
correctly, that `limit=0` uses the server default of 500, and that the
up/down keyboard cycle in the composer walks the same sequence (and
reverses correctly back to the empty draft).

## Out of scope

- Cross-chat history.
- Per-user opt-out for the cycle.
- File-reference / attachment cycling — the cycle continues to reproduce
plain text only, by design.

<details>
<summary>Implementation plan</summary>

# CODAGT-319 Follow-up — Dedicated `/prompts` endpoint

## Context

The merged feature ([#25004](https://github.com/coder/coder/pull/25004)
/ [d32842f](https://github.com/coder/coder/commit/d32842f)) cycles only
through messages already loaded in the in-memory chat store, which is
capped at the first 50 messages of the current page. Long chats and
chats whose oldest turns have rolled out of the page can no longer
recall their full prompt history. This follow-up exposes a dedicated
server endpoint that returns the user-authored prompts in a chat, newest
first, and rewires the composer to use it.

## Design

### Endpoint

`GET /api/experimental/chats/{chat}/prompts?limit=N`

Returns:

```go
type ChatPrompt struct {
    ID   int64
    Text string
}
type ChatPromptsResponse struct {
    Prompts []ChatPrompt
}
```

- `limit`: `0..2000`. `0` (the default) → server-side default of 500.
The wire-level default is encoded in SQL as `COALESCE(NULLIF($limit, 0),
500)`. Negatives are rejected upstream by `PositiveInt32`; the handler
only caps the upper bound.
- Auth: parent-chat read in `dbauthz`, mirroring
`GetChatMessagesByChatID`.
- Listed under the experimental router so we can iterate without API
guarantees.

### SQL

The query lives in `coderd/database/queries/chats.sql` as
`GetChatUserPromptsByChatID`:

- Filters `role='user'`, `deleted=false`, `visibility IN
('user','both')` to mirror the composer's "what the user actually typed
and can re-send" contract.
- Guards the lateral with `jsonb_typeof(content) = 'array'` so legacy V0
rows whose content is a scalar JSON string (predates migration `000434`)
are silently excluded instead of raising `"cannot extract elements from
a scalar"`.
- Unrolls `content` JSONB with `jsonb_array_elements WITH ORDINALITY`
and concatenates only `type='text'` parts, preserving original order via
`string_agg(... ORDER BY ordinality)`.
- Casts the result to `text` so sqlc emits a `string` field instead of
`[]byte`.
- Drops whitespace-only prompts via `HAVING string_agg(...) ~ '\S'` so
cycling never lands on a blank entry.
- Orders by `cm.id DESC` (`id` is a sequence, so this is "newest first"
without relying on `created_at`).
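
The per-message text rule can be restated in Go to make the intended behavior concrete. This is a sketch under assumptions: `part` is a simplified stand-in for one element of the `content` array, `promptText` is a hypothetical name, and `strings.TrimSpace` only approximates the SQL `~ '\S'` check.

```go
package main

import (
	"fmt"
	"strings"
)

// part is a hypothetical, simplified stand-in for one element of the
// content JSONB array; the real schema has more fields.
type part struct {
	Type string
	Text string
}

// promptText joins text parts verbatim in their original order. It reports
// false when the joined text is empty or whitespace-only, approximating
// the HAVING filter in the query.
func promptText(parts []part) (string, bool) {
	var b strings.Builder
	for _, p := range parts {
		if p.Type == "text" {
			b.WriteString(p.Text)
		}
	}
	s := b.String()
	if strings.TrimSpace(s) == "" {
		return "", false
	}
	return s, true
}

func main() {
	text, ok := promptText([]part{
		{"text", "hello"}, {"file", "ignored"}, {"text", " "}, {"text", "world"},
	})
	fmt.Printf("%q %v\n", text, ok) // "hello world" true

	_, ok = promptText([]part{{"text", " \n"}})
	fmt.Println(ok) // false
}
```

Note that the whitespace-only part between `"hello"` and `"world"` is kept, since only the joined result is tested for non-whitespace content.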

### Index

New partial index added in migration `000494`:

```sql
CREATE INDEX idx_chat_messages_user_prompts
ON chat_messages (chat_id, id DESC)
WHERE deleted = false
  AND role = 'user'
  AND visibility IN ('user', 'both');
```

The partial WHERE clause matches the query's filter exactly, so the
planner can use the index for both filtering and ordering without a sort
step.

### Frontend

- `chatPromptsKey(chatId)` and `chatPromptsQuery(chatId)` factories in
`site/src/api/queries/chats.ts`. `staleTime: 30s`, `enabled: chatId !==
""`. Asks the server for 500 prompts (well below the 2000 max, plenty
for the cycle).
- `ChatPageContent.tsx` replaces the in-memory derivation with
`useQuery(chatPromptsQuery(chatId ?? ""))`. The composer's
`cycleHistorySnapshotRef` already takes a stable snapshot at cycle
entry, so a refetch arriving mid-cycle cannot shift the indexed prompt
out from under the user.
- `getEditableUserMessagePayload` extracts the edit-path text from raw
user-message parts (filter `type === "text"`, join verbatim) instead of
going through `parseMessageContent` / `appendText`, which is built for
assistant streams and intentionally drops whitespace-only chunks.
Without this, cycling and clicking Edit on the same message could
produce different draft text for messages with whitespace-only
interleaved text parts.
- Cache invalidation: `createChatMessage.onSuccess`,
`editChatMessage.onSettled`, and `useChatStore.upsertCacheMessages`
(when at least one upserted message has `role === "user"`) all
invalidate `chatPromptsKey(chatId)`.

### Tests

- `TestGetChatUserPrompts` (`coderd/exp_chats_test.go`) covers:
  - `NewestFirstFiltering`: multi-part concatenation, non-text parts
    skipped, whitespace-only filtered, soft-deleted excluded, `model`-only
    visibility excluded, assistant-role excluded by `cm.role = 'user'`,
    legacy V0 scalar row silently excluded by the `jsonb_typeof` guard,
    ordering newest first.
  - `LimitClampsResults`: explicit `limit=2` returns the two newest
    prompts.
  - `InvalidLimitRejected`: `limit=5000` is `400 Bad Request`.
  - `NotFoundForOtherUsers`: a separate user in the same org gets `404`,
    not the prompts.
  - `EmptyResultIsJSONArray`: zero-message chat and assistant-only chat
    both return `Prompts: []` (non-nil, empty).
- `messageParsing.test.ts` adds a unit test asserting that
`getEditableUserMessagePayload(["hello", " ", "world"])` returns `"hello
world"`, locking in the agreement with the SQL `string_agg`.
- `dbauthz_test.go` adds the
`MethodTestSuite.TestChats/GetChatUserPromptsByChatID` entry, asserting
the parent-chat `policy.ActionRead`.

## Out of scope

- Cross-chat history.
- Per-user opt-out for the cycle.
- File-reference / attachment cycling — the cycle still reproduces plain
text only, by design.

</details>

<details>
<summary>coder-agents-review history</summary>

Four review rounds, eight unique findings, all addressed in this PR
(approved twice). Rebased onto `main` twice after R4: first to pick up
new migrations `000491` / `000492`, then again for
`000493_idx_chat_diff_statuses_url_lower`. The prompts-index migration
was renumbered `000491 → 000493 → 000494` via
`coderd/database/migrations/fix_migration_numbers.sh`; no other diff
changes.

| Round | Head | Outcome |
|---|---|---|
| R1 | `725422ab` | `COMMENTED`: 7 findings (DEREM-1..7) |
| R2 | `ab2a8936` | `COMMENTED`: 1 new (DEREM-10) + 1 reraised (DEREM-5) |
| R3 | `648c5d1f` | **`APPROVED`**: 7 fixed, DEREM-5 deferred via #25125 |
| R4 | `93b6f450` | **`APPROVED`**: DEREM-5 also fixed in-PR, #25125 closed |

| ID | Where | Resolution |
|---|---|---|
| DEREM-1 | `chats.sql` | Added `jsonb_typeof(content) = 'array'` guard against V0 scalar rows |
| DEREM-2 | `exp_chats.go` | Removed dead `limit < 0` branch (SDK rejects upstream) |
| DEREM-3 | `useChatStore.ts` | Rewrote misleading invalidation comment |
| DEREM-4 | `exp_chats_test.go` | `NewestFirstFiltering` now inserts an assistant-role message so the `role='user'` filter is exercised end-to-end |
| DEREM-5 | `messageParsing.ts` | Rewrote `getEditableUserMessagePayload` to concatenate text parts verbatim, mirroring the SQL `string_agg` |
| DEREM-6 | `exp_chats.go` | Tightened swagger doc + error message to spell out the 0–2000 range |
| DEREM-7 | `exp_chats_test.go` | Added `EmptyResultIsJSONArray` subtest |
| DEREM-10 | `exp_chats_test.go` | `NewestFirstFiltering` now inserts a raw V0 scalar-content row; verified locally that removing the guard makes the test fail |

</details>

---

This PR was created on behalf of @ibetitsmike by Coder Agents.

2026-05-14 12:43:12 +02:00

| Name | Last commit | Date |
|---|---|---|
| about | docs: add coder-templates skill references to quickstart and template contribution guides (#24383) | 2026-04-16 12:04:30 -05:00 |
| admin | docs: mention making the GitHub App public and APP_INSTALL_URL (#25188) | 2026-05-12 15:02:00 +00:00 |
| ai-coder | refactor: remove agents TUI (#25190) | 2026-05-13 21:30:11 +02:00 |
| images | docs: update screenshot to point to generic URL (#25314) | 2026-05-13 17:20:09 -04:00 |
| install | docs: call out coder/skills setup skill on install and quickstart pages (#25194) | 2026-05-12 12:36:00 -05:00 |
| reference | feat: dedicated /prompts endpoint for chat history cycle (#25083) | 2026-05-14 12:43:12 +02:00 |
| start | docs: remove nested alerts (#18580) | 2025-06-25 15:17:49 +00:00 |
| support | fix: allow member users to generate support bundles (#23040) | 2026-03-18 13:43:10 +00:00 |
| tutorials | docs: call out coder/skills setup skill on install and quickstart pages (#25194) | 2026-05-12 12:36:00 -05:00 |
| user-guides | docs: add early access user secrets guide (#24735) | 2026-04-28 22:25:45 +05:00 |
| manifest.json | docs: update AI Governance label and add v2.32 requirement (#24708) | 2026-05-07 17:09:54 -05:00 |
| README.md | refactor: remove agents TUI (#25190) | 2026-05-13 21:30:11 +02:00 |

README.md

About

Coder is a self-hosted platform for running AI coding agents and cloud development environments on infrastructure you control. It works with any cloud, IDE, OS, Git provider, and IDP.

Coder Workspaces

Coder Workspaces are cloud development environments defined with Terraform, connected through a secure WireGuard tunnel, and automatically shut down when not in use. Agents and developers share the same workspace infrastructure.

- Defined in Terraform: Templates describe the infrastructure for each workspace, from EC2 VMs and Kubernetes Pods to Docker containers.
- Any architecture and OS: Support ARM and x86-64 across Windows, Linux, and macOS from a single deployment.
- Managed by admins: Platform teams create and maintain templates that enforce approved images, resource limits, and security policies.
- Accessed from any IDE: Connect through VS Code, JetBrains, Cursor, a web terminal, remote desktop, or SSH.
- Automatic shutdown: Idle workspaces stop automatically to reduce cloud spend, and restart in seconds when needed.

Coder Agents

Coder Agents is a native AI coding agent built into Coder. The agent loop runs in the Coder control plane on your infrastructure, not in the workspace and not in a vendor's cloud. Developers interact with agents through the web UI or the REST API for programmatic and CI-driven workflows.

- Self-hosted agent loop: The control plane handles planning, model calls, and tool dispatch. Workspaces have zero AI awareness.
- No API keys in workspaces: LLM credentials stay in the control plane.
- Any model: Anthropic, OpenAI, Google, Bedrock, or self-hosted endpoints. Switching is a configuration change.
- Governance and cost controls: Centralized model approval, per-user spend limits, and audit logging.
- Open source and inspectable: The full platform is available to audit and extend.

IDE support

You can use:

- Any Web IDE, such as
  - code-server
  - JetBrains Projector
  - Jupyter
  - And others
- Your existing remote development environment:
- A file sync such as Mutagen

Why remote development

Provisioning consistent development environments for a large engineering team is difficult. Each developer has preferences for operating systems, editors, and toolchains, and ensuring a reliable build environment across all of them is a maintenance burden. A missed step during onboarding or an unsupported local configuration can cost hours of debugging.

Remote development solves this by moving the environment off the developer's machine and into managed infrastructure. The developer's laptop becomes a portal into the actual compute where work happens. If a device is lost or replaced, access is simply revoked; no source code or credentials are stored locally.

This approach provides:

- Speed: Server-grade hardware accelerates builds, tests, and large workloads without requiring expensive local machines.
- Consistency: Infrastructure tools such as Terraform, Nix, Docker, and Dev Containers produce identical environments for every developer.
- Security: Source code stays on private servers. Users and groups are managed through SSO and RBAC.
- Compatibility: Workspaces share infrastructure configurations with staging and production, reducing configuration drift.
- Accessibility: Browser-based IDEs and remote IDE extensions let developers work from any device, including lightweight laptops, Chromebooks, and tablets.

Read more on the Coder blog, the Slack engineering blog, or from Alex Ellis at OpenFaaS.

Why Coder

The key difference between Coder and other platforms is that the entire system (agent loop, control plane, model routing, and workspace provisioning) runs on infrastructure you control.

For agents, this means platform teams can:

- Run the entire agent loop on their infrastructure, with no SaaS dependency for orchestration.
- Define MCP servers, skills, and system prompts centrally so every agent session starts with the same tools, policies, and context.
- Keep LLM credentials out of workspaces entirely.
- Tie every agent action to an authenticated user identity.
- Support air-gapped and restricted-network deployments with self-hosted models.

For workspaces, this means admins can:

- Support any architecture (ARM, x86-64) and operating system (Windows, Linux, macOS).
- Modify pod/container specs, such as adding disks, managing network policies, or setting/updating environment variables.
- Use VM or dedicated workspaces, developing with kernel features (no container knowledge required).
- Enable persistent workspaces, which are like local machines, but faster and hosted by a cloud service.

Pricing

Coder is free and open source under the GNU Affero General Public License v3.0. All developer productivity features are included in the open source version. A Premium license is available for enhanced support and custom deployments.

How Coder works

Coder workspaces are represented with Terraform, but you do not need to know Terraform to get started. The Coder Registry provides production-ready templates for AWS EC2, Azure, Google Cloud, Kubernetes, and other providers.

Providers and compute environments

Workspaces can include more than just compute. Terraform can add storage buckets, secrets, sidecars, and other resources.

See the templates documentation for details.

What Coder is not

- Coder is not an infrastructure as code (IaC) platform.
  - Terraform is the first IaC provisioner in Coder, allowing Coder admins to define Terraform resources as Coder workspaces.
- Coder is not a DevOps/CI platform.
  - Coder workspaces can be configured to follow best practices for cloud-service-based workloads, but Coder is not responsible for how you define or deploy the software you write.
- Coder is not an online IDE.
  - Coder supports common editors, such as VS Code, vim, and JetBrains, all over HTTPS or SSH.
- Coder is not a collaboration platform.
  - You can use Git with your favorite Git platform and dedicated IDE extensions for pull requests, code reviews, and pair programming.
- Coder is not a SaaS/fully-managed offering.
  - Coder is a self-hosted solution. You must host Coder in a private data center or on a cloud service, such as AWS, Azure, or GCP.
