Mirror of https://github.com/coder/coder.git, synced 2026-06-02 20:48:20 +00:00
chore: update AI Gateway docs (#24805)
> AI tools were used when creating this PR

This PR:

* removes references to the aibridge repository from the coder docs
* updates aibridge/README.md
* makes it clear that aibridge (keeping the old name) is a handler, not a separate process
* updates outdated sections about metrics, the Recorder interface, and supported paths

---------

Co-authored-by: Susana Ferreira <susana@coder.com>
committed by GitHub
parent 0754016512
commit 6ea9c61da0
+38 -14
@@ -1,6 +1,8 @@
 # aibridge

-aibridge is an HTTP gateway that sits between AI clients and upstream AI providers (Anthropic, OpenAI). It intercepts requests to record token usage, prompts, and tool invocations per user. Optionally supports centralized [MCP](https://modelcontextprotocol.io/) tool injection with allowlist/denylist filtering.
+aibridge provides an HTTP handler that intercepts AI client requests bound for upstream AI providers (Anthropic, OpenAI, Copilot). It records token usage, prompts, and tool invocations per user. Optionally supports centralized [MCP](https://modelcontextprotocol.io/) tool injection with allowlist/denylist filtering.
+
+The handler is mounted by a host process. Today that host is `coderd`, which [mounts the handler](../enterprise/coderd/coderd.go#L294) at `/api/v2/aibridge/<provider>/*`. Running aibridge as a separate process is planned for the future.

 ## Architecture

@@ -38,7 +40,7 @@ aibridge is an HTTP gateway that sits between AI clients and upstream AI provide
 ## Request Flow

 1. Client sends request to `/anthropic/v1/messages` or `/openai/v1/chat/completions`
-2. **Actor extraction**: Request must have an actor in context (via `AsActor()`).
+2. **Actor extraction**: Request must have an actor in context (via `AsActor()`). The host is responsible for authenticating the caller before invoking the handler.
 3. **Upstream call**: Request forwarded to the AI provider
 4. **Response relay**: Response streamed/sent to client
 5. **Recording**: Token usage, prompts, and tool invocations recorded
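The division of labor in the request flow above (the host authenticates, the handler expects an actor in the request context) can be sketched with the Go standard library alone. This is only an illustration under stated assumptions: `withActor`, `actorFrom`, `authenticate`, `bridge`, and the `demo-token` credential are hypothetical stand-ins, not aibridge's real `AsActor()` API or `coderd`'s actual authentication.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
)

type actorKey struct{}

// withActor is a hypothetical stand-in for aibridge's AsActor():
// it stores the authenticated caller in the request context.
func withActor(ctx context.Context, id string) context.Context {
	return context.WithValue(ctx, actorKey{}, id)
}

// actorFrom reads the actor back out of the context.
func actorFrom(ctx context.Context) (string, bool) {
	id, ok := ctx.Value(actorKey{}).(string)
	return id, ok && id != ""
}

// authenticate is host-side middleware. The hard-coded token is an
// assumption for the sketch; a real host validates real credentials.
func authenticate(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Authorization") != "Bearer demo-token" {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r.WithContext(withActor(r.Context(), "user-1")))
	})
}

// bridge stands in for the mounted handler: it refuses requests
// that arrive without an actor and otherwise echoes what it saw.
func bridge() http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		actor, ok := actorFrom(r.Context())
		if !ok {
			http.Error(w, "no actor in context", http.StatusBadRequest)
			return
		}
		fmt.Fprintf(w, "actor=%s path=%s", actor, r.URL.Path)
	})
}

// newHost mounts the handler under the prefix used in the README.
func newHost() http.Handler {
	const prefix = "/api/v2/aibridge"
	mux := http.NewServeMux()
	mux.Handle(prefix+"/", http.StripPrefix(prefix, authenticate(bridge())))
	return mux
}

// call sends one in-memory request and returns status code and body.
func call(h http.Handler, path, auth string) (int, string) {
	req := httptest.NewRequest(http.MethodPost, path, nil)
	if auth != "" {
		req.Header.Set("Authorization", auth)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	h := newHost()
	fmt.Println(call(h, "/api/v2/aibridge/anthropic/v1/messages", "Bearer demo-token"))
	fmt.Println(call(h, "/api/v2/aibridge/anthropic/v1/messages", ""))
}
```

Because the prefix is stripped before the inner handler runs, the handler sees provider-relative paths such as `/anthropic/v1/messages`, which matches how the routes are written in the request flow.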
@@ -53,24 +55,29 @@ Passthrough routes (`/v1/models`, `/v1/messages/count_tokens`) are reverse-proxi

 Create metrics with `NewMetrics(prometheus.Registerer)`:

-| Metric | Type | Description |
-|-----------------------------------|-----------|-------------------------------|
-| `interceptions_total` | Counter | Intercepted request count |
-| `interceptions_inflight` | Gauge | Currently processing requests |
-| `interceptions_duration_seconds` | Histogram | Request duration |
-| `tokens_total` | Counter | Token usage (input/output) |
-| `prompts_total` | Counter | User prompt count |
-| `injected_tool_invocations_total` | Counter | MCP tool invocations |
-| `passthrough_total` | Counter | Non-intercepted requests |
+| Metric | Type | Description |
+|--------------------------------------|-----------|--------------------------------------------------------------------------|
+| `interceptions_total` | Counter | Intercepted request count |
+| `interceptions_inflight` | Gauge | Currently processing requests |
+| `interceptions_duration_seconds` | Histogram | Request duration |
+| `passthrough_total` | Counter | Non-intercepted requests forwarded to the upstream |
+| `prompts_total` | Counter | User prompt count |
+| `tokens_total` | Counter | Token usage (input, output, cache read/write, provider extras) |
+| `injected_tool_invocations_total` | Counter | Injected MCP tool invocations performed by the handler |
+| `non_injected_tool_selections_total` | Counter | Client-defined tool selections returned by the model |
+| `circuit_breaker_state` | Gauge | Circuit breaker state per provider/endpoint (0=closed, 0.5=half, 1=open) |
+| `circuit_breaker_trips_total` | Counter | Times the circuit breaker transitioned to open |
+| `circuit_breaker_rejects_total` | Counter | Requests rejected due to an open circuit breaker |

 ### Recorder Interface

 Implement `Recorder` to persist usage data to your database:

 - `aibridge_interceptions` - request metadata (provider, model, initiator, timestamps)
-- `aibridge_token_usages` - input/output token counts per response
+- `aibridge_token_usages` - input/output and cache read/write token counts per response
 - `aibridge_user_prompts` - user prompts
 - `aibridge_tool_usages` - tool invocations (injected and client-defined)
+- `aibridge_model_thoughts` - model reasoning content (thinking, reasoning summaries, commentary)

 ```go
 type Recorder interface {
@@ -79,15 +86,32 @@ type Recorder interface {
 	RecordTokenUsage(ctx context.Context, req *TokenUsageRecord) error
 	RecordPromptUsage(ctx context.Context, req *PromptUsageRecord) error
 	RecordToolUsage(ctx context.Context, req *ToolUsageRecord) error
+	RecordModelThought(ctx context.Context, req *ModelThoughtRecord) error
 }
 ```

 ## Supported Routes

+Each provider instance is mounted under `/api/v2/aibridge/<name>`, where `<name>` is the provider's configured name. For example, with an Anthropic provider named `my-anthropic`, its `/messages` endpoint would be reachable at `/api/v2/aibridge/my-anthropic/v1/messages`.
+
+If a name is not set, the route path defaults to the provider's type: `anthropic`, `openai`, or `copilot`. The table below uses the default names.
+
+`(/*)` denotes a route that handles both the exact path and any subpaths. A trailing `/*` denotes subpaths only.
+
 | Provider | Route | Type |
 |-----------|---------------------------------------|-----------------------|
 | Anthropic | `/anthropic/v1/messages` | Bridged (intercepted) |
-| Anthropic | `/anthropic/v1/models` | Passthrough |
 | Anthropic | `/anthropic/v1/messages/count_tokens` | Passthrough |
+| Anthropic | `/anthropic/v1/models(/*)` | Passthrough |
+| Anthropic | `/anthropic/api/event_logging/*` | Passthrough |
 | OpenAI | `/openai/v1/chat/completions` | Bridged (intercepted) |
-| OpenAI | `/openai/v1/models` | Passthrough |
+| OpenAI | `/openai/v1/responses` | Bridged (intercepted) |
+| OpenAI | `/openai/v1/responses/*` | Passthrough |
+| OpenAI | `/openai/v1/conversations(/*)` | Passthrough |
+| OpenAI | `/openai/v1/models(/*)` | Passthrough |
+| Copilot | `/copilot/chat/completions` | Bridged (intercepted) |
+| Copilot | `/copilot/responses` | Bridged (intercepted) |
+| Copilot | `/copilot/models(/*)` | Passthrough |
+| Copilot | `/copilot/agents/*` | Passthrough |
+| Copilot | `/copilot/mcp/*` | Passthrough |
+| Copilot | `/copilot/.well-known/*` | Passthrough |
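The `(/*)` and trailing `/*` notation used in the route table can be made concrete with a small matcher. The following is a rough sketch, not aibridge's actual router: the `route` type, `matches`, and `classify` are hypothetical names, only a handful of table rows are included, and treating an empty subpath (a bare trailing slash) as a non-match is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// route is a hypothetical table entry used only for this sketch.
type route struct {
	pattern string
	kind    string // "bridged" or "passthrough"
}

// A few rows copied from the table, using the default provider names.
var routes = []route{
	{"/anthropic/v1/messages", "bridged"},
	{"/anthropic/v1/messages/count_tokens", "passthrough"},
	{"/anthropic/v1/models(/*)", "passthrough"},
	{"/openai/v1/responses", "bridged"},
	{"/openai/v1/responses/*", "passthrough"},
	{"/copilot/mcp/*", "passthrough"},
}

// matches applies the table's notation:
//   - "(/*)" suffix: the exact path or any subpath
//   - "/*" suffix:   subpaths only (assumed to need a non-empty remainder)
//   - otherwise:     the exact path only
func matches(pattern, path string) bool {
	switch {
	case strings.HasSuffix(pattern, "(/*)"):
		base := strings.TrimSuffix(pattern, "(/*)")
		return path == base || strings.HasPrefix(path, base+"/")
	case strings.HasSuffix(pattern, "/*"):
		base := strings.TrimSuffix(pattern, "/*")
		return strings.HasPrefix(path, base+"/") && len(path) > len(base)+1
	default:
		return path == pattern
	}
}

// classify returns the kind of the first matching route.
func classify(path string) string {
	for _, r := range routes {
		if matches(r.pattern, path) {
			return r.kind
		}
	}
	return "unmatched"
}

func main() {
	for _, p := range []string{
		"/anthropic/v1/models",
		"/anthropic/v1/models/some-model",
		"/openai/v1/responses",
		"/openai/v1/responses/resp_123",
		"/copilot/mcp",
	} {
		fmt.Printf("%-36s %s\n", p, classify(p))
	}
}
```

The OpenAI rows show why the distinction matters: `/openai/v1/responses` itself is bridged, while anything beneath it is passed through, so the two patterns never compete for the same path.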
@@ -7,6 +7,11 @@ There are two ways to connect AI tools to AI Gateway:
 - Base URL configuration (Recommended): Most AI tools allow customizing the base URL for API requests. This is the preferred approach when supported.
 - AI Gateway Proxy: For tools that don't support base URL configuration, [AI Gateway Proxy](../ai-gateway-proxy/index.md) can intercept traffic and forward it to AI Gateway.

+> [!NOTE]
+> AI Gateway works with tools running inside or outside
+> of Coder workspaces. For non-workspace setup, see
+> [External and Desktop Clients](#external-and-desktop-clients).
+
 ## Base URLs

 Most AI coding tools allow the "base URL" to be customized. In other words, when a request is made to OpenAI's API from your coding tool, the API endpoint such as [`/v1/chat/completions`](https://platform.openai.com/docs/api-reference/chat) will be appended to the configured base. Therefore, instead of the default base URL of `https://api.openai.com/v1`, you'll need to set it to `https://coder.example.com/api/v2/aibridge/openai/v1`.
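To illustrate the base URL substitution described above, here is a minimal sketch of how a client might join a configured base with an endpoint path. It assumes the client appends only `chat/completions` because both bases already end in `/v1`; the `endpoint` helper and the constant names are hypothetical, and `coder.example.com` is the placeholder host from the text.

```go
package main

import (
	"fmt"
	"net/url"
)

const (
	// Default upstream base, as quoted in the docs.
	defaultOpenAIBase = "https://api.openai.com/v1"
	// Same API surface, routed through AI Gateway (placeholder host).
	gatewayOpenAIBase = "https://coder.example.com/api/v2/aibridge/openai/v1"
)

// endpoint joins a base URL and an API path the way a typical
// client would, tolerating leading or trailing slashes.
func endpoint(base, apiPath string) (string, error) {
	return url.JoinPath(base, apiPath)
}

func main() {
	for _, base := range []string{defaultOpenAIBase, gatewayOpenAIBase} {
		full, err := endpoint(base, "chat/completions")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(full)
	}
}
```

Only the base changes between the two calls; the endpoint path the tool appends stays the same, which is why a base URL override is enough to route traffic through AI Gateway.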
@@ -98,7 +103,7 @@ The table below shows tested AI clients and their compatibility with AI Gateway.
 | Cursor | ❌ | ❌ | ❌ | Override for OpenAI broken ([upstream issue](https://forum.cursor.com/t/requests-are-sent-to-incorrect-endpoint-when-using-base-url-override/144894)). |
 | Sourcegraph Amp | ❌ | ❌ | ❌ | No option to override base URL. |
 | Kiro | ❌ | ❌ | ❌ | No option to override base URL. |
-| Gemini CLI | ❌ | ❌ | ❌ | No Gemini API support. Upvote [this issue](https://github.com/coder/aibridge/issues/27). |
+| Gemini CLI | ❌ | ❌ | ❌ | No Gemini API support. Upvote [this issue](https://github.com/coder/coder/issues/24804). |
 | Antigravity | ❌ | ❌ | ❌ | No option to override base URL. |

@@ -108,6 +113,8 @@ The table below shows tested AI clients and their compatibility with AI Gateway.

 AI coding tools running inside a Coder workspace, such as IDE extensions, can be configured to use AI Gateway.

+This section applies when you want template admins to preconfigure tools inside Coder workspaces. For tools running outside of a workspace, see [External and Desktop Clients](#external-and-desktop-clients).
+
 While users can manually configure these tools with a long-lived API key, template admins can provide a more seamless experience by pre-configuring them. Admins can automatically inject the user's session token with `data.coder_workspace_owner.me.session_token` and the AI Gateway base URL into the workspace environment.

 In this example, Claude Code respects these environment variables and will route all requests via AI Gateway.
@@ -131,11 +138,38 @@ resource "coder_agent" "dev" {

 ## External and Desktop Clients

-You can also configure AI tools running outside of a Coder workspace, such as local IDE extensions or desktop applications, to connect to AI Gateway.
+You can also configure AI tools running outside of a Coder workspace, such as local IDE extensions or desktop applications, to connect to AI Gateway. Use the same settings as the in-workspace case: configure the [base URL](#base-urls) and authenticate with a Coder API token.

-The configuration is the same: point the tool to the AI Gateway [base URL](#base-urls) and use a Coder API key for authentication.
+For base URL setup, the client machine must have network access to the AI Gateway endpoint on your Coder deployment. Clients using [AI Gateway Proxy](../ai-gateway-proxy/index.md) must be able to reach the proxy endpoint and trust its CA certificate.

-Users can generate a long-lived API key from the Coder UI or CLI. Follow the instructions at [Sessions and API tokens](../../../admin/users/sessions-tokens.md#generate-a-long-lived-api-token-on-behalf-of-yourself) to create one.
+Users can generate a long-lived API token from the Coder UI or CLI. Follow the instructions at [Sessions and API tokens](../../../admin/users/sessions-tokens.md#generate-a-long-lived-api-token-on-behalf-of-yourself) to create one.

+For headless scenarios, first [create a service account](../../../admin/users/headless-auth.md#create-a-service-account), then generate a long-lived token for it.
+
+<details>
+<summary>Example</summary>
+For clients that support a custom [base URL](#base-urls), e.g. [Claude Code](./claude-code.md):
+
+```sh
+export ANTHROPIC_BASE_URL="https://coder.example.com/api/v2/aibridge/anthropic"
+export ANTHROPIC_AUTH_TOKEN="<your-coder-api-token>"
+```
+
+Replace `coder.example.com` with your Coder deployment URL.
+
+For other clients, set up [AI Gateway Proxy](../ai-gateway-proxy/index.md). Configure the proxy endpoint and [CA certificates](../ai-gateway-proxy/setup.md#environment-variables):
+
+```sh
+export HTTPS_PROXY="https://coder:<your-coder-api-token>@<proxy-host>:8888"
+export SSL_CERT_FILE="/path/to/coder-aibridge-proxy-ca.pem"
+```
+
+For proxy setup details, see [AI Gateway Proxy setup](../ai-gateway-proxy/setup.md).
+
+For BYOK and workspace template examples, see the full [Claude Code](./claude-code.md) example.
+</details>
+
+For complete setup instructions, see the [supported client examples](#all-supported-clients).
+
 ## All Supported Clients

@@ -5,6 +5,7 @@
 AI Gateway is a smart gateway for AI. It acts as an intermediary between your users' coding agents / IDEs
 and providers like OpenAI and Anthropic. By intercepting all the AI traffic between these clients and
 the upstream APIs, AI Gateway can record user prompts, token usage, and tool invocations.
+AI Gateway supports clients running inside or outside Coder workspaces.

 AI Gateway solves 3 key problems:

@@ -2,7 +2,7 @@

 ## Implementation Details

-`coderd` runs an in-memory instance of `aibridged`, whose logic is mostly contained in https://github.com/coder/aibridge. In future releases we will support running external instances for higher throughput and complete memory isolation from `coderd`.
+`coderd` runs an in-memory instance of `aibridged`, whose logic is mostly contained in https://github.com/coder/coder/tree/main/aibridge. In future releases we will support running external instances for higher throughput and complete memory isolation from `coderd`.

 ![AI Gateway implementation details](../../images/aibridge/aibridge-implementation-details.png)

@@ -38,4 +38,4 @@ Where relevant, both streaming and non-streaming requests are supported.

 ## Troubleshooting

-To report a bug, file a feature request, or view a list of known issues, please visit our [GitHub repository for AI Gateway](https://github.com/coder/aibridge). If you encounter issues with AI Gateway, please reach out to us via [Discord](https://discord.gg/coder).
+To report a bug, file a feature request, or view a list of known issues, please visit our [GitHub repository](https://github.com/coder/coder/issues). If you encounter issues with AI Gateway, please reach out to us via [Discord](https://discord.gg/coder).