Commit Graph

15 Commits

Author SHA1 Message Date
Kacper Sawicki 6f86f67754 feat(coderd): add overload protection with rate limiting and concurrency control (#21161)
## Summary

This adds configurable overload protection to the AI Bridge daemon to
prevent the server from being overwhelmed during periods of high load.

Partially addresses coder/internal#1153 (rate limits and concurrency
control; circuit breakers are deferred to a follow-up).

## New Configuration Options

| Option | Environment Variable | Description | Default |
|--------|---------------------|-------------|---------|
| `--aibridge-max-concurrency` | `CODER_AIBRIDGE_MAX_CONCURRENCY` |
Maximum number of concurrent AI Bridge requests. Set to 0 to disable
(unlimited). | `0` |
| `--aibridge-rate-limit` | `CODER_AIBRIDGE_RATE_LIMIT` | Maximum number
of AI Bridge requests per second. Set to 0 to disable rate limiting. |
`0` |

## Behavior

When limits are exceeded:
- **Concurrency limit**: Returns HTTP `503 Service Unavailable` with
message "AI Bridge is currently at capacity. Please try again later."
- **Rate limit**: Returns HTTP `429 Too Many Requests` with
`Retry-After` header.

Both protections are optional and disabled by default (0 values).

## Implementation

The overload protection is implemented as reusable middleware in
`coderd/httpmw/ratelimit.go`:

1. **`RateLimitByAuthToken`**: Per-user rate limiting that uses
`APITokenFromRequest` to extract the authentication token, with fallback
to `X-Api-Key` header for AI provider compatibility (e.g., Anthropic).
Falls back to IP-based rate limiting if no token is present. Includes
`Retry-After` header for backpressure signaling.
2. **`ConcurrencyLimit`**: Uses an atomic counter to track in-flight
requests and reject when at capacity.

The middleware is applied in `enterprise/coderd/aibridge.go` via
`r.Group` in the following order:
1. Concurrency check (faster rejection for load shedding)
2. Rate limit check

**Note**: Rate limiting currently applies to all AI Bridge requests,
including pass-through requests. Ideally only actual interceptions
should count, but this would require changes in the aibridge library.

## Testing

Added comprehensive tests for:
- Rate limiting by auth token (Bearer token, X-Api-Key, no token
fallback to IP)
- Different tokens not rate limited against each other
- Disabled when limit is zero
- Retry-After header is set on 429 responses
- Concurrency limiting (allows within limit, rejects over limit,
disabled when zero)
2025-12-11 16:38:54 +01:00
Hugo Dutka 782d01bae2 chore(coderd/httpmw): remove dbmem usage from tests (#18146)
Related to https://github.com/coder/coder/issues/15109
2025-06-02 13:57:56 +02:00
Steven Masley 5ccf5084e8 chore: create type for unique role names (#13506)
* chore: create type for unique role names

Using `string` was confusing when something should be combined with
org context, and when not to. Naming this new name, "RoleIdentifier"
2024-06-11 08:55:28 -05:00
Kyle Carberry 5abfe5afd0 chore: rename dbfake to dbmem (#10432) 2023-10-30 17:42:20 +00:00
Kyle Carberry 22e781eced chore: add /v2 to import module path (#9072)
* chore: add /v2 to import module path

go mod requires semantic versioning with versions greater than 1.x

This was a mechanical update by running:
```
go install github.com/marwan-at-work/mod/cmd/mod@latest
mod upgrade
```

Migrate generated files to import /v2

* Fix gen
2023-08-18 18:55:43 +00:00
Dean Sheather 1bdd2abed7 feat: use JWT ticket to avoid DB queries on apps (#6148)
Issue a JWT ticket on the first request with a short expiry that
contains details about which workspace/agent/app combo the ticket is
valid for.
2023-03-07 19:38:11 +00:00
Kyle Carberry 17adfd1134 chore: improve times of ratelimit tests (#6346)
From 5s to 130ms!
2023-02-25 22:01:01 +00:00
Steven Masley 8b424f03c2 chore: Rename databasefake --> dbfake (#6011) 2023-02-02 19:28:55 -06:00
Kyle Carberry 026b1cd2a4 chore: update to go 1.20 (#5968)
Co-authored-by: Colin Adler <colin1adler@gmail.com>
2023-02-02 12:36:27 -06:00
Steven Masley 4a6fc40949 feat: Add database data generator to make fakedbs easier to populate (#5922)
* feat: Add database data generator to make fakedbs easier to populate
2023-01-31 15:10:03 -06:00
Kyle Carberry 7ad87505c8 chore: move agent functions from codersdk into agentsdk (#5903)
* chore: rename `AgentConn` to `WorkspaceAgentConn`

The codersdk was becoming bloated with consts for the workspace
agent that made no sense to a reader. `Tailnet*` is an example
of these consts.

* chore: remove `Get` prefix from *Client functions

* chore: remove `BypassRatelimits` option in `codersdk.Client`

It feels wrong to have this as a direct option because it's so infrequently
needed by API callers. It's better to directly modify headers in the two
places that we actually use it.

* Merge `appearance.go` and `buildinfo.go` into `deployment.go`

* Merge `experiments.go` and `features.go` into `deployment.go`

* Fix `make gen` referencing old type names

* Merge `error.go` into `client.go`

`codersdk.Response` lived in `error.go`, which is wrong.

* chore: refactor workspace agent functions into agentsdk

It was odd conflating the codersdk that clients should use
with functions that only the agent should use. This separates
them into two SDKs that are closely coupled, but separate.

* Merge `insights.go` into `deployment.go`

* Merge `organizationmember.go` into `organizations.go`

* Merge `quota.go` into `workspaces.go`

* Rename `sse.go` to `serversentevents.go`

* Rename `codersdk.WorkspaceAppHostResponse` to `codersdk.AppHostResponse`

* Format `.vscode/settings.json`

* Fix outdated naming in `api.ts`

* Fix app host response

* Fix unsupported type

* Fix imported type
2023-01-29 15:47:24 -06:00
Ammar Bandukwala 423ac04156 coderd: tighten /login rate limiting (#4432)
* coderd: tighten /login rate limit

* coderd: add Bypass rate limit header
2022-10-20 17:01:23 +00:00
Kyle Carberry b0fe9bcdd1 chore: Upgrade to Go 1.19 (#3617)
This is required as part of #3505.
2022-08-21 22:32:53 +00:00
Mathias Fredriksson 4730c589fe chore: Use standardized test timeouts and delays (#3291) 2022-08-01 15:45:05 +03:00
Kyle Carberry 31536186f7 feat: Add rate-limits to the API (#848)
Closes #285.
2022-04-04 17:32:05 -05:00