coder

mirror of https://github.com/coder/coder.git synced 2026-06-03 21:18:24 +00:00

Author	SHA1	Message	Date
Kacper Sawicki	6f86f67754	feat(coderd): add overload protection with rate limiting and concurrency control (#21161)	2025-12-11 16:38:54 +01:00

## Summary

This adds configurable overload protection to the AI Bridge daemon to prevent the server from being overwhelmed during periods of high load.

Partially addresses coder/internal#1153 (rate limits and concurrency control; circuit breakers are deferred to a follow-up).

## New Configuration Options

| Option | Environment Variable | Description | Default |
|--------|---------------------|-------------|---------|
| `--aibridge-max-concurrency` | `CODER_AIBRIDGE_MAX_CONCURRENCY` | Maximum number of concurrent AI Bridge requests. Set to 0 to disable (unlimited). | `0` |
| `--aibridge-rate-limit` | `CODER_AIBRIDGE_RATE_LIMIT` | Maximum number of AI Bridge requests per second. Set to 0 to disable rate limiting. | `0` |

## Behavior

When limits are exceeded:

- Concurrency limit: Returns HTTP `503 Service Unavailable` with message "AI Bridge is currently at capacity. Please try again later."
- Rate limit: Returns HTTP `429 Too Many Requests` with `Retry-After` header.

Both protections are optional and disabled by default (0 values).

## Implementation

The overload protection is implemented as reusable middleware in `coderd/httpmw/ratelimit.go`:

1. `RateLimitByAuthToken`: Per-user rate limiting that uses `APITokenFromRequest` to extract the authentication token, with fallback to `X-Api-Key` header for AI provider compatibility (e.g., Anthropic). Falls back to IP-based rate limiting if no token is present. Includes `Retry-After` header for backpressure signaling.
2. `ConcurrencyLimit`: Uses an atomic counter to track in-flight requests and reject when at capacity.

The middleware is applied in `enterprise/coderd/aibridge.go` via `r.Group` in the following order:

1. Concurrency check (faster rejection for load shedding)
2. Rate limit check

Note: Rate limiting currently applies to all AI Bridge requests, including pass-through requests. Ideally only actual interceptions should count, but this would require changes in the aibridge library.

## Testing

Added comprehensive tests for:

- Rate limiting by auth token (Bearer token, X-Api-Key, no token fallback to IP)
- Different tokens not rate limited against each other
- Disabled when limit is zero
- Retry-After header is set on 429 responses
- Concurrency limiting (allows within limit, rejects over limit, disabled when zero)
Steven Masley	1d1070d051	chore: ensure proper rbac permissions on 'Acquire' file in the cache (#18348) The file cache was caching the `Unauthorized` errors if a user without the right perms opened the file first. So all future opens would fail. Now the cache always opens with a subject that can read files. And authz is checked on the Acquire per user.	2025-06-16 13:40:45 +00:00
Steven Masley	eeb3d63be6	chore: merge authorization contexts (#12816) * chore: merge authorization contexts Instead of 2 auth contexts from apikey and dbauthz, merge them to just use dbauthz. It is annoying to have two. * fixup authorization reference	2024-03-29 10:14:27 -05:00
Kyle Carberry	22e781eced	chore: add /v2 to import module path (#9072) * chore: add /v2 to import module path go mod requires semantic versioning with versions greater than 1.x This was a mechanical update by running: ``` go install github.com/marwan-at-work/mod/cmd/mod@latest mod upgrade ``` Migrate generated files to import /v2 * Fix gen	2023-08-18 18:55:43 +00:00
Steven Masley	b0a16150a3	chore: Implement standard rbac.Subject to be reused everywhere (#5881) * chore: Implement standard rbac.Subject to be reused everywhere An rbac subject is created in multiple spots because of the way we expand roles, scopes, etc. This difference in use creates a list of arguments which is unwieldy. Use of the expander interface lets us conform to a single subject in every case	2023-01-26 14:42:54 -06:00
Ammar Bandukwala	423ac04156	coderd: tighten /login rate limiting (#4432) * coderd: tighten /login rate limit * coderd: add Bypass rate limit header	2022-10-20 17:01:23 +00:00
Colin Adler	5de6f86959	feat: trace httpapi.{Read,Write} (#4134)	2022-09-21 17:07:00 -05:00
Jon Ayers	7e9819f2a8	ref: move httpapi.Reponse into codersdk (#2954)	2022-07-12 19:15:02 -05:00
Steven Masley	548de7d6f3	feat: User pagination using offsets (#1062) Offset pagination and cursor pagination supported	2022-04-22 15:27:55 -05:00
Garrett Delfosse	d9d4599ba9	chore: idea: unify http responses further (#941)	2022-04-12 10:17:33 -05:00
Kyle Carberry	31536186f7	feat: Add rate-limits to the API (#848) Closes #285.	2022-04-04 17:32:05 -05:00

11 Commits