The Lilac API applies a default rate limit per organization to keep shared inference capacity fair across customers.
Default Limit
- 200 requests per minute per organization.
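One way to avoid hitting the limit in the first place is to throttle on the client side. The following is a minimal sketch, not part of the Lilac SDK: `SlidingWindowLimiter` and its parameters are hypothetical names, and the defaults simply mirror the 200 requests per minute figure above. Because the limit applies to the whole organization, a per-process throttle like this only helps if you budget a share of the limit for each client.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side throttle: allow at most `max_calls` per `period` seconds.

    Hypothetical helper for illustration; it is not part of any Lilac SDK.
    """

    def __init__(self, max_calls=200, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock
        self._sleep = sleep
        self._sent = deque()  # timestamps of calls still inside the window

    def acquire(self):
        """Block until one more call fits in the window, then record it."""
        while True:
            now = self._clock()
            # Forget calls that have aged out of the window.
            while self._sent and now - self._sent[0] >= self.period:
                self._sent.popleft()
            if len(self._sent) < self.max_calls:
                self._sent.append(now)
                return
            # Window is full: wait until the oldest call leaves it.
            self._sleep(self.period - (now - self._sent[0]))
```

Call `limiter.acquire()` immediately before each API request. The `clock` and `sleep` parameters exist only so the behavior can be tested without real waiting.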
429 Too Many Requests
Requests above the limit may receive an HTTP 429 Too Many Requests response. When this happens:
- Back off and retry later.
- If a Retry-After header or retry_after field is present, respect it.
- Use exponential backoff for automated clients.
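The guidance above can be sketched as a single function that picks the wait time: use the server's hint when one is present, otherwise fall back to exponential backoff. This is an illustrative sketch under assumptions: `retry_delay`, `base`, and `cap` are hypothetical names, the jitter strategy is a common choice rather than something the Lilac documentation prescribes, and the caller is expected to pass in whichever Retry-After or retry_after value it found in the response.

```python
import random
import time
from email.utils import parsedate_to_datetime


def retry_delay(attempt, retry_after=None, base=1.0, cap=60.0, now=None):
    """Return the seconds to wait before retry number `attempt` (0-based).

    Hypothetical helper: `retry_after` is the raw Retry-After header value
    or retry_after field, if the response carried one.
    """
    if retry_after is not None:
        try:
            # Delay-in-seconds form, e.g. "12" or 12.
            return max(0.0, float(retry_after))
        except (TypeError, ValueError):
            pass
        try:
            # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
            when = parsedate_to_datetime(str(retry_after))
            current = time.time() if now is None else now
            return max(0.0, when.timestamp() - current)
        except (TypeError, ValueError):
            pass  # Unparseable hint: fall through to backoff.
    # Exponential backoff with full jitter, capped at `cap` seconds.
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))
```

Randomizing the delay keeps many clients that were throttled at the same moment from retrying in lockstep.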
Example
429 responses are surfaced as RateLimitError in both the Python and JS SDKs. The SDK retries transient errors with exponential backoff by default; keep that behavior or implement your own.
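If you do implement your own retry policy, it might look like the sketch below. The SDK's import path and the attributes of its exception are not given on this page, so the code defines a local stand-in `RateLimitError` with a hypothetical `retry_after` attribute; replace it with the real class from the SDK. `with_retries` and its parameters are likewise illustrative names.

```python
import random
import time


class RateLimitError(Exception):
    """Stand-in for the SDK's RateLimitError; swap in the real import."""

    def __init__(self, message="rate limited", retry_after=None):
        super().__init__(message)
        self.retry_after = retry_after  # hypothetical attribute


def with_retries(call, max_attempts=5, base=1.0, cap=30.0, sleep=time.sleep):
    """Run `call()`, retrying on RateLimitError up to `max_attempts` times."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError as err:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: let the caller see the error.
            hinted = getattr(err, "retry_after", None)
            if hinted is not None:
                # Respect the server's hint when the error exposes one.
                delay = float(hinted)
            else:
                # Otherwise back off exponentially with full jitter.
                delay = random.uniform(0.0, min(cap, base * 2 ** attempt))
            sleep(delay)
```

Wrap the request in a zero-argument callable, for example `with_retries(lambda: send_request(payload))`, where `send_request` stands for whatever SDK call you make. If the SDK's built-in retries remain enabled, the two layers multiply the total number of attempts.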

