# Lilac

> Documentation for Lilac Cloud — affordable GPU inference powered by idle enterprise GPUs.

## Docs

- [API Keys](https://docs.getlilac.com/inference/api-keys.md): Create, rotate, and manage API keys that authenticate your inference requests to the Lilac API. Each key is scoped to an organization.
- [Chat Completions](https://docs.getlilac.com/inference/chat-completions.md): Use the OpenAI-compatible chat completions endpoint to generate model responses from conversation history, with streaming and tool calling.
- [Completions (Legacy)](https://docs.getlilac.com/inference/completions.md): Use the legacy completions endpoint to generate text from a raw prompt string. For new integrations, use the chat completions endpoint instead.
- [Connect to local coding tools](https://docs.getlilac.com/inference/local-tools.md): Set up Lilac inference in your local coding environment with OpenCode and other OpenAI-compatible AI code assistants.
- [Supported Models](https://docs.getlilac.com/inference/models.md): Browse all models available on Lilac, including Kimi K2.6, GLM 5.1, Gemma 4, and MiniMax M2.7, with context lengths, capabilities, and per-token pricing.
- [OpenAI Compatibility](https://docs.getlilac.com/inference/openai-compatibility.md): Learn which OpenAI API features Lilac supports and how to migrate from OpenAI by changing just the base URL and API key in your existing code.
- [Organizations & Invites](https://docs.getlilac.com/inference/organizations.md): Manage your Lilac organization, invite team members by email, assign owner or member roles, and control API key and billing access.
- [Inference Pricing](https://docs.getlilac.com/inference/pricing.md): Lilac offers pay-per-token inference pricing, powered by idle GPUs, with no minimums or contracts. See per-model rates for input and output tokens.
- [Inference Quickstart](https://docs.getlilac.com/inference/quickstart.md): Get started with the Lilac inference API in under five minutes. Create an account, add credits, generate an API key, and send your first request.
- [API Rate Limits and 429 Handling](https://docs.getlilac.com/inference/rate-limits.md): Default per-organization rate limits for the Lilac inference API, how 429 Too Many Requests responses work, and recommended retry and backoff behavior.
- [Responses API](https://docs.getlilac.com/inference/responses.md): Use the responses endpoint, OpenAI's newer API format, to generate structured output and call tools with built-in support for JSON schemas.
- [API Status and Model Performance](https://docs.getlilac.com/inference/status.md): Query the public Lilac status endpoint for live model uptime, throughput, and time-to-first-token metrics across configurable aggregation windows.
- [Usage & Billing](https://docs.getlilac.com/inference/usage.md): Monitor your token consumption in real time, view per-model cost breakdowns, and manage prepaid credit billing from the Lilac dashboard.
- [Supplier Getting Started](https://docs.getlilac.com/suppliers/getting-started.md): Become a Lilac GPU supplier in four steps: create an account, submit the intake form, complete onboarding, and install the Kubernetes operator.
- [Cluster Monitoring](https://docs.getlilac.com/suppliers/monitoring.md): Monitor your GPU cluster connection status, workload activity, and operator health using the Lilac dashboard and Kubernetes debugging tools.
- [GPU Pool Configuration](https://docs.getlilac.com/suppliers/operator/gpu-pools.md): Define GPU pool custom resources to control which nodes and how many GPUs Lilac uses in your cluster, along with availability schedules and preemption rules.
- [How the Operator Works](https://docs.getlilac.com/suppliers/operator/how-it-works.md): Understand the Lilac GPU operator architecture, its 30-second sync loop with the control plane, and how it manages inference pods on idle GPUs.
- [Operator Installation](https://docs.getlilac.com/suppliers/operator/installation.md): Install the Lilac GPU operator in your Kubernetes cluster using Helm. Covers prerequisites, namespace setup, chart configuration, and verification.
- [GPU Preemption](https://docs.getlilac.com/suppliers/operator/preemption.md): Learn how the Lilac operator gracefully reclaims GPUs when your workloads need them back, using LIFO eviction and configurable grace periods.
- [Supplier Overview](https://docs.getlilac.com/suppliers/overview.md): Monetize idle GPU capacity with Lilac. Keep your existing workloads running, earn 70% of inference revenue, and reclaim resources instantly on demand.
- [Revenue & Payouts](https://docs.getlilac.com/suppliers/revenue.md): Understand how Lilac calculates supplier earnings with the 70/30 revenue split, tracks per-token payouts, and processes monthly payment cycles.

## Optional

- [Talk to Us](https://calendly.com/d/ctxy-jd8-585/lilac-support)
- [Contact Us](mailto:contact@getlilac.com)