If you operate GPU infrastructure, Lilac lets you monetize idle capacity without changing your existing workloads. When your GPUs aren’t busy, Lilac’s operator automatically picks up inference traffic and routes it to your cluster. When you need the GPUs back, the operator gracefully steps aside.

How It Works

1. You keep your workloads

Your existing jobs always take priority. Lilac uses GPUs only while they are idle.

2. The operator fills the gaps

Our Kubernetes operator detects idle GPUs, spins up inference pods, and serves traffic from Lilac’s inference network.

3. You earn per token

Every inference token processed on your hardware earns you revenue. You keep 70% of gross token revenue; Lilac takes 30%.

4. Preemption is seamless

When your own workloads need GPUs back, the operator gracefully drains inference pods with zero impact on your jobs.
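The priority rule described in these steps can be pictured with a toy model. This is only an illustrative sketch, not the operator's actual logic: the `Cluster` class, its fields, and the `reconcile` method are hypothetical names, and real draining involves pod lifecycle handling that is not modelled here.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    """Hypothetical model of a supplier cluster (not a Lilac API)."""
    total_gpus: int
    own_workload_gpus: int = 0  # GPUs claimed by the supplier's own jobs
    inference_gpus: int = 0     # GPUs currently serving inference traffic

    def reconcile(self) -> None:
        # The supplier's jobs come first; inference gets only what is left over.
        idle = max(self.total_gpus - self.own_workload_gpus, 0)
        self.inference_gpus = idle


cluster = Cluster(total_gpus=8, own_workload_gpus=6)
cluster.reconcile()
print(cluster.inference_gpus)  # two GPUs are idle, so inference may use them

# The supplier's jobs now need every GPU: inference is drained to zero.
cluster.own_workload_gpus = 8
cluster.reconcile()
print(cluster.inference_gpus)
```

The point of the sketch is the ordering: inference capacity is always derived from what the supplier's own workloads leave unused, never the other way round.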

Revenue Model

Your share: 70% of gross token revenue
Lilac’s share: 30%
Payout basis: Per token processed on your hardware
Tracking: Real-time in the supplier dashboard
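As a rough worked example of the 70/30 split, the sketch below computes both shares for a batch of tokens. The token count and the price per million tokens are made-up inputs for illustration; actual pricing is not stated on this page, and `split_revenue` is a hypothetical helper, not part of any Lilac tooling.

```python
from decimal import Decimal

SUPPLIER_SHARE = Decimal("0.70")  # supplier's share of gross, per the revenue model


def split_revenue(tokens: int, price_per_million_tokens: Decimal) -> tuple[Decimal, Decimal]:
    """Return (supplier_share, lilac_share) of gross token revenue."""
    gross = Decimal(tokens) / Decimal(1_000_000) * price_per_million_tokens
    supplier = gross * SUPPLIER_SHARE
    lilac = gross - supplier  # the remaining 30%
    return supplier, lilac


# Hypothetical inputs: 50 million tokens at a price of 0.40 per million tokens.
supplier, lilac = split_revenue(50_000_000, Decimal("0.40"))
print(supplier, lilac)  # gross is 20, so the supplier gets 14 and Lilac gets 6
```

`Decimal` is used instead of floats so that the two shares always add up exactly to the gross amount.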

What You Need

  • A Kubernetes cluster with NVIDIA GPUs
  • kubectl access to the cluster
  • A Lilac supplier account (see Getting Started)
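One way to check the first two prerequisites is to ask the cluster which nodes advertise NVIDIA GPUs. The sketch below assumes the NVIDIA device plugin (or GPU Operator) is installed, so that nodes expose the standard `nvidia.com/gpu` resource; whether Lilac requires that setup is an assumption here. The helper names `count_nvidia_gpus` and `load_nodes_from_kubectl` are hypothetical.

```python
import json
import subprocess


def count_nvidia_gpus(nodes: dict) -> dict[str, int]:
    """Map node name to allocatable nvidia.com/gpu count, skipping nodes with none."""
    counts = {}
    for node in nodes.get("items", []):
        allocatable = node.get("status", {}).get("allocatable", {})
        gpus = int(allocatable.get("nvidia.com/gpu", 0))
        if gpus > 0:
            counts[node["metadata"]["name"]] = gpus
    return counts


def load_nodes_from_kubectl() -> dict:
    """Fetch the node list via kubectl; requires kubectl access to the cluster."""
    result = subprocess.run(
        ["kubectl", "get", "nodes", "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


# Sample data shaped like `kubectl get nodes -o json` output (made-up node names).
sample = {
    "items": [
        {"metadata": {"name": "gpu-node-1"}, "status": {"allocatable": {"nvidia.com/gpu": "8"}}},
        {"metadata": {"name": "cpu-node-1"}, "status": {"allocatable": {"cpu": "32"}}},
    ]
}
print(count_nvidia_gpus(sample))
```

Against a real cluster, `count_nvidia_gpus(load_nodes_from_kubectl())` should return a non-empty mapping if both kubectl access and NVIDIA GPUs are in place.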

Next Steps

Get Started as a Supplier

Create your account, book onboarding, and get set up.