If you operate GPU infrastructure, Lilac lets you monetize idle capacity without changing your existing workloads. When your GPUs aren’t busy, Lilac’s operator automatically picks up inference traffic and routes it to your cluster. When you need the GPUs back, the operator gracefully steps aside.

How It Works

1. You keep your workloads. Your existing jobs always take priority; Lilac only uses GPUs when they are idle.

2. The operator fills the gaps. Our Kubernetes operator detects idle GPUs, spins up inference pods, and serves traffic from Lilac's inference network.

3. You earn per token. Every inference token processed on your hardware earns you revenue: you keep 80% of the gross, and Lilac takes 20%.

4. Preemption is seamless. When your own workloads need GPUs back, the operator gracefully drains inference pods with zero impact on your jobs.
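The fill-and-drain behavior described above can be sketched as a simple reconciliation loop. This is an illustrative sketch only, not Lilac's actual operator code: the `Node` type and `reconcile` function are hypothetical, and a real operator would act through the Kubernetes API rather than on an in-memory object.

```python
# Illustrative sketch of the idle-capacity loop; all names here are
# hypothetical, not part of Lilac's real operator.
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    gpus_total: int
    gpus_used_by_owner: int              # GPUs claimed by your own workloads
    inference_pods: list = field(default_factory=list)

def reconcile(node: Node) -> str:
    """One pass of the loop: fill idle GPUs with inference pods,
    and drain them as soon as the owner needs capacity back."""
    idle = node.gpus_total - node.gpus_used_by_owner
    if len(node.inference_pods) > idle:
        # Owner workloads always win: drain pods down to idle capacity.
        del node.inference_pods[max(idle, 0):]
        return "drained"
    if len(node.inference_pods) < idle:
        # Spare GPUs available: start inference pods to fill the gap.
        node.inference_pods.extend(
            f"inference-{i}" for i in range(len(node.inference_pods), idle)
        )
        return "scaled-up"
    return "steady"
```

The key property the sketch captures is that draining is unconditional: whenever owner usage rises, inference pods are removed before anything else happens.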

Revenue Model

Your share: 80% of gross token revenue
Lilac's share: 20%
Payout basis: Per token processed on your hardware
Tracking: Real-time in the supplier dashboard
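The 80/20 split above translates directly into supplier revenue per token. Here is a minimal worked example; the 80% share comes from the table, but the token volume and per-million-token price are made-up placeholders, since actual pricing is not stated here.

```python
SUPPLIER_SHARE = 0.80  # your share of gross token revenue, per the table

def supplier_payout(tokens: int, gross_price_per_million: float) -> float:
    """Supplier revenue for a batch of tokens at a given gross price
    (price expressed per million tokens)."""
    gross = tokens / 1_000_000 * gross_price_per_million
    return round(gross * SUPPLIER_SHARE, 2)

# Hypothetical numbers: 50M tokens at $0.50 gross per million tokens
# gives $25.00 gross, of which the supplier keeps 80%.
print(supplier_payout(50_000_000, 0.50))  # 20.0
```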

What You Need

  • A Kubernetes cluster with NVIDIA GPUs
  • kubectl access to the cluster
  • A Lilac supplier account (see Getting Started)
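One way to sanity-check the GPU prerequisite is to count the GPUs your cluster advertises. This sketch assumes the NVIDIA device plugin is installed, which exposes GPUs as the `nvidia.com/gpu` allocatable resource on each node; it parses the JSON that `kubectl get nodes -o json` produces, shown here with an inline sample.

```python
import json

def total_allocatable_gpus(nodes_json: str) -> int:
    """Sum nvidia.com/gpu allocatable across all nodes in the output
    of `kubectl get nodes -o json`."""
    nodes = json.loads(nodes_json)["items"]
    return sum(
        int(n["status"].get("allocatable", {}).get("nvidia.com/gpu", 0))
        for n in nodes
    )

# Sample payload standing in for real `kubectl get nodes -o json` output:
sample = json.dumps({"items": [
    {"status": {"allocatable": {"cpu": "32", "nvidia.com/gpu": "4"}}},
    {"status": {"allocatable": {"cpu": "16"}}},  # CPU-only node
]})
print(total_allocatable_gpus(sample))  # 4
```

A result of zero usually means the device plugin is not running, even if the nodes physically have GPUs.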

Next Steps

Get Started as a Supplier

Create your account, book onboarding, and get set up.