Cluster Monitoring

Monitor your cluster’s connection status and workloads using the Lilac dashboard and Kubernetes tools.

Dashboard

The Clusters section of the Lilac console shows each cluster's connection status:

Status	Meaning
Connected	Operator is syncing normally with the control plane
Degraded	Sync has failed recently — operator will retry on next cycle
Draining	Disconnected for 10+ minutes — inference pods are being gracefully removed

Detailed usage statistics (tokens processed, revenue earned) are sent in your monthly report. See Revenue & Payouts for details.

List GPU pool resources:

kubectl get gpupool -n lilac-system

Follow the operator logs:

kubectl logs -n lilac-system deploy/lilac-gpu-operator -f

Key log events to watch:

Event	Meaning
`control plane sync successful`	Normal sync completed
`workload created`	New inference pod deployed
`preemption triggered`	GPUs being reclaimed for your workloads
`workload drained`	Inference pod gracefully removed
`sync failed`	Control plane unreachable — will retry
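When the log stream is busy, it can help to show only the events that usually need attention. The sketch below filters on three of the event strings from the table above. The helper name `lilac_attention_events` is hypothetical, and the rest of each log line's format is not assumed: the filter only looks for the event text anywhere in the line.

```shell
# Hypothetical helper: keep only operator log lines that mention
# a failed sync, a preemption, or a drained workload.
# Reads log text on stdin, so it works with `kubectl logs` or a saved file.
lilac_attention_events() {
  grep -E 'sync failed|preemption triggered|workload drained'
}

# Example (requires cluster access): scan the last hour of operator logs.
#   kubectl logs -n lilac-system deploy/lilac-gpu-operator --since=1h | lilac_attention_events
```

Because the helper is a thin wrapper around `grep`, it exits with status 1 when no line matches, which a script can treat as "nothing to report".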

List the pods managed by Lilac:

kubectl get pods -n lilac-system -l app.kubernetes.io/managed-by=lilac
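To narrow the pod listing to pods that may need a closer look, the same label selector can be combined with a standard Kubernetes field selector on the pod phase. This is a minimal sketch; the helper name `lilac_pods_not_running` is hypothetical.

```shell
# Hypothetical helper: list Lilac-managed pods whose phase is not Running
# (for example Pending, Succeeded, or Failed).
lilac_pods_not_running() {
  kubectl get pods -n lilac-system \
    -l app.kubernetes.io/managed-by=lilac \
    --field-selector='status.phase!=Running' \
    -o wide
}

# Example (requires cluster access):
#   lilac_pods_not_running
```

The filter works on the pod phase only. A pod whose containers are crash-looping still reports the Running phase, so an empty result does not by itself prove every pod is healthy.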

The operator emits Kubernetes events for key state transitions:

Event	Description
`PoolCleanedUp`	All managed workloads deleted from pool
`ControlPlaneDegraded`	Control plane stopped responding
`ControlPlaneDisconnected`	Disconnect timeout elapsed, draining workloads
`WorkloadPreempted`	Workload evicted after grace period
`WorkloadDraining`	Draining began (includes reason and grace period)

View events:

kubectl get events -n lilac-system --sort-by='.lastTimestamp'
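To follow a single transition, the event listing can be filtered by reason. The sketch below assumes the operator records the names in the table above in each event's standard `reason` field, which is the usual Kubernetes convention but is not confirmed here. The helper name `lilac_events_by_reason` is hypothetical.

```shell
# Hypothetical helper: show only events with a given reason, oldest first.
# Assumption: the operator's event names are stored in the event `reason` field.
# $1 is the reason to match, e.g. WorkloadPreempted or ControlPlaneDegraded.
lilac_events_by_reason() {
  kubectl get events -n lilac-system \
    --field-selector "reason=$1" \
    --sort-by='.lastTimestamp'
}

# Example (requires cluster access):
#   lilac_events_by_reason WorkloadDraining
```

If the filtered listing comes back empty while the unfiltered one shows the event, the assumption about the `reason` field does not hold for this operator, and `kubectl get events -n lilac-system -o yaml` will show where the name is actually stored.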