Available Models

Lilac currently supports the following models. We’re actively adding more — reach out if there’s a model you’d like to see.
| Model | Model ID | Context Length | Input Price | Output Price |
|---|---|---|---|---|
| Kimi K2.5 | moonshotai/kimi-k2.5 | 262,144 tokens | $0.40 / M tokens | $2.00 / M tokens |
| GLM 5.1 | zai-org/glm-5.1 | 202,800 tokens | $0.90 / M tokens | $3.00 / M tokens |
| Gemma 4 (coming soon) | google/gemma-4 | | $0.13 / M tokens | $0.38 / M tokens |
Request additional models by emailing contact@getlilac.com.
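Per-request cost follows directly from the table above: tokens divided by one million, times the listed rate. The token counts below are made-up example numbers, purely for illustration:

```python
# Back-of-envelope cost of one Kimi K2.5 call at the listed rates
# ($0.40 / M input tokens, $2.00 / M output tokens).
input_tokens = 12_000   # example prompt size (illustrative)
output_tokens = 800     # example completion size (illustrative)

cost = input_tokens / 1e6 * 0.40 + output_tokens / 1e6 * 2.00
print(f"${cost:.4f}")  # ≈ $0.0064
```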

Kimi K2.5

Moonshot AI’s flagship multimodal reasoning model. 1T total parameters (32B activated) with a Mixture-of-Experts architecture.

Kimi K2.5 on Hugging Face

Model card, benchmarks, and deployment guides.

Capabilities

| Capability | Status | Details |
|---|---|---|
| Text input | Supported | Chat, instructions, system prompts |
| Image input | Supported | Native multimodal; pass images via image_url in messages |
| Text output | Supported | Completions, structured JSON, tool calls |
| Reasoning (thinking) | On by default | Chain-of-thought returned in the reasoning field. Disable with chat_template_kwargs: {"thinking": false} |
| Tool calling | Supported | Function definitions with automatic argument extraction |
| Structured output | Supported | response_format with json_object or json_schema |
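Structured output constrains the reply to a JSON shape you define. The example below assumes the OpenAI-style response_format parameter named in the table; the schema itself (name and fields) is illustrative, not part of the Lilac docs:

```python
import json

# An OpenAI-style json_schema payload for Kimi K2.5. The "image_caption"
# schema is a made-up example.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "image_caption",
        "schema": {
            "type": "object",
            "properties": {
                "caption": {"type": "string"},
                "tags": {"type": "array", "items": {"type": "string"}},
            },
            "required": ["caption", "tags"],
        },
    },
}

# response = client.chat.completions.create(
#     model="moonshotai/kimi-k2.5",
#     messages=[{"role": "user", "content": "Caption: a cat on a mat."}],
#     response_format=response_format,
# )
# parsed = json.loads(response.choices[0].message.content)
```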
Recommended sampling parameters, from the Kimi K2.5 model card:

| Mode | Temperature | Top P |
|---|---|---|
| Thinking (default) | 1.0 | 0.95 |
| Instant (thinking off) | 0.6 | 0.95 |
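Putting the two together, here is a sketch of an instant-mode request: thinking disabled via chat_template_kwargs (passed through extra_body, assuming the OpenAI Python client), with the recommended sampling values for that mode. The exact passthrough mechanism may vary by client version:

```python
# Request arguments for Kimi K2.5 in instant mode: thinking disabled,
# sampling set to the recommended values for that mode.
instant_kwargs = {
    "model": "moonshotai/kimi-k2.5",
    "temperature": 0.6,  # recommended for instant mode
    "top_p": 0.95,
    "extra_body": {"chat_template_kwargs": {"thinking": False}},
}

# response = client.chat.completions.create(
#     messages=[{"role": "user", "content": "Summarize this in one line."}],
#     **instant_kwargs,
# )
```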

Vision

Kimi K2.5 natively supports image inputs. Pass images as base64 data URIs or URLs in the content array:
```python
response = client.chat.completions.create(
    model="moonshotai/kimi-k2.5",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://example.com/image.jpg",
                        "detail": "auto"
                    }
                }
            ]
        }
    ],
)
```
You can also pass base64-encoded images:
```python
import base64

with open("image.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="moonshotai/kimi-k2.5",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{image_b64}"}
                }
            ]
        }
    ],
)
```
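If you send many local images, a small helper keeps message payloads tidy. This is a convenience sketch, not part of any Lilac SDK; it guesses the MIME type from the file extension:

```python
import base64
import mimetypes

def image_part(path: str, detail: str = "auto") -> dict:
    """Build an image_url content part from a local file as a data URI."""
    mime = mimetypes.guess_type(path)[0] or "application/octet-stream"
    with open(path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime};base64,{b64}", "detail": detail},
    }

# Usage:
# messages = [{"role": "user", "content": [
#     {"type": "text", "text": "Compare these."},
#     image_part("before.png"),
#     image_part("after.png"),
# ]}]
```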

GLM 5.1

Z.ai’s next-generation flagship model for agentic engineering. 754B total parameters in a Mixture-of-Experts architecture, with state-of-the-art coding capabilities — it holds up over long-horizon tasks, handles ambiguous problems well, and sustains hundreds of tool calls per run. 202.8K context window, 131.1K max output. MIT licensed.

GLM 5.1 on Hugging Face

Model card, benchmarks, and deployment guides.

Capabilities

| Capability | Status | Details |
|---|---|---|
| Text input | Supported | Chat, instructions, system prompts |
| Text output | Supported | Completions, structured JSON, tool calls |
| Image input | Not supported | GLM 5.1 is text-only |
| Reasoning (thinking) | On by default | Chain-of-thought returned in the reasoning field. Disable with chat_template_kwargs: {"thinking": false} |
| Tool calling | Supported | Function definitions with automatic argument extraction; strong performance on agentic tasks |
| Structured output | Supported | response_format with json_object |
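Tool definitions use the standard OpenAI function-calling shape, and GLM 5.1 extracts the arguments itself. The weather tool below is a made-up example for illustration, not a Lilac or Z.ai API:

```python
# An OpenAI-style tool definition. "get_weather" is hypothetical.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"},
                },
                "required": ["city"],
            },
        },
    }
]

# response = client.chat.completions.create(
#     model="zai-org/glm-5.1",
#     messages=[{"role": "user", "content": "What's the weather in Oslo?"}],
#     tools=tools,
# )
# tool_call = response.choices[0].message.tool_calls[0]
```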
Recommended sampling parameters, from the Z.ai platform docs:

| Mode | Temperature | Top P |
|---|---|---|
| Thinking (default) | 1.0 | 0.95 |
| Instant (thinking off) | 0.6 | 0.95 |

Example request

```python
response = client.chat.completions.create(
    model="zai-org/glm-5.1",
    messages=[
        {"role": "user", "content": "Write a haiku about idle GPUs."}
    ],
)
```

Listing Models via API

You can list the models available on your account through the OpenAI-compatible endpoint:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.getlilac.com/v1",
    api_key="your-lilac-api-key",
)

models = client.models.list()
for model in models:
    print(model.id)
```