Agent-side cross-model tool. For an LLM endpoint a deployed app calls, see app-llm.

Use query_llm to send a one-off question to a different LLM without switching the main conversation's model.

Key Behaviors

When to Use

Limits

Default Models

OpenAI: gpt-5-mini (cheapest)
Anthropic: claude-haiku-4-5 (cheapest)
Gemini: gemini-2.5-flash (cheapest, 1M context)

Available Gemini Models

gemini-2.5-flash (Gemini 2.5 Flash): $0.15 input / $0.60 output per 1M tokens, 1049K context
gemini-2.5-pro (Gemini 2.5 Pro): $1.25 input / $10 output per 1M tokens, 1049K context
gemini-3-pro-preview (Gemini 3 Pro): $2 input / $12 output per 1M tokens, 200K context
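To compare the Gemini models by cost before picking one, the prices listed above can be turned into a rough per-call estimate. The sketch below is illustrative only: it assumes the two figures for each model are the input and output price per 1M tokens, in that order, and estimate_cost_usd is a hypothetical helper, not part of query_llm.

```python
# Prices copied from the Gemini model list above, in USD per 1M tokens.
# Assumption: the first figure is the input price, the second the output price.
PRICES_PER_1M = {
    "gemini-2.5-flash": (0.15, 0.60),
    "gemini-2.5-pro": (1.25, 10.00),
    "gemini-3-pro-preview": (2.00, 12.00),
}


def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: rough cost of a single call in USD."""
    input_price, output_price = PRICES_PER_1M[model]
    total = input_tokens * input_price + output_tokens * output_price
    return total / 1_000_000


if __name__ == "__main__":
    # Example workload: 10,000 input tokens and 1,000 output tokens per call.
    for model in PRICES_PER_1M:
        cost = estimate_cost_usd(model, 10_000, 1_000)
        print(f"{model}: ${cost:.4f}")
```

Under these assumptions, a call with 10,000 input tokens and 1,000 output tokens costs roughly $0.0021 on gemini-2.5-flash, $0.0225 on gemini-2.5-pro, and $0.032 on gemini-3-pro-preview.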

Tips