{
  "name": "query-llm",
  "title": "query_llm - Cross-Model Single-Shot Queries",
  "description": "Cross-model single-shot queries: send prompts to different LLM models without tool access",
  "guid": "sk_plat_qllm",
  "category": "Agent Tools",
  "requiredTools": [
    "query_llm"
  ],
  "content": "# query_llm - Cross-Model Single-Shot Queries\n\n> Agent-side cross-model tool. For an LLM endpoint a **deployed app** calls, see [app-llm](app-llm.md).\n\nUse `query_llm` to send a one-off question to a different LLM model without switching the main conversation's model.\n\n## Key Behaviors\n- **Single-shot**: No conversation history, no tool access - just prompt in, response out\n- **No tools**: The queried model cannot use any tools (file, code, search, etc.)\n- **Independent**: Does not affect the main conversation's context or model setting\n\n## When to Use\n- Get a second opinion from a different model (\"What does GPT-5 think about this?\")\n- Use a cheaper model for simple tasks (classification, extraction, formatting)\n- Compare model outputs side-by-side\n- Process data that doesn't need tool access\n\n## Limits\n- **Max prompt**: 32,000 chars\n- **Max output**: 4,096 tokens\n- **Timeout**: 60s\n- **Image support**: Can include one workspace image with the query\n- **Cost**: Each query deducts credits (varies by model)\n\n## Default Models\nOpenAI: gpt-5-mini (cheapest). Anthropic: claude-haiku-4-5 (cheapest). Gemini: gemini-2.5-flash (cheapest, 1M context)\n\n## Available Gemini Models\ngemini-2.5-flash (Gemini 2.5 Flash, $0.15/$0.6 per 1M tok, 1049K ctx), gemini-2.5-pro (Gemini 2.5 Pro, $1.25/$10 per 1M tok, 1049K ctx), gemini-3-pro-preview (Gemini 3 Pro, $2/$12 per 1M tok, 200K ctx)\n\n## Tips\n- Great for bulk classification or extraction tasks where tool access isn't needed\n- Use `provider: \"gemini\"` or `model: \"gemini-2.5-flash\"` to query Gemini models\n- If you need tool access or multi-turn reasoning, use the main chat instead"
}
