This is the app service a deployed app calls. For the agent making one-off cross-model queries during a chat, see query-llm.

The LLM service is available for every project with no setup (owner_pays billing by default), so apps can call AI models immediately after deploy.

Use project_settings to change settings (optional):

Billing Modes

Configuration Options

Available Models

claude-sonnet-4-6, claude-opus-4-6, claude-haiku-4-5, gpt-5.2, gpt-5, gpt-5-mini, gpt-5-nano, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, gemini-2.5-flash, gemini-2.5-pro, gemini-3-pro-preview

Endpoints

Request Format

The endpoint accepts OpenAI-compatible messages:

Image Support

Both formats are accepted in message content arrays:

// OpenAI format (image_url with data URI)
{ type: 'image_url', image_url: { url: 'data:image/png;base64,iVBOR...' } }

// Native format
{ type: 'image', data: 'iVBOR...', media_type: 'image/png' }

Only data: URIs are supported; external image URLs return a 400 error.
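Because only data: URIs are accepted, it can help to build image parts through a small helper and check them before sending. The following is a minimal sketch; the function names toImageUrlPart, toNativeImagePart, and isSupportedImagePart are hypothetical helpers for illustration, not part of the service API.

```javascript
// Hypothetical helpers (assumed names, not provided by the service).

// Build an OpenAI-format image part from base64 data and a MIME type.
function toImageUrlPart(base64, mediaType) {
  return { type: 'image_url', image_url: { url: `data:${mediaType};base64,${base64}` } };
}

// Build the equivalent native-format image part.
function toNativeImagePart(base64, mediaType) {
  return { type: 'image', data: base64, media_type: mediaType };
}

// Return true for native image parts and for image_url parts that use a data: URI.
// External URLs (http:, https:) return false, since the service rejects them with a 400.
function isSupportedImagePart(part) {
  if (part.type === 'image') return true;
  if (part.type === 'image_url') {
    const url = part.image_url?.url;
    return typeof url === 'string' && url.startsWith('data:');
  }
  return false;
}
```

Running isSupportedImagePart over each image part before the request turns the server-side 400 into an error you can catch locally.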

Response Format (OpenAI-compatible)

Non-streaming:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "gpt-5-mini",
  "choices": [{ "index": 0, "message": { "role": "assistant", "content": "..." }, "finish_reason": "stop" }],
  "usage": { "prompt_tokens": 100, "completion_tokens": 50, "total_tokens": 150 },
  "provider": "openai",
  "credits_used": 5
}
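To show how the fields of this response might be consumed, here is a minimal sketch that pulls out the answer text, finish reason, token total, and credits. The helper name summarizeCompletion is hypothetical, and treating usage and credits_used as optional is an assumption made for defensiveness, not documented behavior.

```javascript
// Hypothetical helper (assumed name): reduce a non-streaming response to the fields an app usually needs.
function summarizeCompletion(completion) {
  const choice = completion.choices?.[0];
  if (!choice) throw new Error('Response contains no choices');
  return {
    text: choice.message?.content ?? '',
    finishReason: choice.finish_reason,
    // Assumption: fall back to null if these fields are absent.
    totalTokens: completion.usage?.total_tokens ?? null,
    creditsUsed: completion.credits_used ?? null,
  };
}
```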

Streaming (SSE):

Client Code Example (Non-Streaming)

IMPORTANT: The token endpoint is on the API server, NOT the app host. You MUST use the absolute URL https://a.gipity.ai/api/token - never a relative path like /api/token. It is a POST request and the token is nested under data.

// 1. Get app token - MUST be absolute URL to API server, POST with app GUID
const tokenRes = await fetch('https://a.gipity.ai/api/token', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ app: '<PROJECT_GUID>' })
});
const { data: { token } } = await tokenRes.json();
// ✗ WRONG: fetch('/api/token')           - relative URL hits app host, not API
// ✗ WRONG: const { token } = await ...   - token is inside data: { data: { token } }

// 2. Call the LLM
const res = await fetch('https://a.gipity.ai/api/<PROJECT_GUID>/services/llm', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json', 'X-App-Token': token },
  body: JSON.stringify({
    messages: [
      { role: 'system', content: 'Answer concisely.' },
      { role: 'user', content: 'What is the capital of France?' }
    ],
    model: 'gpt-5-mini'
  })
});
const data = await res.json();
const answer = data.choices[0].message.content; // "The capital of France is Paris."
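Since the token is nested under data, a small guard can make the "wrong destructuring" mistake fail loudly instead of sending an undefined token. This is a sketch only; extractToken is a hypothetical helper name, and the error message is illustrative.

```javascript
// Hypothetical helper (assumed name): read the token from the parsed /api/token response body.
// The token lives at body.data.token, not body.token.
function extractToken(body) {
  const token = body?.data?.token;
  if (typeof token !== 'string' || token.length === 0) {
    throw new Error('Token response did not contain data.token');
  }
  return token;
}
```

In the example above, this would replace the destructuring line: const token = extractToken(await tokenRes.json());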

Client Code Example (Streaming)

const res = await fetch('https://a.gipity.ai/api/<PROJECT_GUID>/services/llm', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json', 'X-App-Token': token },
  body: JSON.stringify({
    messages: [{ role: 'user', content: 'Write a story' }],
    stream: true
  })
});
const reader = res.body.getReader();
const decoder = new TextDecoder();
let buffer = '';
let finished = false;
while (!finished) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });
  // SSE events are separated by a blank line; keep any partial event in the buffer
  const events = buffer.split('\n\n');
  buffer = events.pop();
  for (const event of events) {
    if (!event.startsWith('data: ')) continue;
    const raw = event.slice(6);
    // [DONE] ends the stream: stop the read loop too, not just this inner loop
    if (raw === '[DONE]') { finished = true; break; }
    const chunk = JSON.parse(raw);
    const content = chunk.choices?.[0]?.delta?.content;
    // process.stdout is Node.js only; in a browser, append content to the page instead
    if (content) process.stdout.write(content);
    if (chunk.choices?.[0]?.finish_reason === 'stop') {
      console.log('\nUsage:', chunk.usage);
    }
  }
}
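The buffering logic in the loop above can be isolated into a pure function, which makes it easier to test without a network connection. This is a minimal sketch under the same assumptions as the example (events separated by a blank line, each carrying a single "data: " line, and a [DONE] sentinel); parseSseBuffer is a hypothetical helper name.

```javascript
// Hypothetical helper (assumed name): split buffered SSE text into parsed chunks.
// Returns the parsed chunks, the unconsumed remainder, and whether [DONE] was seen.
function parseSseBuffer(buffer) {
  const events = buffer.split('\n\n');
  const rest = events.pop(); // trailing partial event, or '' if the buffer ended on a boundary
  const chunks = [];
  let done = false;
  for (const event of events) {
    if (!event.startsWith('data: ')) continue; // skip anything that is not a data line
    const raw = event.slice(6);
    if (raw === '[DONE]') { done = true; break; }
    chunks.push(JSON.parse(raw));
  }
  return { chunks, rest, done };
}
```

Inside the read loop, the caller would append each decoded value to its buffer, call parseSseBuffer(buffer), handle the returned chunks, and then set buffer to rest.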

Image Description Example

const res = await fetch('https://a.gipity.ai/api/<PROJECT_GUID>/services/llm', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json', 'X-App-Token': token },
  body: JSON.stringify({
    messages: [{
      role: 'user',
      content: [
        { type: 'text', text: 'Describe this image' },
        { type: 'image_url', image_url: { url: 'data:image/png;base64,iVBOR...' } }
      ]
    }]
  })
});
const data = await res.json();
const description = data.choices[0].message.content;

Limits

Testing

The LLM service is tested end-to-end: an E2E test asks the agent to build an app that calls the LLM, deploys it, then verifies the page renders the correct AI response in a headless browser.