Three audio services are available to every project with no setup required (owner_pays by default).
Use project_settings to customize each independently (optional).
Sound Effects
Generate sound effects from text descriptions using ElevenLabs.
Endpoint
POST /api/<PROJECT_GUID>/services/sound
Request
{
"text": "thunder rumbling in the distance",
"duration_seconds": 5,
"prompt_influence": 0.5
}
- text (required): Description of the sound, max 1,000 chars
- duration_seconds (optional): 0.5-30; the provider decides if omitted
- prompt_influence (optional): 0-1, how closely to follow the prompt
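The documented limits can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: buildSoundRequest is a hypothetical helper name, and it assumes the 1,000-character limit can be approximated with JavaScript's string length.

```javascript
// Hypothetical helper (not part of the API): builds a /services/sound request
// body and enforces the documented limits before any network call is made.
function buildSoundRequest(text, options = {}) {
  if (typeof text !== 'string' || text.trim() === '') {
    throw new Error('text is required');
  }
  // Assumption: the 1,000-char limit is approximated by String.length.
  if (text.length > 1000) {
    throw new RangeError('text must be at most 1,000 characters');
  }

  const body = { text };
  const { duration_seconds, prompt_influence } = options;

  // Optional: 0.5-30 seconds; left out entirely so the provider can decide.
  if (duration_seconds !== undefined) {
    if (!Number.isFinite(duration_seconds) || duration_seconds < 0.5 || duration_seconds > 30) {
      throw new RangeError('duration_seconds must be between 0.5 and 30');
    }
    body.duration_seconds = duration_seconds;
  }

  // Optional: 0-1, how closely the result follows the prompt.
  if (prompt_influence !== undefined) {
    if (!Number.isFinite(prompt_influence) || prompt_influence < 0 || prompt_influence > 1) {
      throw new RangeError('prompt_influence must be between 0 and 1');
    }
    body.prompt_influence = prompt_influence;
  }

  return body;
}
```

The returned object can be passed to JSON.stringify as the request body. Failing early avoids spending a request from the rate-limit window on input the service would presumably reject.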
Response
{
"url": "https://media.gipity.ai/med_abc12345.mp3",
"duration_seconds": 5,
"credits_used": 2
}
Music Generation
Generate music from text prompts using ElevenLabs.
Endpoint
POST /api/<PROJECT_GUID>/services/music
Request
{
"prompt": "upbeat lo-fi hip hop beat with piano and soft drums",
"duration_seconds": 30,
"instrumental": true
}
- prompt (required): Music description, max 2,000 chars
- duration_seconds (optional): 3-600, default ~30s
- instrumental (optional): true to force no vocals
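For form-style UIs it can be more convenient to collect every problem with a music request at once rather than stop at the first one. The following is a sketch under that assumption; validateMusicRequest is a hypothetical name, and the checks simply mirror the parameter limits listed above.

```javascript
// Hypothetical helper (not part of the API): returns a list of human-readable
// problems for a /services/music request body. An empty list means the body
// satisfies the documented limits.
function validateMusicRequest(request) {
  const problems = [];
  const { prompt, duration_seconds, instrumental } = request;

  if (typeof prompt !== 'string' || prompt.trim() === '') {
    problems.push('prompt is required');
  } else if (prompt.length > 2000) {
    // Assumption: the 2,000-char limit is approximated by String.length.
    problems.push('prompt must be at most 2,000 characters');
  }

  if (duration_seconds !== undefined &&
      (!Number.isFinite(duration_seconds) || duration_seconds < 3 || duration_seconds > 600)) {
    problems.push('duration_seconds must be between 3 and 600');
  }

  if (instrumental !== undefined && typeof instrumental !== 'boolean') {
    problems.push('instrumental must be true or false');
  }

  return problems;
}
```

A caller might display the returned messages next to the form fields and send the request only when the list is empty.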
Response
{
"url": "https://media.gipity.ai/med_abc12345.mp3",
"duration_seconds": 30,
"credits_used": 3
}
Audio Transcription
Transcribe audio files to text. Supports ElevenLabs (Scribe v2) and OpenAI (GPT-4o Transcribe).
Endpoint
POST /api/<PROJECT_GUID>/services/transcribe (multipart/form-data)
Request
Send as multipart/form-data:
- audio (required): Audio file (MP3, WAV, M4A, etc.), max 100MB
- provider (optional): "elevenlabs" (default) or "openai"
- language (optional): Language code (e.g., "en", "es"); auto-detected if omitted
- diarize (optional): "true" to identify speakers
Response
{
"text": "Hello, this is a transcription test.",
"words": [{ "text": "Hello", "start": 0.0, "end": 0.5, "type": "word" }],
"language": "en",
"duration_seconds": 12.5,
"provider": "elevenlabs",
"credits_used": 5
}
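The per-word start and end times in the response make it possible to build timed captions. The following is a rough sketch of one way to do that; wordsToCues and its maxSeconds parameter are hypothetical, and it assumes that entries whose type is not "word" carry no caption text and can be skipped.

```javascript
// Hypothetical helper (not part of the API): groups word timings from a
// transcription response into caption cues spanning at most maxSeconds each.
function wordsToCues(words, maxSeconds = 4) {
  const cues = [];
  let current = null;

  for (const w of words) {
    // Assumption: only entries with type "word" contribute caption text.
    if (w.type !== 'word') continue;

    // Close the current cue if adding this word would make it too long.
    if (current && w.end - current.start > maxSeconds) {
      cues.push(current);
      current = null;
    }

    if (!current) {
      current = { start: w.start, end: w.end, text: w.text };
    } else {
      current.end = w.end;
      current.text += ' ' + w.text;
    }
  }

  if (current) cues.push(current);
  return cues;
}
```

Each cue has start, end, and text fields, which could be formatted as SRT or WebVTT entries, for example with wordsToCues(transcript.words).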
Client Code Examples
// Get token first
const tokenRes = await fetch('https://a.gipity.ai/api/token', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ app: '<PROJECT_GUID>' })
});
const { data: { token } } = await tokenRes.json();
// Sound effect
const soundRes = await fetch('https://a.gipity.ai/api/<PROJECT_GUID>/services/sound', {
method: 'POST',
headers: { 'Content-Type': 'application/json', 'X-App-Token': token },
body: JSON.stringify({ text: 'door creaking open slowly' })
});
const sound = await soundRes.json();
new Audio(sound.url).play();
// Music
const musicRes = await fetch('https://a.gipity.ai/api/<PROJECT_GUID>/services/music', {
method: 'POST',
headers: { 'Content-Type': 'application/json', 'X-App-Token': token },
body: JSON.stringify({ prompt: 'calm ambient piano', duration_seconds: 60 })
});
const music = await musicRes.json();
new Audio(music.url).play();
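// Transcription (fileInput is an <input type="file"> element on the page).
// Content-Type is deliberately not set: the browser adds the multipart boundary.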
const formData = new FormData();
formData.append('audio', fileInput.files[0]);
formData.append('diarize', 'true');
const transRes = await fetch('https://a.gipity.ai/api/<PROJECT_GUID>/services/transcribe', {
method: 'POST',
headers: { 'X-App-Token': token },
body: formData
});
const transcript = await transRes.json();
console.log(transcript.text);
Limits
- Rate limit: 600 requests per 5-minute window (per IP, all audio endpoints)
- Sound text: max 1,000 chars, duration 0.5-30s, timeout 60s
- Music prompt: max 2,000 chars, duration 3-600s, timeout 120s
- Transcription: max 100MB file, timeout 120s
- Standard RateLimit-* headers are included in responses
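A client can read the rate-limit headers to decide whether to pause before its next call. The sketch below is illustrative only: the source does not name the individual headers, so RateLimit-Limit, RateLimit-Remaining, and RateLimit-Reset (with the reset value taken as seconds until the window resets) are assumptions based on the common IETF draft convention, and both function names are hypothetical.

```javascript
// Hypothetical helper: reads rate-limit information from a response's headers.
// Accepts anything with a get(name) method, such as the Headers object
// exposed by fetch responses.
function readRateLimit(headers) {
  const toNumber = (name) => {
    const raw = headers.get(name);
    if (raw === null || raw === undefined) return null;
    const n = Number.parseInt(raw, 10);
    return Number.isNaN(n) ? null : n;
  };
  return {
    limit: toNumber('RateLimit-Limit'),          // assumed header name
    remaining: toNumber('RateLimit-Remaining'),  // assumed header name
    resetSeconds: toNumber('RateLimit-Reset')    // assumed: seconds until reset
  };
}

// Hypothetical helper: how long to wait (in ms) before the next request.
function waitMsBeforeNextRequest(info) {
  // Unknown or non-zero remaining quota: no need to wait.
  if (info.remaining === null || info.remaining > 0) return 0;
  // Fall back to the full documented 5-minute window if no reset value is given.
  const seconds = info.resetSeconds ?? 300;
  return seconds * 1000;
}
```

With the client examples above, this could be used as waitMsBeforeNextRequest(readRateLimit(soundRes.headers)) before issuing the next audio request.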