Video Generation

Generate AI videos, talking avatar videos, face swaps, localizations, and video edits.

POST /videos/talking-avatar

Generate a talking avatar video with lip-sync. The avatar speaks a script using a selected voice, and the resulting video includes realistic lip movements synchronized to the audio.

You must provide either script + voice_id (to generate a new voiceover on the fly) or voiceover_id (to reuse an existing voiceover). If both are provided, voiceover_id takes precedence.

Credit cost: 20 credits per 250 characters of script (minimum 20, maximum 160).
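The credit rule above can be estimated client-side before submitting a request. The following is a minimal sketch, not an official calculator: the function name `talking_avatar_credits` is hypothetical, and it assumes that a partial 250-character block is billed as a full block (the rounding behavior is not stated above).

```python
import math


def talking_avatar_credits(script: str) -> int:
    """Estimate the credit cost of a talking-avatar script.

    Assumption: partial 250-character blocks are rounded up.
    """
    blocks = math.ceil(len(script) / 250)
    # Clamp to the documented minimum (20) and maximum (160) cost.
    return min(max(blocks * 20, 20), 160)


if __name__ == "__main__":
    sample = "Welcome to Adsumo. Create stunning AI video ads in minutes."
    print(talking_avatar_credits(sample))
```

Under this assumption the cost reaches the 160-credit cap at 2,000 characters. Whether longer scripts are accepted or rejected is not specified here.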

Request Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| avatar_id | string | Yes | UUID of the avatar to use. |
| avatar_type | string | Yes | One of "platform", "custom", or "product". |
| script | string | Conditional | The text the avatar will speak. Required if voiceover_id is not provided. |
| voice_id | string | Conditional | ElevenLabs voice ID. Required if voiceover_id is not provided. |
| voiceover_id | string | Conditional | UUID of an existing voiceover. Required if script and voice_id are not provided. |
| lipsync_model | string | No | Lip-sync model to use. Default: "heygen-avatar4". Options: "heygen-avatar4", "lipsync-2.0". |
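The conditional fields for this endpoint can be checked before the request is sent. The sketch below is one possible client-side helper. The name `build_talking_avatar_payload` is hypothetical and not part of the API. Because voiceover_id takes precedence when both options are supplied, the helper omits script and voice_id in that case; that is a design choice of this sketch, not documented server behavior.

```python
from typing import Optional

AVATAR_TYPES = {"platform", "custom", "product"}
LIPSYNC_MODELS = {"heygen-avatar4", "lipsync-2.0"}


def build_talking_avatar_payload(
    avatar_id: str,
    avatar_type: str,
    script: Optional[str] = None,
    voice_id: Optional[str] = None,
    voiceover_id: Optional[str] = None,
    lipsync_model: Optional[str] = None,
) -> dict:
    """Hypothetical helper that builds a /videos/talking-avatar request body."""
    if avatar_type not in AVATAR_TYPES:
        raise ValueError(f"avatar_type must be one of {sorted(AVATAR_TYPES)}")
    payload = {"avatar_id": avatar_id, "avatar_type": avatar_type}
    if voiceover_id:
        # voiceover_id takes precedence, so script and voice_id are left out.
        payload["voiceover_id"] = voiceover_id
    elif script and voice_id:
        payload["script"] = script
        payload["voice_id"] = voice_id
    else:
        raise ValueError("provide script and voice_id, or voiceover_id")
    if lipsync_model is not None:
        if lipsync_model not in LIPSYNC_MODELS:
            raise ValueError(f"lipsync_model must be one of {sorted(LIPSYNC_MODELS)}")
        payload["lipsync_model"] = lipsync_model
    return payload
```

The returned dictionary can be serialized with `json.dumps` and sent as the request body.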

Response

JSON
{
  "generation_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "type": "talking-avatar",
  "status": "processing"
}

Examples

curl -X POST https://www.adsumo.ai/api/v1/videos/talking-avatar \
  -H "Authorization: Bearer adsumo_sk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "avatar_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
    "avatar_type": "platform",
    "script": "Welcome to Adsumo. Create stunning AI video ads in minutes.",
    "voice_id": "EXAVITQu4vr4xnSDxMaL"
  }'

POST /videos/ai-video

Generate an AI video from a text prompt using one of several supported models. Each model has different capabilities, supported aspect ratios, and duration options.

Request Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | One of "sora-2", "sora-2-pro", "veo-3.1", "veo-3.1-fast", "kling-2.6". |
| prompt | string | Yes | Text description of the video to generate. |
| aspect_ratio | string | Yes | Aspect ratio of the output video (e.g. "16:9", "9:16"). |
| duration | number | Yes | Duration in seconds. Valid values depend on the model. |
| resolution | string | No | Output resolution, for example "720p", "1080p", or "4k". |
| reference_image_url | string | No | URL of a reference image to guide generation. |
| hd | boolean | No | Enable HD output. Applicable to Sora models only. |
| generate_audio | boolean | No | Generate an audio track alongside the video. |
| first_frame_image_url | string | No | URL of an image to use as the first frame. Applicable to Veo models only. |
| last_frame_image_url | string | No | URL of an image to use as the last frame. Applicable to Veo models only. |

Model Constraints

| Model | Aspect Ratios | Durations (seconds) | Notes |
| --- | --- | --- | --- |
| sora-2 | 16:9, 9:16 | 4, 8, 12, 16, 20 | |
| sora-2-pro | 16:9, 9:16 | 4, 8, 12, 16, 20 | Higher quality output |
| veo-3.1 | any | 4, 6, 8 | Supports first_frame_image_url and last_frame_image_url |
| veo-3.1-fast | any | 4, 6, 8 | Faster generation, lower quality |
| kling-2.6 | any | 5, 10 | |
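The constraints above lend themselves to a pre-flight check that catches invalid combinations before any request is made. The following sketch encodes the table as a dictionary. The names `MODEL_CONSTRAINTS` and `validate_ai_video_request` are hypothetical. It also assumes that sending hd or the frame-image fields to a model they do not apply to should be reported as a problem; the API may simply ignore them.

```python
from typing import List

# Transcribed from the Model Constraints table; None means any aspect ratio.
MODEL_CONSTRAINTS = {
    "sora-2": {"aspect_ratios": {"16:9", "9:16"}, "durations": {4, 8, 12, 16, 20}},
    "sora-2-pro": {"aspect_ratios": {"16:9", "9:16"}, "durations": {4, 8, 12, 16, 20}},
    "veo-3.1": {"aspect_ratios": None, "durations": {4, 6, 8}},
    "veo-3.1-fast": {"aspect_ratios": None, "durations": {4, 6, 8}},
    "kling-2.6": {"aspect_ratios": None, "durations": {5, 10}},
}


def validate_ai_video_request(payload: dict) -> List[str]:
    """Return a list of problems found in a /videos/ai-video request body."""
    model = payload.get("model")
    rules = MODEL_CONSTRAINTS.get(model)
    if rules is None:
        return [f"unknown model: {model!r}"]
    errors = []
    for field in ("prompt", "aspect_ratio", "duration"):
        if field not in payload:
            errors.append(f"missing required field: {field}")
    allowed = rules["aspect_ratios"]
    if allowed is not None and "aspect_ratio" in payload:
        if payload["aspect_ratio"] not in allowed:
            errors.append(f"{model} supports aspect ratios {sorted(allowed)}")
    if "duration" in payload and payload["duration"] not in rules["durations"]:
        errors.append(f"{model} supports durations {sorted(rules['durations'])}")
    # Assumption: model-specific options on other models count as errors.
    if "hd" in payload and not model.startswith("sora"):
        errors.append("hd applies to Sora models only")
    for key in ("first_frame_image_url", "last_frame_image_url"):
        if key in payload and not model.startswith("veo"):
            errors.append(f"{key} applies to Veo models only")
    return errors
```

An empty list means the payload passed every check in this sketch. It does not guarantee that the server will accept the request.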

Response

JSON
{
  "generation_id": "b2c3d4e5-f6a7-8901-bcde-f12345678901",
  "type": "ai-video",
  "status": "processing"
}

Examples

curl -X POST https://www.adsumo.ai/api/v1/videos/ai-video \
  -H "Authorization: Bearer adsumo_sk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "sora-2",
    "prompt": "A sleek product bottle rotating on a marble pedestal with soft studio lighting",
    "aspect_ratio": "16:9",
    "duration": 8,
    "hd": true
  }'

POST /videos/swap-avatar

Replace the face in an existing video with an avatar or a custom face image. You must provide either avatar_id + avatar_type or image_url to specify the replacement face.

Credit cost: 1 credit per second of video in standard mode, 2 credits per second in pro mode.
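As a rough illustration of the per-second pricing, the sketch below estimates the cost of a face swap. The function name `swap_avatar_credits` is hypothetical, and rounding fractional seconds up to the next whole second is an assumption; the billing granularity is not stated above.

```python
import math

# Credits per second of video for each mode, as documented above.
SWAP_RATES = {"standard": 1, "pro": 2}


def swap_avatar_credits(duration_seconds: float, mode: str) -> int:
    """Estimate the credit cost of a swap-avatar request."""
    if mode not in SWAP_RATES:
        raise ValueError(f"mode must be one of {sorted(SWAP_RATES)}")
    # Assumption: fractional seconds are billed as a full second.
    return math.ceil(duration_seconds) * SWAP_RATES[mode]
```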

Request Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| video_url | string | Yes | URL of the source video. |
| mode | string | Yes | One of "standard" or "pro". Pro mode produces higher fidelity face replacement. |
| duration | number | Yes | Duration of the video in seconds (1 to 30). |
| width | number | Yes | Width of the video in pixels. |
| height | number | Yes | Height of the video in pixels. |
| avatar_id | string | Conditional | UUID of the avatar to use as the replacement face. Required if image_url is not provided. |
| avatar_type | string | Conditional | One of "custom", "platform", or "product". Required if avatar_id is provided. |
| image_url | string | Conditional | Direct URL to a face image. Required if avatar_id is not provided. |
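A request body for this endpoint could be assembled along the following lines. This is a sketch under stated assumptions: `build_swap_avatar_payload` is a hypothetical helper, and because no precedence between avatar_id and image_url is documented, it rejects requests that supply both instead of guessing which one the server would use.

```python
from typing import Optional

AVATAR_TYPES = {"custom", "platform", "product"}


def build_swap_avatar_payload(
    video_url: str,
    mode: str,
    duration: float,
    width: int,
    height: int,
    avatar_id: Optional[str] = None,
    avatar_type: Optional[str] = None,
    image_url: Optional[str] = None,
) -> dict:
    """Hypothetical helper that builds a /videos/swap-avatar request body."""
    if mode not in {"standard", "pro"}:
        raise ValueError('mode must be "standard" or "pro"')
    if not 1 <= duration <= 30:
        raise ValueError("duration must be between 1 and 30 seconds")
    if avatar_id and image_url:
        # Assumption: precedence is undocumented, so refuse ambiguous input.
        raise ValueError("provide avatar_id or image_url, not both")
    payload = {
        "video_url": video_url,
        "mode": mode,
        "duration": duration,
        "width": width,
        "height": height,
    }
    if avatar_id:
        if avatar_type not in AVATAR_TYPES:
            raise ValueError("avatar_type is required when avatar_id is provided")
        payload["avatar_id"] = avatar_id
        payload["avatar_type"] = avatar_type
    elif image_url:
        payload["image_url"] = image_url
    else:
        raise ValueError("provide avatar_id with avatar_type, or image_url")
    return payload
```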

Response

JSON
{
  "generation_id": "c3d4e5f6-a7b8-9012-cdef-123456789012",
  "type": "swap-avatar",
  "status": "processing"
}

POST /videos/localize

Translate and dub a video into another language. The audio is re-generated in the target language and lip movements are adjusted to match.

Request Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| video_url | string | Yes | URL of the source video. |
| mode | string | Yes | One of "speed" or "precision". Speed mode is faster; precision mode produces higher quality dubbing. |
| output_language | string | Yes | Target language code (e.g. "es", "fr", "de", "ja"). |
| duration | number | Yes | Duration of the video in seconds (1 to 120). |
| width | number | Yes | Width of the video in pixels. |
| height | number | Yes | Height of the video in pixels. |
| translate_audio_only | boolean | No | If true, only the audio track is translated; lip movements are not adjusted. |
| speaker_num | number | No | Number of speakers in the video. Helps improve speaker diarization for multi-speaker content. |
| enable_dynamic_duration | boolean | No | If true, allows the output video duration to differ from the input to better accommodate the translated speech. |
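The fields above might be assembled as in the sketch below. `build_localize_payload` is a hypothetical helper. It checks only the constraints documented in the table (the mode values and the 1 to 120 second range) and leaves out optional fields that were not set, on the assumption that the server then applies its own defaults.

```python
from typing import Optional


def build_localize_payload(
    video_url: str,
    mode: str,
    output_language: str,
    duration: float,
    width: int,
    height: int,
    translate_audio_only: Optional[bool] = None,
    speaker_num: Optional[int] = None,
    enable_dynamic_duration: Optional[bool] = None,
) -> dict:
    """Hypothetical helper that builds a /videos/localize request body."""
    if mode not in {"speed", "precision"}:
        raise ValueError('mode must be "speed" or "precision"')
    if not 1 <= duration <= 120:
        raise ValueError("duration must be between 1 and 120 seconds")
    payload = {
        "video_url": video_url,
        "mode": mode,
        "output_language": output_language,
        "duration": duration,
        "width": width,
        "height": height,
    }
    optional = {
        "translate_audio_only": translate_audio_only,
        "speaker_num": speaker_num,
        "enable_dynamic_duration": enable_dynamic_duration,
    }
    # Send only the optional fields the caller actually set.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```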

Response

JSON
{
  "generation_id": "d4e5f6a7-b8c9-0123-defa-234567890123",
  "type": "localize",
  "status": "processing"
}

POST /videos/edit

Add captions and visual effects to a video using an AI caption template. You must provide either video_url (for any video) or generation_id (to edit a previously generated AI video).

Credit cost: 2 credits.

Request Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| template_id | string | Yes | ID of the caption template to apply. |
| video_url | string | Conditional | URL of the video to edit. Required if generation_id is not provided. |
| generation_id | string | Conditional | UUID of a completed AI video generation. Required if video_url is not provided. |
| language | string | No | Language code for caption generation. Default: "en". |
| auto_approve | boolean | No | Automatically approve the generated captions. Default: true. |
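The either/or rule for the video source can be enforced client-side. The sketch below uses a hypothetical helper, `build_edit_payload`. It assumes exactly one of video_url or generation_id should be sent, since behavior when both are present is not documented here. The language and auto_approve fields are included only when set, so the documented defaults apply otherwise.

```python
from typing import Optional


def build_edit_payload(
    template_id: str,
    video_url: Optional[str] = None,
    generation_id: Optional[str] = None,
    language: Optional[str] = None,
    auto_approve: Optional[bool] = None,
) -> dict:
    """Hypothetical helper that builds a /videos/edit request body."""
    # Assumption: exactly one video source is allowed per request.
    if bool(video_url) == bool(generation_id):
        raise ValueError("provide exactly one of video_url or generation_id")
    payload = {"template_id": template_id}
    if video_url:
        payload["video_url"] = video_url
    else:
        payload["generation_id"] = generation_id
    if language is not None:
        payload["language"] = language
    if auto_approve is not None:
        payload["auto_approve"] = auto_approve
    return payload
```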

Response

JSON
{
  "generation_id": "e5f6a7b8-c9d0-1234-efab-345678901234",
  "type": "edit",
  "status": "processing"
}

Checking Generation Status

All video generation endpoints return immediately with a "processing" status. To check whether a generation has completed, poll the generation status endpoint or configure a webhook. See the Overview for details on the async generation pattern.

curl https://www.adsumo.ai/api/v1/generations/a1b2c3d4-e5f6-7890-abcd-ef1234567890 \
  -H "Authorization: Bearer adsumo_sk_..."
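The polling approach can be sketched as a small loop. The example rests on several assumptions that the text above does not confirm: that the status endpoint returns JSON with a "status" field like the creation responses, and that any status other than "processing" is final. The names `fetch_generation` and `wait_for_generation`, the 5-second interval, and the 10-minute timeout are illustrative choices.

```python
import json
import time
import urllib.request

API_BASE = "https://www.adsumo.ai/api/v1"


def fetch_generation(generation_id: str, api_key: str) -> dict:
    """GET /generations/{id} and decode the JSON body."""
    request = urllib.request.Request(
        f"{API_BASE}/generations/{generation_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_generation(
    generation_id: str,
    api_key: str,
    fetch=fetch_generation,
    interval: float = 5.0,
    timeout: float = 600.0,
    sleep=time.sleep,
) -> dict:
    """Poll until the generation leaves the "processing" state or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = fetch(generation_id, api_key)
        # Assumption: any status other than "processing" is terminal.
        if result.get("status") != "processing":
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"generation {generation_id} is still processing")
        sleep(interval)
```

The `fetch` and `sleep` parameters are injectable so the loop can be exercised without network access. For long-running jobs, a webhook avoids repeated requests.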