Video Generation

Generate AI videos, talking avatar videos, face swaps, localizations, and video edits.

POST /videos/talking-avatar

Generate a talking avatar video with lip-sync. The avatar speaks a script using a selected voice, and the resulting video includes realistic lip movements synchronized to the audio.

You must provide either `script` together with `voice_id` (to generate a new voiceover on the fly) or `voiceover_id` (to reuse an existing voiceover). If both options are provided, `voiceover_id` takes precedence.

Credit cost: 20 credits per 250 characters of script (minimum 20, maximum 160).
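
The pricing rule can be sketched as a small estimator. This is an illustration only: the function name is hypothetical, and it assumes that a partial block of 250 characters is billed as a full block and that characters are counted with Python's `len`. Neither detail is stated above.

```python
import math

# Values taken from the credit cost line above.
CREDITS_PER_BLOCK = 20
CHARS_PER_BLOCK = 250
MIN_CREDITS = 20
MAX_CREDITS = 160


def estimate_talking_avatar_credits(script: str) -> int:
    """Estimate the credit cost of a script (hypothetical helper).

    Assumption: partial 250-character blocks round up to a full block.
    """
    blocks = math.ceil(len(script) / CHARS_PER_BLOCK)
    cost = blocks * CREDITS_PER_BLOCK
    # Clamp to the documented minimum and maximum.
    return max(MIN_CREDITS, min(cost, MAX_CREDITS))
```

Under these assumptions, a 251-character script is estimated at 40 credits, and any script longer than 2,000 characters is capped at 160.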

Request Body

Field	Type	Required	Description
`avatar_id`	string	Yes	UUID of the avatar to use.
`avatar_type`	string	Yes	One of `"platform"`, `"custom"`, or `"product"`.
`script`	string	Conditional	The text the avatar will speak. Required if `voiceover_id` is not provided.
`voice_id`	string	Conditional	ElevenLabs voice ID. Required if `voiceover_id` is not provided.
`voiceover_id`	string	Conditional	UUID of an existing voiceover. Required if `script` and `voice_id` are not provided.
`lipsync_model`	string	No	Lip-sync model to use. Default: `"heygen-avatar4"`. Options: `"heygen-avatar4"`, `"lipsync-2.0"`.
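
A client-side check of the conditional fields might look like the sketch below. `build_talking_avatar_payload` is a hypothetical helper, not part of the API. When `voiceover_id` is supplied, it chooses to omit `script` and `voice_id` from the body so the request matches the documented precedence.

```python
from typing import Optional

AVATAR_TYPES = {"platform", "custom", "product"}
LIPSYNC_MODELS = {"heygen-avatar4", "lipsync-2.0"}


def build_talking_avatar_payload(
    avatar_id: str,
    avatar_type: str,
    script: Optional[str] = None,
    voice_id: Optional[str] = None,
    voiceover_id: Optional[str] = None,
    lipsync_model: Optional[str] = None,
) -> dict:
    """Build a request body that satisfies the field rules (hypothetical helper)."""
    if avatar_type not in AVATAR_TYPES:
        raise ValueError(f"avatar_type must be one of {sorted(AVATAR_TYPES)}")
    if lipsync_model is not None and lipsync_model not in LIPSYNC_MODELS:
        raise ValueError(f"lipsync_model must be one of {sorted(LIPSYNC_MODELS)}")

    payload = {"avatar_id": avatar_id, "avatar_type": avatar_type}
    if voiceover_id:
        # voiceover_id takes precedence, so script and voice_id are left out.
        payload["voiceover_id"] = voiceover_id
    elif script and voice_id:
        payload["script"] = script
        payload["voice_id"] = voice_id
    else:
        raise ValueError("provide script and voice_id together, or voiceover_id")

    if lipsync_model is not None:
        payload["lipsync_model"] = lipsync_model
    return payload
```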

Response

JSON

{
  "generation_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "type": "talking-avatar",
  "status": "processing"
}

Examples

curl -X POST https://www.adsumo.ai/api/v1/videos/talking-avatar \
  -H "Authorization: Bearer adsumo_sk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "avatar_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
    "avatar_type": "platform",
    "script": "Welcome to Adsumo. Create stunning AI video ads in minutes.",
    "voice_id": "EXAVITQu4vr4xnSDxMaL"
  }'

POST /videos/ai-video

Generate an AI video from a text prompt using one of several supported models. Each model has different capabilities, supported aspect ratios, and duration options.

Request Body

Field	Type	Required	Description
`model`	string	Yes	One of `"sora-2"`, `"sora-2-pro"`, `"veo-3.1"`, `"veo-3.1-fast"`, `"kling-2.6"`.
`prompt`	string	Yes	Text description of the video to generate.
`aspect_ratio`	string	Yes	Aspect ratio of the output video (e.g. `"16:9"`, `"9:16"`).
`duration`	number	Yes	Duration in seconds. Valid values depend on the model.
`resolution`	string	No	Output resolution (e.g. `"720p"`, `"1080p"`, `"4k"`).
`reference_image_url`	string	No	URL of a reference image to guide generation.
`hd`	boolean	No	Enable HD output. Applicable to Sora models only.
`generate_audio`	boolean	No	Generate an audio track alongside the video.
`first_frame_image_url`	string	No	URL of an image to use as the first frame. Applicable to Veo models only.
`last_frame_image_url`	string	No	URL of an image to use as the last frame. Applicable to Veo models only.

Model Constraints

Model	Aspect Ratios	Durations (seconds)	Notes
`sora-2`	`16:9`, `9:16`	4, 8, 12, 16, 20
`sora-2-pro`	`16:9`, `9:16`	4, 8, 12, 16, 20	Higher quality output
`veo-3.1`	any	4, 6, 8	Supports `first_frame_image_url` and `last_frame_image_url`
`veo-3.1-fast`	any	4, 6, 8	Faster generation, lower quality
`kling-2.6`	any	5, 10
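
The constraints in this table can be checked before a request is sent. The sketch below encodes the table as a dictionary and returns a list of problems. The function name and error messages are hypothetical, and the server's own validation may differ.

```python
from typing import List

SORA_RATIOS = {"16:9", "9:16"}
SORA_DURATIONS = {4, 8, 12, 16, 20}
VEO_DURATIONS = {4, 6, 8}

# aspect_ratios of None means the table lists "any".
MODEL_CONSTRAINTS = {
    "sora-2": {"aspect_ratios": SORA_RATIOS, "durations": SORA_DURATIONS},
    "sora-2-pro": {"aspect_ratios": SORA_RATIOS, "durations": SORA_DURATIONS},
    "veo-3.1": {"aspect_ratios": None, "durations": VEO_DURATIONS},
    "veo-3.1-fast": {"aspect_ratios": None, "durations": VEO_DURATIONS},
    "kling-2.6": {"aspect_ratios": None, "durations": {5, 10}},
}


def validate_ai_video_request(payload: dict) -> List[str]:
    """Return a list of constraint violations; an empty list means none found."""
    model = payload.get("model")
    rules = MODEL_CONSTRAINTS.get(model)
    if rules is None:
        return [f"unknown model: {model!r}"]

    errors = []
    ratios = rules["aspect_ratios"]
    if ratios is not None and payload.get("aspect_ratio") not in ratios:
        errors.append(f"aspect_ratio must be one of {sorted(ratios)} for {model}")
    if payload.get("duration") not in rules["durations"]:
        errors.append(f"duration must be one of {sorted(rules['durations'])} for {model}")
    if "hd" in payload and not model.startswith("sora-"):
        errors.append("hd applies to Sora models only")
    frame_fields = ("first_frame_image_url", "last_frame_image_url")
    if any(field in payload for field in frame_fields) and not model.startswith("veo-"):
        errors.append("first/last frame image URLs apply to Veo models only")
    return errors
```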

Response

JSON

{
  "generation_id": "b2c3d4e5-f6a7-8901-bcde-f12345678901",
  "type": "ai-video",
  "status": "processing"
}

Examples

curl -X POST https://www.adsumo.ai/api/v1/videos/ai-video \
  -H "Authorization: Bearer adsumo_sk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "sora-2",
    "prompt": "A sleek product bottle rotating on a marble pedestal with soft studio lighting",
    "aspect_ratio": "16:9",
    "duration": 8,
    "hd": true
  }'

POST /videos/swap-avatar

Replace the face in an existing video with an avatar or a custom face image. You must provide either `avatar_id` together with `avatar_type` or `image_url` to specify the replacement face.

Credit cost: 1 credit per second of video in standard mode, 2 credits per second in pro mode.

Request Body

Field	Type	Required	Description
`video_url`	string	Yes	URL of the source video.
`mode`	string	Yes	One of `"standard"` or `"pro"`. Pro mode produces higher fidelity face replacement.
`duration`	number	Yes	Duration of the video in seconds (1 to 30).
`width`	number	Yes	Width of the video in pixels.
`height`	number	Yes	Height of the video in pixels.
`avatar_id`	string	Conditional	UUID of the avatar to use as the replacement face. Required if `image_url` is not provided.
`avatar_type`	string	Conditional	One of `"custom"`, `"platform"`, or `"product"`. Required if `avatar_id` is provided.
`image_url`	string	Conditional	Direct URL to a face image. Required if `avatar_id` is not provided.
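
As an illustration, the sketch below builds a swap-avatar request body and estimates its credit cost from the per-second rates. Both helpers are hypothetical. The section does not say what happens when `avatar_id` and `image_url` are sent together, so the builder conservatively requires exactly one of them. It also does not say how fractional seconds are billed, so the estimator multiplies without rounding.

```python
from typing import Optional

SWAP_CREDITS_PER_SECOND = {"standard": 1, "pro": 2}
AVATAR_TYPES = {"custom", "platform", "product"}


def estimate_swap_avatar_credits(mode: str, duration: float) -> float:
    """Estimate credits as rate x seconds (rounding behaviour is an assumption)."""
    return SWAP_CREDITS_PER_SECOND[mode] * duration


def build_swap_avatar_payload(
    video_url: str,
    mode: str,
    duration: float,
    width: int,
    height: int,
    avatar_id: Optional[str] = None,
    avatar_type: Optional[str] = None,
    image_url: Optional[str] = None,
) -> dict:
    """Build a request body that satisfies the field rules (hypothetical helper)."""
    if mode not in SWAP_CREDITS_PER_SECOND:
        raise ValueError("mode must be 'standard' or 'pro'")
    if not 1 <= duration <= 30:
        raise ValueError("duration must be between 1 and 30 seconds")
    if (avatar_id is None) == (image_url is None):
        raise ValueError("provide exactly one of avatar_id or image_url")

    payload = {
        "video_url": video_url,
        "mode": mode,
        "duration": duration,
        "width": width,
        "height": height,
    }
    if avatar_id is not None:
        if avatar_type not in AVATAR_TYPES:
            raise ValueError("avatar_type is required when avatar_id is provided")
        payload["avatar_id"] = avatar_id
        payload["avatar_type"] = avatar_type
    else:
        payload["image_url"] = image_url
    return payload
```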

Response

JSON

{
  "generation_id": "c3d4e5f6-a7b8-9012-cdef-123456789012",
  "type": "swap-avatar",
  "status": "processing"
}

POST /videos/localize

Translate and dub a video into another language. The audio is re-generated in the target language and lip movements are adjusted to match.

Request Body

Field	Type	Required	Description
`video_url`	string	Yes	URL of the source video.
`mode`	string	Yes	One of `"speed"` or `"precision"`. Speed mode is faster; precision mode produces higher quality dubbing.
`output_language`	string	Yes	Target language code (e.g. `"es"`, `"fr"`, `"de"`, `"ja"`).
`duration`	number	Yes	Duration of the video in seconds (1 to 120).
`width`	number	Yes	Width of the video in pixels.
`height`	number	Yes	Height of the video in pixels.
`translate_audio_only`	boolean	No	If `true`, only the audio track is translated; lip movements are not adjusted.
`speaker_num`	number	No	Number of speakers in the video. Helps improve speaker diarization for multi-speaker content.
`enable_dynamic_duration`	boolean	No	If `true`, allows the output video duration to differ from the input to better accommodate the translated speech.
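
A localize request body could be assembled as in the sketch below. `build_localize_payload` is a hypothetical helper. It checks only the mode and duration range listed in the table, and it sends optional fields only when they are set.

```python
LOCALIZE_MODES = {"speed", "precision"}
OPTIONAL_FIELDS = {"translate_audio_only", "speaker_num", "enable_dynamic_duration"}


def build_localize_payload(
    video_url: str,
    mode: str,
    output_language: str,
    duration: float,
    width: int,
    height: int,
    **options,
) -> dict:
    """Build a request body from the documented fields (hypothetical helper)."""
    if mode not in LOCALIZE_MODES:
        raise ValueError("mode must be 'speed' or 'precision'")
    if not 1 <= duration <= 120:
        raise ValueError("duration must be between 1 and 120 seconds")
    unknown = set(options) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")

    payload = {
        "video_url": video_url,
        "mode": mode,
        "output_language": output_language,
        "duration": duration,
        "width": width,
        "height": height,
    }
    # Include optional fields only when the caller supplied them.
    payload.update(options)
    return payload
```

For example, `build_localize_payload("https://example.com/v.mp4", "precision", "es", 45, 1920, 1080, speaker_num=2)` produces a body with the six required fields plus `speaker_num`.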

Response

JSON

{
  "generation_id": "d4e5f6a7-b8c9-0123-defa-234567890123",
  "type": "localize",
  "status": "processing"
}

POST /videos/edit

Add captions and visual effects to a video using an AI caption template. You must provide either `video_url` (for any video) or `generation_id` (to edit a previously generated AI video).

Credit cost: 2 credits.

Request Body

Field	Type	Required	Description
`template_id`	string	Yes	ID of the caption template to apply.
`video_url`	string	Conditional	URL of the video to edit. Required if `generation_id` is not provided.
`generation_id`	string	Conditional	UUID of a completed AI video generation. Required if `video_url` is not provided.
`language`	string	No	Language code for caption generation. Default: `"en"`.
`auto_approve`	boolean	No	Automatically approve the generated captions. Default: `true`.
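
The sketch below shows one way to prepare this call from Python using only the standard library. It builds a `urllib.request.Request` but does not send it. The base URL and header names are taken from the curl examples in this section. The helper name is hypothetical, and requiring exactly one of `video_url` or `generation_id` is a conservative assumption, since behaviour with both is not described here.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://www.adsumo.ai/api/v1"  # from the curl examples


def build_edit_request(
    api_key: str,
    template_id: str,
    video_url: Optional[str] = None,
    generation_id: Optional[str] = None,
    language: Optional[str] = None,
    auto_approve: Optional[bool] = None,
) -> urllib.request.Request:
    """Prepare (but do not send) a POST /videos/edit request."""
    if (video_url is None) == (generation_id is None):
        raise ValueError("provide exactly one of video_url or generation_id")

    payload = {"template_id": template_id}
    if video_url is not None:
        payload["video_url"] = video_url
    else:
        payload["generation_id"] = generation_id
    # Optional fields are omitted so the server-side defaults apply.
    if language is not None:
        payload["language"] = language
    if auto_approve is not None:
        payload["auto_approve"] = auto_approve

    return urllib.request.Request(
        f"{BASE_URL}/videos/edit",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request.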

Response

JSON

{
  "generation_id": "e5f6a7b8-c9d0-1234-efab-345678901234",
  "type": "edit",
  "status": "processing"
}

Checking Generation Status

All video generation endpoints return immediately with a "processing" status. To check whether a generation has completed, poll the generation status endpoint or configure a webhook. See the Overview for details on the async generation pattern.

curl https://www.adsumo.ai/api/v1/generations/a1b2c3d4-e5f6-7890-abcd-ef1234567890 \
  -H "Authorization: Bearer adsumo_sk_..."
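
A polling loop for the status endpoint could look like the sketch below. It assumes the endpoint returns a JSON body with a `status` field, as in the responses shown in this section. The terminal status names are not listed here, so the loop treats any value other than `"processing"` as final. The function names, interval, and timeout are hypothetical choices.

```python
import json
import time
import urllib.request
from typing import Callable

BASE_URL = "https://www.adsumo.ai/api/v1"  # from the curl examples


def make_http_fetcher(generation_id: str, api_key: str) -> Callable[[], dict]:
    """Return a function that fetches the generation status as a dict."""
    url = f"{BASE_URL}/generations/{generation_id}"

    def fetch() -> dict:
        req = urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)

    return fetch


def wait_for_generation(
    fetch_status: Callable[[], dict],
    interval_seconds: float = 5.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the status is no longer "processing" or the timeout passes."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        body = fetch_status()
        # Assumption: any status other than "processing" is terminal.
        if body.get("status") != "processing":
            return body
        if time.monotonic() >= deadline:
            raise TimeoutError("generation still processing after timeout")
        sleep(interval_seconds)
```

A caller would combine the two, for example `wait_for_generation(make_http_fetcher(generation_id, api_key))`. Taking the fetch function as a parameter keeps the loop testable without network access.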