POST /v1/videos/generations
curl --request POST \
  --url https://toapis.com/v1/videos/generations \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "doubao-seedance-2-0",
    "prompt": "Use video 1 for the POV composition throughout, use audio 1 as the background bed, start from image 1, and end on image 2 with a fresh commercial tone.",
    "duration": 11,
    "aspect_ratio": "16:9",
    "image_with_roles": [
      {"url": "https://example.com/ref-image-1.jpg", "role": "reference_image"},
      {"url": "https://example.com/ref-image-2.jpg", "role": "reference_image"}
    ],
    "video_with_roles": [
      {"url": "https://example.com/ref-video-1.mp4", "role": "reference_video"}
    ],
    "audio_with_roles": [
      {"url": "https://example.com/ref-audio-1.mp3", "role": "reference_audio"}
    ],
    "metadata": {
      "resolution": "720p",
      "generate_audio": true
    }
  }'
{
  "id": "<string>",
  "object": "<string>",
  "model": "<string>",
  "status": "<string>",
  "progress": 123,
  "created_at": 123,
  "metadata": {}
}
  • ByteDance’s next-generation video generation models
  • Supports doubao-seedance-2-0 and doubao-seedance-2-0-fast
  • Supports text-to-video, first-frame image-to-video, first-and-last-frame image-to-video, and multimodal reference-to-video
  • Supports reference images, reference videos, and reference audio
  • Supports synced audio generation, web search tools, and returning the last frame
  • Async task workflow with task ID based status queries

Authorizations

Authorization
string
required
All endpoints require Bearer token authentication. Get your API key from the API Key Management page, then add it to the request header:
Authorization: Bearer YOUR_API_KEY
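As a minimal sketch of the header described above, the following Python builds (but does not send) an authenticated request using only the standard library. The helper name build_request, the placeholder key, and the sample prompt are illustrative assumptions; the URL and header format are taken from the example on this page.

```python
import json
import urllib.request

API_URL = "https://toapis.com/v1/videos/generations"  # endpoint shown on this page


def build_request(api_key: str, body: dict) -> urllib.request.Request:
    """Return a POST request carrying the Bearer token; nothing is sent here."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical key and prompt, for illustration only.
req = build_request(
    "YOUR_API_KEY",
    {"model": "doubao-seedance-2-0", "prompt": "A quiet street at dawn"},
)
print(req.get_header("Authorization"))  # Bearer YOUR_API_KEY
```

Passing the returned object to urllib.request.urlopen would submit the task.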

Body

model
string
default:"doubao-seedance-2-0"
required
Video generation model name.
Available models:
  • doubao-seedance-2-0 - Standard version focused on higher quality, supports 4-15 second output
  • doubao-seedance-2-0-fast - Faster version for preview and iteration, supports 4-12 second output
prompt
string
Video prompt. Supports both Chinese and English input. Describe the scene, camera movement, subject actions, style, and audio atmosphere as clearly as possible.
Recommendations:
  • Keep Chinese prompts within about 500 characters
  • Keep English prompts within about 1000 words
  • When referring to uploaded assets, use labels like “image 1”, “video 1”, or “audio 1”
Example: "Use video 1 for the POV composition throughout, start from image 1, end on image 2, and preserve the rhythm and mood from audio 1"
duration
integer
default:5
Video duration in seconds.
Allowed values:
  • doubao-seedance-2-0: 4-15
  • doubao-seedance-2-0-fast: 4-12
  • -1: automatic duration selected by the model
doubao-seedance-2-0-fast does not support durations longer than 12 seconds.
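The duration limits above can be checked on the client before a request is sent. This is a sketch only: the function name is_valid_duration is hypothetical, and it assumes the listed ranges are inclusive at both ends.

```python
# Duration limits per model, copied from the list above (assumed inclusive).
DURATION_RANGES = {
    "doubao-seedance-2-0": (4, 15),
    "doubao-seedance-2-0-fast": (4, 12),
}
AUTO_DURATION = -1  # lets the model choose the length


def is_valid_duration(model: str, duration: int) -> bool:
    """Return True if `duration` is allowed for `model`."""
    if model not in DURATION_RANGES:
        raise ValueError(f"unknown model: {model}")
    if duration == AUTO_DURATION:
        return True
    low, high = DURATION_RANGES[model]
    return low <= duration <= high


print(is_valid_duration("doubao-seedance-2-0", 15))       # True
print(is_valid_duration("doubao-seedance-2-0-fast", 15))  # False
print(is_valid_duration("doubao-seedance-2-0-fast", -1))  # True
```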
aspect_ratio
string
default:"adaptive"
Video aspect ratio.
Options:
  • 21:9
  • 16:9
  • 4:3
  • 1:1
  • 3:4
  • 9:16
  • adaptive
adaptive behavior:
  • Text-to-video: the model chooses the most suitable ratio from the prompt
  • First-frame or first-and-last-frame image-to-video: derived from the first frame
  • Multimodal reference-to-video: typically prioritizes reference video, then reference image
image_urls
string[]
Image URL array in compatibility mode. image_with_roles is recommended for explicit control.
image_urls and image_with_roles should not be used together.
image_with_roles
array
Image array with roles.
Supported patterns:
  • First-frame image-to-video: one first_frame
  • First-and-last-frame image-to-video: one first_frame plus one last_frame
  • Multimodal reference-to-video: reference_image entries, 1-9 items
Image requirements:
  • Formats: jpeg, png, webp, bmp, tiff, gif
  • Per-image size: less than 30MB
  • Total request size: recommended within 64MB
  • Aspect ratio: about 0.4 to 2.5
  • Dimensions: about 300px to 6000px
  • First-frame and first-and-last-frame modes cannot be mixed with reference_image, reference_video, or reference_audio
  • Only one first_frame and one last_frame are allowed
  • In multimodal reference mode, all images should use reference_image
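A local pre-check for the size, dimension, and aspect ratio limits above might look like the sketch below. The function name check_image is hypothetical, and two readings are assumed: aspect ratio means width divided by height, and the 300px to 6000px bound applies to each side. Because the page gives these limits as approximate, the server may accept or reject borderline images differently.

```python
def check_image(width: int, height: int, size_mb: float) -> list[str]:
    """Return a list of problems found; an empty list means the image passes."""
    problems = []
    if size_mb >= 30:
        problems.append("file must be smaller than 30MB")
    # Assumption: the dimension limit applies to both width and height.
    if not (300 <= width <= 6000 and 300 <= height <= 6000):
        problems.append("each side should be about 300px to 6000px")
    # Assumption: aspect ratio is width / height.
    if height <= 0 or not (0.4 <= width / height <= 2.5):
        problems.append("aspect ratio should be about 0.4 to 2.5")
    return problems


print(check_image(1920, 1080, 2.5))  # []
print(check_image(4000, 1000, 1.0))  # ['aspect ratio should be about 0.4 to 2.5']
```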
video_with_roles
array
Video array with roles. Currently only reference_video is supported, for multimodal reference mode.
Video requirements:
  • Formats: mp4, mov
  • Resolution: 480p or 720p
  • Per-video duration: 2-15 seconds
  • Maximum count: 3
  • Total reference video duration: no more than 15 seconds
  • Per-video size: less than 50MB
  • Frame rate: about 24-60 FPS
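The count, duration, and size limits above can be checked before upload. In this sketch the function name check_reference_videos and the duration_s and size_mb keys are hypothetical: they stand for values the caller has measured locally and are not API fields. Format, resolution, and frame rate are not checked.

```python
def check_reference_videos(clips: list[dict]) -> list[str]:
    """Check locally measured clip data against the limits listed above.

    Each clip is a dict like {"duration_s": 6.0, "size_mb": 12.0}.
    Returns a list of problems; an empty list means all checks passed.
    """
    problems = []
    if len(clips) > 3:
        problems.append("at most 3 reference videos are allowed")
    for i, clip in enumerate(clips, start=1):
        if not 2 <= clip["duration_s"] <= 15:
            problems.append(f"video {i}: duration must be 2-15 seconds")
        if clip["size_mb"] >= 50:
            problems.append(f"video {i}: file must be smaller than 50MB")
    if sum(clip["duration_s"] for clip in clips) > 15:
        problems.append("total reference video duration exceeds 15 seconds")
    return problems


print(check_reference_videos([{"duration_s": 6, "size_mb": 10},
                              {"duration_s": 8, "size_mb": 12}]))  # []
```

Two 10-second clips would each pass the per-video check but fail the combined 15-second limit.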
audio_with_roles
array
Audio array with roles. Currently only reference_audio is supported, for multimodal reference mode.
Audio requirements:
  • Formats: wav, mp3
  • Per-audio duration: 2-15 seconds
  • Maximum count: 3
  • Total reference audio duration: no more than 15 seconds
  • Per-audio size: less than 15MB
audio_with_roles cannot be used alone. At least one image or video reference is also required.
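The dependency rule above can be expressed as a small check on the request body. The function name audio_has_visual_reference is hypothetical, and the sketch assumes that "image or video reference" means a non-empty image_with_roles or video_with_roles array.

```python
def audio_has_visual_reference(body: dict) -> bool:
    """Return False only when reference audio is sent without any image or video."""
    if not body.get("audio_with_roles"):
        return True  # no reference audio, so the rule does not apply
    # Assumption: either role array counts as the required visual reference.
    return bool(body.get("image_with_roles") or body.get("video_with_roles"))


audio_only = {
    "prompt": "Match the rhythm of audio 1",
    "audio_with_roles": [
        {"url": "https://example.com/ref-audio-1.mp3", "role": "reference_audio"}
    ],
}
print(audio_has_visual_reference(audio_only))  # False
```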
metadata
object
Extended parameters

Input Combination Rules

Typical supported combinations:
  • Text only: text-to-video
  • Text + one first-frame image: first-frame image-to-video
  • Text + first-frame image + last-frame image: first-and-last-frame image-to-video
  • Text + reference images: multimodal reference-to-video
  • Text + reference videos: video-guided reference-to-video
  • Text + reference images + reference audio: multimodal reference-to-video
  • Text + reference images + reference videos + reference audio: multimodal reference-to-video
These three modes are mutually exclusive:
  • First-frame image-to-video
  • First-and-last-frame image-to-video
  • Multimodal reference-to-video
If you need strict first-frame and last-frame control, prefer first_frame and last_frame. If you need broader reference guidance, use reference_image, reference_video, and reference_audio.
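The combination rules above can be sketched as a function that names the mode a request body falls into and rejects mixed inputs. The function name classify_mode is hypothetical. Treating a last_frame without a first_frame as an error is an assumption, based on that pattern not appearing in the supported list.

```python
def classify_mode(body: dict) -> str:
    """Return the generation mode implied by the body, or raise ValueError."""
    image_roles = [item["role"] for item in body.get("image_with_roles", [])]
    has_video = bool(body.get("video_with_roles"))
    has_audio = bool(body.get("audio_with_roles"))
    has_frames = "first_frame" in image_roles or "last_frame" in image_roles
    has_reference = "reference_image" in image_roles or has_video or has_audio

    if has_frames and has_reference:
        raise ValueError("first_frame/last_frame cannot be mixed with reference_* inputs")
    if image_roles.count("first_frame") > 1 or image_roles.count("last_frame") > 1:
        raise ValueError("only one first_frame and one last_frame are allowed")
    if has_audio and not ("reference_image" in image_roles or has_video):
        raise ValueError("reference audio needs at least one image or video reference")
    if "last_frame" in image_roles:
        if "first_frame" not in image_roles:
            # Assumption: last_frame alone is not a listed pattern.
            raise ValueError("last_frame requires a first_frame")
        return "first-and-last-frame image-to-video"
    if "first_frame" in image_roles:
        return "first-frame image-to-video"
    if has_reference:
        return "multimodal reference-to-video"
    return "text-to-video"


print(classify_mode({"prompt": "A quiet street at dawn"}))  # text-to-video
```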

Resolution and Aspect Ratio Pixel Map

Resolution  Aspect Ratio  Pixel Size
480p        16:9          864x496
480p        4:3           752x560
480p        1:1           640x640
480p        3:4           560x752
480p        9:16          496x864
480p        21:9          992x432
720p        16:9          1280x720
720p        4:3           1112x834
720p        1:1           960x960
720p        3:4           834x1112
720p        9:16          720x1280
720p        21:9          1470x630
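For client code that needs the output dimensions in advance, the map above can be held as a lookup table. This is a sketch: the names PIXEL_SIZES and pixel_size are hypothetical, and returning None for adaptive is a choice made here because that ratio has no fixed entry in the table.

```python
# (resolution, aspect_ratio) -> (width, height), copied from the table above.
PIXEL_SIZES = {
    ("480p", "16:9"): (864, 496),
    ("480p", "4:3"): (752, 560),
    ("480p", "1:1"): (640, 640),
    ("480p", "3:4"): (560, 752),
    ("480p", "9:16"): (496, 864),
    ("480p", "21:9"): (992, 432),
    ("720p", "16:9"): (1280, 720),
    ("720p", "4:3"): (1112, 834),
    ("720p", "1:1"): (960, 960),
    ("720p", "3:4"): (834, 1112),
    ("720p", "9:16"): (720, 1280),
    ("720p", "21:9"): (1470, 630),
}


def pixel_size(resolution: str, aspect_ratio: str):
    """Return (width, height), or None when the pair is not in the table."""
    return PIXEL_SIZES.get((resolution, aspect_ratio))


print(pixel_size("720p", "21:9"))      # (1470, 630)
print(pixel_size("720p", "adaptive"))  # None
```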

Capabilities and Constraints

Item               Seedance 2.0                                Seedance 2.0 Fast
Positioning        Higher quality                              Faster generation and lower cost
Duration           4-15 seconds, or -1 auto                    4-12 seconds, or -1 auto
Resolution         480p / 720p                                 480p / 720p
Image roles        first_frame / last_frame / reference_image  first_frame / last_frame / reference_image
Video roles        reference_video                             reference_video
Audio roles        reference_audio                             reference_audio
Audio generation   metadata.generate_audio                     metadata.generate_audio
Tools              metadata.tools                              metadata.tools
Return last frame  metadata.return_last_frame                  metadata.return_last_frame
Billing is per second. Final prices may vary by model version, resolution, and marketplace strategy; refer to the model pricing page for current rates.

Response

id
string
Unique task identifier for status queries
object
string
Object type, always generation.task
model
string
Model name used
status
string
Task status
  • queued - Queued
  • in_progress - Processing
  • completed - Completed successfully
  • failed - Failed
progress
integer
Task progress percentage (0-100)
created_at
integer
Task creation timestamp (Unix timestamp)
metadata
object
Task metadata
Video generation is asynchronous. The create call returns a task ID, and you can use Get Video Task Status to poll for progress and results.
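A polling loop for this workflow might look like the sketch below. Since the status endpoint is documented separately, the caller supplies the lookup as fetch_status, a hypothetical callable that takes a task ID and returns the task object. The function name wait_for_task and the interval and timeout defaults are also assumptions, not documented values.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"completed", "failed"}  # from the status list above


def wait_for_task(
    task_id: str,
    fetch_status: Callable[[str], dict],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the task is completed or failed, or raise TimeoutError."""
    deadline = time.monotonic() + timeout_s
    while True:
        task = fetch_status(task_id)
        if task["status"] in TERMINAL_STATUSES:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish in {timeout_s}s")
        sleep(interval_s)


# Simulated responses stand in for real status queries.
_fake = iter([
    {"status": "queued", "progress": 0},
    {"status": "in_progress", "progress": 40},
    {"status": "completed", "progress": 100},
])
result = wait_for_task("task-123", lambda _id: next(_fake), sleep=lambda _s: None)
print(result["status"])  # completed
```

Injecting the lookup and the sleep function keeps the loop testable without network access; in real use, fetch_status would call the status endpoint with the same Bearer token.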