Alibaba Cloud Wanxiang video generation model
Supports Text-to-Video and Image-to-Video
Supports 720p/1080p resolution, 5/10/15 second duration
Supports automatic prompt extension and audio generation
Authentication
All endpoints require Bearer Token authentication. Get your API Key from the API Key Management Page and add it to the request header:
Authorization: Bearer YOUR_API_KEY
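As a minimal sketch of the header described above, the following Python helper builds the request headers from an API key. The function name build_auth_headers is hypothetical and not part of the API; the Content-Type value matches the curl sample on this page.

```python
def build_auth_headers(api_key: str) -> dict[str, str]:
    """Return request headers carrying the API key as a Bearer token."""
    key = api_key.strip() if api_key else ""
    if not key:
        raise ValueError("api_key must be a non-empty string")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from an environment variable instead of hard-coding it keeps it out of source control.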
Request Parameters
model: Video generation model name, fixed as wan2.6.
prompt: Video content description. Required for text-to-video mode; describe the scene, action, and style in detail. Example: "A cute cat stretching in the sunshine"
image_urls: Reference image URL array (only 1 image is supported). Required for image-to-video mode; accepts publicly accessible URLs or Base64-encoded data. Example: ["https://example.com/image.jpg"] or ["data:image/png;base64,iVBORw0KGgo..."]. The system automatically selects text-to-video or image-to-video mode based on whether image_urls is included in the request.
negative_prompt: Negative prompt describing unwanted content. Example: "blurry, low quality, deformed"
aspect_ratio: Video aspect ratio. Available values:
16:9 - Landscape (default)
9:16 - Portrait
1:1 - Square
4:3 - Landscape
3:4 - Portrait
Default: 16:9. This parameter is not supported in image-to-video mode.
resolution: Video resolution. Available values:
720p - Standard definition (default)
1080p - High definition
Default: 720p. 480p resolution is not supported. Billing is per second and the price varies by resolution; refer to the model marketplace for details.
duration: Video duration in seconds. Only 5, 10, and 15 are supported. Default: 5.
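The allowed values listed above for aspect ratio, resolution, and duration can be checked on the client before a request is sent. The sketch below is one way to do that, assuming the field names aspect_ratio, resolution, duration, and image_urls used in the request examples on this page; check_video_options is a hypothetical helper, and the server remains the authority on what it accepts.

```python
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "1:1", "4:3", "3:4"}
ALLOWED_RESOLUTIONS = {"720p", "1080p"}
ALLOWED_DURATIONS = {5, 10, 15}


def check_video_options(payload: dict) -> list[str]:
    """Return a list of problems found; an empty list means the options look valid."""
    problems = []

    if "aspect_ratio" in payload:
        if payload["aspect_ratio"] not in ALLOWED_ASPECT_RATIOS:
            problems.append(f"unsupported aspect_ratio: {payload['aspect_ratio']!r}")
        # The documentation states aspect_ratio is not supported in image-to-video mode.
        if "image_urls" in payload:
            problems.append("aspect_ratio is not supported in image-to-video mode")

    # Omitted fields fall back to the documented defaults (720p, 5 seconds).
    resolution = payload.get("resolution", "720p")
    if resolution not in ALLOWED_RESOLUTIONS:
        problems.append(f"unsupported resolution: {resolution!r}")

    duration = payload.get("duration", 5)
    if duration not in ALLOWED_DURATIONS:
        problems.append(f"unsupported duration: {duration!r}")

    return problems
```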
seed: Random seed for reproducibility. Example: 12345
prompt_extend: Auto-extend prompt. When enabled, the system automatically optimizes and enriches your prompt.
audio: Auto-add audio. When enabled, the system automatically generates matching audio for the video.
Specify audio URL. Takes priority over the audio parameter. The audio duration cannot exceed the video duration. If the audio is shorter than the video, the first part of the video has sound and the remainder is silent.
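The audio duration rule above amounts to simple arithmetic. The following sketch, using a hypothetical helper named audio_coverage, returns how many seconds of the video carry sound and how many are silent, and rejects audio that is longer than the video.

```python
def audio_coverage(video_seconds: float, audio_seconds: float) -> tuple[float, float]:
    """Return (seconds_with_sound, seconds_of_silence) for a custom audio track."""
    if audio_seconds > video_seconds:
        raise ValueError("audio duration cannot exceed video duration")
    return audio_seconds, video_seconds - audio_seconds
```

For example, a 6-second audio track on a 10-second video gives 6 seconds with sound followed by 4 seconds of silence.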
shot_type: Shot type. Available values:
single - Single shot
multi - Multi-shot
Effect template name for image-to-video effect mode. When using effect mode:
Only one image is needed (passed through image_urls)
No prompt needed (model ignores prompt field)
General effects:
squish - Squish and squeeze
rotation - Spin around
poke - Poke fun
inflate - Balloon inflation
dissolve - Molecular diffusion
melt - Heat wave melt
icecream - Ice cream planet
flying - Magic levitation
Single person effects:
carousel - Time carousel
singleheart - Love you
dance1 - Swing moment
dance2 - Top dance
For more effects, refer to Alibaba Cloud Wanxiang template documentation
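The effect names listed above can be kept in a small lookup so a client can tell which group a template belongs to. This is only a sketch: effect_category is a hypothetical helper, and because more effects exist in the Alibaba Cloud Wanxiang template documentation, an unknown name returns None rather than raising an error.

```python
from typing import Optional

GENERAL_EFFECTS = {
    "squish", "rotation", "poke", "inflate",
    "dissolve", "melt", "icecream", "flying",
}
SINGLE_PERSON_EFFECTS = {"carousel", "singleheart", "dance1", "dance2"}


def effect_category(name: str) -> Optional[str]:
    """Classify an effect template name using the lists on this page."""
    if name in GENERAL_EFFECTS:
        return "general"
    if name in SINGLE_PERSON_EFFECTS:
        return "single_person"
    # Not listed here; it may still be valid per the upstream template docs.
    return None
```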
Resolution and Aspect Ratio Combinations
Aspect Ratio   Description           720p Size   1080p Size
16:9           Landscape (default)   1280×720    1920×1080
9:16           Portrait              720×1280    1080×1920
1:1            Square                960×960     1440×1440
4:3            Landscape             1088×832    1632×1248
3:4            Portrait              832×1088    1248×1632
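The combinations above can be expressed as a lookup table, which is handy when a client needs the pixel size of the expected output. The sketch below copies the values from this section; output_size is a hypothetical helper name.

```python
# (aspect_ratio, resolution) -> (width, height) in pixels
SIZES = {
    ("16:9", "720p"): (1280, 720),
    ("16:9", "1080p"): (1920, 1080),
    ("9:16", "720p"): (720, 1280),
    ("9:16", "1080p"): (1080, 1920),
    ("1:1", "720p"): (960, 960),
    ("1:1", "1080p"): (1440, 1440),
    ("4:3", "720p"): (1088, 832),
    ("4:3", "1080p"): (1632, 1248),
    ("3:4", "720p"): (832, 1088),
    ("3:4", "1080p"): (1248, 1632),
}


def output_size(aspect_ratio: str = "16:9", resolution: str = "720p") -> tuple[int, int]:
    """Return (width, height) for a supported combination, using the documented defaults."""
    try:
        return SIZES[(aspect_ratio, resolution)]
    except KeyError:
        raise ValueError(f"unsupported combination: {aspect_ratio} at {resolution}") from None
```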
Response
code: Response status code, 200 for success.
data: Response data array.
status: Task status, submitted upon initial submission.
task_id: Unique task identifier for querying task status and results.
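A client usually needs only the task identifier from this response. The sketch below assumes the field names code, data, and task_id shown in the sample response on this page; extract_task_ids is a hypothetical helper, and the error handling is illustrative because the error response format is not described here.

```python
def extract_task_ids(response: dict) -> list[str]:
    """Return the task_id of every entry in the response data array."""
    if response.get("code") != 200:
        raise RuntimeError(f"request failed with code {response.get('code')!r}")
    return [item["task_id"] for item in response.get("data", []) if "task_id" in item]
```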
Usage Scenarios
Scenario 1: Text-to-Video (Simple Request)
{
  "model": "wan2.6",
  "prompt": "A cute cat stretching in the sunshine"
}
Scenario 2: Text-to-Video (Complete Parameters)
{
  "model": "wan2.6",
  "prompt": "A cute cat running on grass",
  "negative_prompt": "blurry, low quality, deformed",
  "aspect_ratio": "16:9",
  "resolution": "720p",
  "duration": 5,
  "seed": 12345,
  "prompt_extend": true,
  "audio": true,
  "shot_type": "single",
  "watermark": false
}
Scenario 3: Image-to-Video
{
  "model": "wan2.6",
  "prompt": "Kitten running on the ground",
  "image_urls": ["https://upload.apimart.ai/f/apimart-models-images/9998233432754770-c059992d-9b01-47d5-810d-ea0502ac9279-image_task_01KD7SSXDBCEWZ869D6PF249ZW_0.png"],
  "resolution": "1080p",
  "duration": 10
}
Scenario 4: Image-to-Video (Base64 Image)
{
  "model": "wan2.6",
  "prompt": "Make the cat stand up and walk",
  "image_urls": ["data:image/png;base64,iVBORw0KGgo..."],
  "duration": 5
}
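The Base64 form used in Scenario 4 is a data URI. As a rough sketch, the hypothetical helper below turns raw image bytes into such a string with Python's standard base64 module; the default MIME type image/png is an assumption and should match the actual image.

```python
import base64


def to_data_uri(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a data URI suitable for image_urls."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

To use it, read a local file in binary mode, pass the bytes to to_data_uri, and place the result as the single element of image_urls.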
Mode Description
Text-to-Video
Must provide prompt parameter
No need for image_urls parameter
Image-to-Video
Must provide image_urls parameter (only supports 1 image)
prompt parameter is optional, used to describe expected actions
The system automatically selects the mode based on whether image_urls is included in the request
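The mode rules above can be mirrored on the client to catch mistakes before submitting. This sketch only approximates the server's selection logic, which is described here as depending on whether image_urls is present; detect_mode is a hypothetical helper.

```python
def detect_mode(payload: dict) -> str:
    """Return the mode a request would run in, based on the presence of image_urls."""
    if "image_urls" in payload:
        # Image-to-video accepts exactly one reference image; prompt is optional.
        if len(payload["image_urls"]) != 1:
            raise ValueError("image_urls must contain exactly 1 image")
        return "image-to-video"
    # Text-to-video requires a prompt.
    if not payload.get("prompt"):
        raise ValueError("prompt is required for text-to-video")
    return "text-to-video"
```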
Query Task Results
Video generation is an asynchronous task. After submission, a task_id is returned. Use the Get Task Status interface to query generation progress and results.
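Because the task is asynchronous, a client typically polls until the task reaches a final state. This page does not document the Get Task Status endpoint or any status value other than submitted, so the sketch below leaves both as assumptions: the caller supplies fetch_status (a hypothetical function that calls the status interface and returns the task as a dict) and done_statuses (whatever final status names that interface reports).

```python
import time
from typing import Callable, Collection


def wait_for_task(
    task_id: str,
    fetch_status: Callable[[str], dict],   # hypothetical: wraps the Get Task Status call
    done_statuses: Collection[str],        # assumption: final status names from that interface
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll fetch_status until the task reports a final status or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        task = fetch_status(task_id)
        if task.get("status") in done_statuses:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish within {timeout_s} seconds")
        sleep(interval_s)
```

Passing sleep as a parameter keeps the loop easy to test without real delays.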
curl --request POST \
  --url https://toapis.com/v1/videos/generations \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "wan2.6",
    "prompt": "A cute cat running on grass",
    "aspect_ratio": "16:9",
    "resolution": "720p",
    "duration": 5
  }'
{
  "code": 200,
  "data": [
    {
      "status": "submitted",
      "task_id": "task_01J9HA7JPQ9A0Z6JZ3V8M9W6PZ"
    }
  ]
}
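The curl request above can also be prepared from Python using only the standard library. The sketch below builds the request object without sending it; the URL and headers are taken from the curl sample, while build_generation_request is a hypothetical helper name.

```python
import json
import urllib.request

API_URL = "https://toapis.com/v1/videos/generations"


def build_generation_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build a POST request for the video generation endpoint (not yet sent)."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To submit it, pass the request to urllib.request.urlopen and parse the response body with json.loads; the result should have the shape of the sample response shown above.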