POST https://toapis.com/v1/videos/generations
curl --request POST \
  --url https://toapis.com/v1/videos/generations \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "wan2.6",
    "prompt": "A cute cat running on grass",
    "aspect_ratio": "16:9",
    "resolution": "720p",
    "duration": 5
  }'
{
  "code": 200,
  "data": [
    {
      "status": "submitted",
      "task_id": "task_01J9HA7JPQ9A0Z6JZ3V8M9W6PZ"
    }
  ]
}
  • Alibaba Cloud Wanxiang video generation model
  • Supports Text-to-Video and Image-to-Video
  • Supports 720p/1080p resolution, 5/10/15 second duration
  • Supports automatic prompt extension and audio generation

Authentication

Authorization
string
required
All endpoints require Bearer Token authentication.
Get your API Key from the API Key Management Page.
Add it to the request header:
Authorization: Bearer YOUR_API_KEY
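
As an illustration of the header above, here is a minimal Python sketch that builds (but does not send) an authenticated request using only the standard library. The helper name build_generation_request is hypothetical and not part of the API; the URL and header format come from this page.

```python
import json
import urllib.request

API_URL = "https://toapis.com/v1/videos/generations"


def build_generation_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build a POST request that carries the Bearer token (hypothetical helper)."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_generation_request(
    "YOUR_API_KEY",
    {"model": "wan2.6", "prompt": "A cute cat running on grass"},
)
# Sending is left to the caller, for example urllib.request.urlopen(request).
```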

Request Parameters

model
string
required
Video generation model name. Must be wan2.6.
prompt
string
required
Video content description.
Required for text-to-video mode. Describing the scene, action, and style in detail is recommended.
Example: "A cute cat stretching in the sunshine"
image_urls
array
Reference image URL array (only 1 image is supported).
Required for image-to-video mode. Accepts publicly accessible URLs or Base64-encoded data.
Example: ["https://example.com/image.jpg"] or ["data:image/png;base64,iVBORw0KGgo..."]
The system automatically selects text-to-video or image-to-video mode based on whether image_urls is included in the request.
negative_prompt
string
Negative prompt describing unwanted content.
Example: "blurry, low quality, deformed"
aspect_ratio
string
default:"16:9"
Video aspect ratio.
Available values:
  • 16:9 - Landscape (default)
  • 9:16 - Portrait
  • 1:1 - Square
  • 4:3 - Landscape
  • 3:4 - Portrait
Default: 16:9
This parameter is not supported in image-to-video mode.
resolution
string
default:"720p"
Video resolution.
Available values:
  • 720p - Standard definition (default)
  • 1080p - High definition
Default: 720p
480p resolution is not supported.
Billing is per second of video, and the price differs by resolution. Refer to the model marketplace for details.
duration
integer
default:"5"
Video duration in seconds.
Only 5, 10, and 15 are supported.
Default: 5
seed
integer
Random seed for reproducibility.
Example: 12345
prompt_extend
boolean
Auto-extend the prompt.
When enabled, the system automatically optimizes and enriches your prompt.
audio
boolean
Auto-add audio.
When enabled, the system automatically generates matching audio for the video.
audio_url
string
URL of an audio file to use for the video.
Takes priority over the audio parameter.
The audio duration cannot exceed the video duration. If the audio is shorter than the video, the first part of the video has sound and the rest is silent.
shot_type
string
Shot type.
Available values:
  • single - Single shot
  • multi - Multi-shot
watermark
boolean
Add watermark
template
string
Effect template name for image-to-video effect mode.
When using effect mode:
  • Only one image is needed (passed through image_urls)
  • No prompt needed (model ignores prompt field)
General effects:
  • squish - Squish and squeeze
  • rotation - Spin around
  • poke - Poke fun
  • inflate - Balloon inflation
  • dissolve - Molecular diffusion
  • melt - Heat wave melt
  • icecream - Ice cream planet
  • flying - Magic levitation
Single person effects:
  • carousel - Time carousel
  • singleheart - Love you
  • dance1 - Swing moment
  • dance2 - Top dance
For more effects, refer to Alibaba Cloud Wanxiang template documentation
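
The effect-mode rules above (one image, a template name, no prompt) can be sketched as a small payload builder. This is an illustrative sketch only: build_effect_payload and its allow_unlisted flag are hypothetical names, and the template sets contain only the names listed on this page.

```python
GENERAL_EFFECTS = {
    "squish", "rotation", "poke", "inflate",
    "dissolve", "melt", "icecream", "flying",
}
SINGLE_PERSON_EFFECTS = {"carousel", "singleheart", "dance1", "dance2"}


def build_effect_payload(image_url: str, template: str,
                         allow_unlisted: bool = False) -> dict:
    """Build an effect-mode request body (hypothetical helper).

    Effect mode takes exactly one image and no prompt. More templates exist
    than are listed on this page, so allow_unlisted=True skips the name check.
    """
    listed = GENERAL_EFFECTS | SINGLE_PERSON_EFFECTS
    if not allow_unlisted and template not in listed:
        raise ValueError(f"template {template!r} is not listed on this page")
    return {
        "model": "wan2.6",
        "image_urls": [image_url],  # only one image is supported
        "template": template,
    }


payload = build_effect_payload("https://example.com/image.jpg", "squish")
```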

Resolution and Aspect Ratio Combinations

Aspect Ratio   Description           720p Size   1080p Size
16:9           Landscape (default)   1280×720    1920×1080
9:16           Portrait              720×1280    1080×1920
1:1            Square                960×960     1440×1440
4:3            Landscape             1088×832    1632×1248
3:4            Portrait              832×1088    1248×1632
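
For client code that needs the output dimensions ahead of time, the combinations above can be written as a lookup table. The sketch below is one possible shape; the name output_size is hypothetical, and the values are copied from the table.

```python
# (aspect_ratio, resolution) -> (width, height) in pixels
SIZES = {
    ("16:9", "720p"): (1280, 720),
    ("16:9", "1080p"): (1920, 1080),
    ("9:16", "720p"): (720, 1280),
    ("9:16", "1080p"): (1080, 1920),
    ("1:1", "720p"): (960, 960),
    ("1:1", "1080p"): (1440, 1440),
    ("4:3", "720p"): (1088, 832),
    ("4:3", "1080p"): (1632, 1248),
    ("3:4", "720p"): (832, 1088),
    ("3:4", "1080p"): (1248, 1632),
}


def output_size(aspect_ratio: str = "16:9", resolution: str = "720p") -> tuple:
    """Return (width, height) for a supported combination, else raise ValueError."""
    try:
        return SIZES[(aspect_ratio, resolution)]
    except KeyError:
        raise ValueError(
            f"unsupported combination: {aspect_ratio} at {resolution}"
        ) from None
```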

Response

code
integer
Response status code, 200 for success
data
array
Response data array. Each element carries the task status and the task_id, as in the example response.
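
A minimal sketch of reading this response in Python follows. It assumes the body has the shape of the example response on this page (a code field and a data array whose items carry task_id); the function name extract_task_ids is hypothetical.

```python
import json


def extract_task_ids(response_text: str) -> list:
    """Return the task_id of every item in data, or raise if code is not 200."""
    body = json.loads(response_text)
    if body.get("code") != 200:
        raise RuntimeError(f"request failed with code {body.get('code')}")
    return [item["task_id"] for item in body.get("data", [])]


sample = """
{
  "code": 200,
  "data": [
    {"status": "submitted", "task_id": "task_01J9HA7JPQ9A0Z6JZ3V8M9W6PZ"}
  ]
}
"""
task_ids = extract_task_ids(sample)
```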

Usage Scenarios

Scenario 1: Text-to-Video (Simple Request)

{
  "model": "wan2.6",
  "prompt": "A cute cat stretching in the sunshine"
}

Scenario 2: Text-to-Video (Complete Parameters)

{
  "model": "wan2.6",
  "prompt": "A cute cat running on grass",
  "negative_prompt": "blurry, low quality, deformed",
  "aspect_ratio": "16:9",
  "resolution": "720p",
  "duration": 5,
  "seed": 12345,
  "prompt_extend": true,
  "audio": true,
  "shot_type": "single",
  "watermark": false
}
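
Before sending a request like the one above, a client may want to check the enumerated parameters locally. The sketch below assumes the allowed values are exactly those listed under Request Parameters; check_enums is a hypothetical helper, and the server remains the authority on what is accepted.

```python
ALLOWED_VALUES = {
    "aspect_ratio": {"16:9", "9:16", "1:1", "4:3", "3:4"},
    "resolution": {"720p", "1080p"},
    "duration": {5, 10, 15},
    "shot_type": {"single", "multi"},
}


def check_enums(payload: dict) -> list:
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []
    for field, allowed in ALLOWED_VALUES.items():
        if field in payload and payload[field] not in allowed:
            options = sorted(allowed, key=str)
            problems.append(f"{field}={payload[field]!r} is not one of {options}")
    return problems


complete_request = {
    "model": "wan2.6",
    "prompt": "A cute cat running on grass",
    "negative_prompt": "blurry, low quality, deformed",
    "aspect_ratio": "16:9",
    "resolution": "720p",
    "duration": 5,
    "seed": 12345,
    "prompt_extend": True,
    "audio": True,
    "shot_type": "single",
    "watermark": False,
}
problems = check_enums(complete_request)
```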

Scenario 3: Image-to-Video

{
  "model": "wan2.6",
  "prompt": "Kitten running on the ground",
  "image_urls": ["https://upload.apimart.ai/f/apimart-models-images/9998233432754770-c059992d-9b01-47d5-810d-ea0502ac9279-image_task_01KD7SSXDBCEWZ869D6PF249ZW_0.png"],
  "resolution": "1080p",
  "duration": 10
}

Scenario 4: Image-to-Video (Base64 Image)

{
  "model": "wan2.6",
  "prompt": "Make the cat stand up and walk",
  "image_urls": ["data:image/png;base64,iVBORw0KGgo..."],
  "duration": 5
}
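
The Base64 form used in this scenario is a standard data URL. As a rough sketch, it can be produced from raw image bytes with Python's base64 module; to_data_url is a hypothetical helper name, and the eight bytes below are only the PNG file signature, standing in for a real image.

```python
import base64


def to_data_url(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a data URL suitable for image_urls."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Stand-in input: the PNG file signature. In practice, read the bytes of a
# real image file instead.
png_signature = b"\x89PNG\r\n\x1a\n"
payload = {
    "model": "wan2.6",
    "prompt": "Make the cat stand up and walk",
    "image_urls": [to_data_url(png_signature)],
    "duration": 5,
}
```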

Mode Description

Text-to-Video

  • Must provide prompt parameter
  • No need for image_urls parameter

Image-to-Video

  • Must provide image_urls parameter (only supports 1 image)
  • prompt parameter is optional, used to describe expected actions
The system automatically selects the mode based on whether image_urls is included in the request
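
The mode rules above can be mirrored on the client side. The following sketch assumes, as stated on this page, that the mere presence of image_urls selects image-to-video, and it also applies the note that aspect_ratio is not supported in that mode. The function name check_mode_rules is hypothetical.

```python
def check_mode_rules(payload: dict) -> tuple:
    """Return (mode, problems) for a request body, based on this page's rules."""
    problems = []
    if "image_urls" in payload:
        mode = "image-to-video"
        if len(payload["image_urls"]) != 1:
            problems.append("image-to-video supports exactly 1 image")
        if "aspect_ratio" in payload:
            problems.append("aspect_ratio is not supported in image-to-video mode")
    else:
        mode = "text-to-video"
        if not payload.get("prompt"):
            problems.append("text-to-video requires a prompt")
    return mode, problems


mode, problems = check_mode_rules(
    {"model": "wan2.6", "prompt": "A cute cat stretching in the sunshine"}
)
```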

Query Task Results

Video generation is an asynchronous task. After submission, a task_id is returned. Use the Get Task Status interface to query generation progress and results.
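
Because the task is asynchronous, a client typically polls until it reaches a final state. The sketch below shows one way to structure that loop. Several parts are assumptions, since the Get Task Status interface is not described on this page: fetch_status is a hypothetical callable you supply to query that interface, and the terminal status names "completed" and "failed" are placeholders to replace with the real values.

```python
import time
from typing import Callable


def wait_for_task(
    fetch_status: Callable[[str], dict],  # hypothetical: queries Get Task Status
    task_id: str,
    *,
    terminal_statuses=frozenset({"completed", "failed"}),  # assumed names
    interval_seconds: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll fetch_status(task_id) until its status is terminal or attempts run out."""
    for _ in range(max_attempts):
        result = fetch_status(task_id)
        if result.get("status") in terminal_statuses:
            return result
        sleep(interval_seconds)
    raise TimeoutError(f"task {task_id} did not finish after {max_attempts} attempts")
```

Passing sleep as a parameter keeps the loop testable without real delays; in normal use the default time.sleep applies.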