Wan2.6 Video Generation

Alibaba Cloud Wanxiang video generation model
Supports three modes: Text-to-Video, Image-to-Video, and Reference Video (r2v)
Server automatically routes to the appropriate upstream model based on your request parameters
Supports 720p/1080p resolution, 5/10/15 second duration
Audio is always included in the generated output

Important Change: For better performance and cost control, we no longer support passing base64 image data directly in image_urls. Please use the Upload Image API first to upload images and get URLs, then call this endpoint.

Routing Logic

The server automatically selects the upstream model based on what parameters you provide:

Parameters Provided	Upstream Mode
`metadata.reference_urls` (video URLs)	Reference Video (r2v)
`image_urls` (image)	Image-to-Video (i2v)
`prompt` only	Text-to-Video (t2v)
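The routing table can be mirrored client-side to predict which mode a request body will trigger. This is a minimal sketch, not the server's actual implementation: the function name `infer_mode` is hypothetical, and rejecting a body that carries both `image_urls` and `metadata.reference_urls` is an assumption based on the note that the two fields cannot be combined.

```python
from typing import Any, Mapping


def infer_mode(request: Mapping[str, Any]) -> str:
    """Guess which upstream mode a request body would be routed to."""
    metadata = request.get("metadata") or {}
    has_reference = bool(metadata.get("reference_urls"))
    has_image = bool(request.get("image_urls"))

    # The parameter notes say these two fields cannot be combined.
    if has_reference and has_image:
        raise ValueError("image_urls cannot be combined with metadata.reference_urls")
    if has_reference:
        return "r2v"  # Reference Video
    if has_image:
        return "i2v"  # Image-to-Video
    if request.get("prompt"):
        return "t2v"  # Text-to-Video
    raise ValueError("prompt is required when no image or reference video is given")
```

Calling `infer_mode` before submitting can catch a conflicting body locally instead of waiting for the server to reject it.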

Authentication

Authorization

string

required

All endpoints require Bearer Token authentication. Get your API Key from the API Key Management Page and add it to the request header:

Authorization: Bearer YOUR_API_KEY
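As an illustration, the header can be built once and reused across calls. This is a sketch under assumptions: the helper name `auth_headers` and the environment variable `TOAPIS_API_KEY` are hypothetical and not part of the API.

```python
import os
from typing import Dict, Optional


def auth_headers(api_key: Optional[str] = None) -> Dict[str, str]:
    """Build request headers carrying the Bearer token."""
    # TOAPIS_API_KEY is a hypothetical variable name chosen for this example.
    key = api_key or os.environ.get("TOAPIS_API_KEY", "")
    if not key:
        raise ValueError("no API key provided")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of source control; passing it explicitly is convenient in tests.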

Request Parameters

model

string

required

Video generation model name, fixed as wan2.6

prompt

string

required

Video content description. Required for text-to-video mode; optional for image-to-video and reference video modes, where it describes the expected motion or style.

Example: "A cute cat stretching in the sunshine"

image_urls

string[]

Reference image URL array for image-to-video mode (only 1 image is supported).

⚠️ URL format only (base64 is no longer supported)

Publicly accessible image URL (http:// or https://)
You can use the Upload Image API to upload local images and get URLs

Example: ["https://example.com/image.jpg"]

Note: Cannot be combined with metadata.reference_urls
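The constraints on `image_urls` (exactly one entry, http or https only, no base64) can be checked before sending a request. The following is a sketch of such a client-side check; the function name `check_image_urls` is hypothetical, and the server may apply further validation not described here.

```python
from typing import List
from urllib.parse import urlparse


def check_image_urls(image_urls: List[str]) -> str:
    """Return the single image URL if it satisfies the documented rules."""
    if len(image_urls) != 1:
        raise ValueError("image_urls must contain exactly 1 image URL")
    url = image_urls[0]
    if url.startswith("data:"):
        # base64 data URIs are no longer accepted; upload the image first.
        raise ValueError("base64 image data is not supported; pass a URL")
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("image URL must start with http:// or https://")
    return url
```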

aspect_ratio

string

default:"16:9"

Video aspect ratio (applies to text-to-video and reference video modes).

Available values:

16:9 - Landscape (default)
9:16 - Portrait
1:1 - Square
4:3 - Landscape
3:4 - Portrait

Default: 16:9

Note: Not supported in image-to-video mode

resolution

string

default:"1080p"

Video resolution.

Available values:

720p - Standard definition
1080p - High definition (default)

Default: 1080p

480p is not supported. Billing is per second of video, and the price varies by resolution; refer to the model marketplace for details.

duration

integer

default:"5"

Video duration in seconds.

Supported values: 5, 10, 15

Default: 5
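Because `resolution` and `duration` each accept only a small set of values, a request body can be normalized locally before submission. This sketch fills in the documented defaults and rejects unsupported values; the helper name `with_defaults` is hypothetical.

```python
from typing import Any, Dict, Mapping

ALLOWED_RESOLUTIONS = {"720p", "1080p"}  # 480p is not supported
ALLOWED_DURATIONS = {5, 10, 15}          # seconds


def with_defaults(request: Mapping[str, Any]) -> Dict[str, Any]:
    """Copy the request, apply documented defaults, and validate values."""
    body = dict(request)
    body.setdefault("resolution", "1080p")
    body.setdefault("duration", 5)
    if body["resolution"] not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {body['resolution']!r}")
    if body["duration"] not in ALLOWED_DURATIONS:
        raise ValueError(f"unsupported duration: {body['duration']!r}")
    return body
```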

negative_prompt

string

Negative prompt: describes content you do not want in the video.

Example: "blurry, low quality, deformed"

seed

integer

Random seed for reproducibility.

Example: 12345

prompt_extend

boolean

default:"true"

Auto-extend prompt. When enabled, the system automatically optimizes and enriches your prompt. Enabled by default; set to false to disable.

audio

boolean

Include audio in the generated video. Non-flash Wan2.6 models always include audio by default. Set to true to explicitly enable.

shot_type

string

Shot type (applies to text-to-video and reference video modes).

Available values:

single - Single continuous shot
multi - Multi-shot (cinematic cuts)

watermark

boolean

Add an Alibaba Cloud watermark to the generated video

metadata

object

Extended parameters


reference_urls

string[]

Reference Video (r2v) mode: array of reference video URLs. When provided, the server routes to the reference video upstream model (wan2.6-r2v). The model uses these videos to guide generation of consistent characters or scenes.

Each entry must be a publicly accessible video URL (http:// or https://)
Up to several reference videos are supported

Example: ["https://cdn.example.com/ref1.mp4"]

Note: Cannot be combined with image_urls

Resolution and Aspect Ratio Combinations

Aspect Ratio	Description	720p Size	1080p Size
16:9	Landscape (default)	1280×720	1920×1080
9:16	Portrait	720×1280	1080×1920
1:1	Square	960×960	1440×1440
4:3	Landscape	1088×832	1632×1248
3:4	Portrait	832×1088	1248×1632
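The table above maps directly onto a lookup keyed by aspect ratio and resolution. The sketch below encodes the listed sizes so the expected output dimensions can be computed ahead of time; the names `SIZES` and `output_size` are hypothetical.

```python
from typing import Dict, Tuple

# (aspect_ratio, resolution) -> (width, height) in pixels, copied from the table.
SIZES: Dict[Tuple[str, str], Tuple[int, int]] = {
    ("16:9", "720p"): (1280, 720),  ("16:9", "1080p"): (1920, 1080),
    ("9:16", "720p"): (720, 1280),  ("9:16", "1080p"): (1080, 1920),
    ("1:1", "720p"): (960, 960),    ("1:1", "1080p"): (1440, 1440),
    ("4:3", "720p"): (1088, 832),   ("4:3", "1080p"): (1632, 1248),
    ("3:4", "720p"): (832, 1088),   ("3:4", "1080p"): (1248, 1632),
}


def output_size(aspect_ratio: str = "16:9", resolution: str = "1080p") -> Tuple[int, int]:
    """Look up the output frame size for a documented combination."""
    try:
        return SIZES[(aspect_ratio, resolution)]
    except KeyError:
        raise ValueError(
            f"unsupported combination: {aspect_ratio} at {resolution}"
        ) from None
```

With no arguments the function returns the size for the documented defaults (16:9 at 1080p).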

Response

id

string

Unique task identifier for status queries

object

string

Object type, always generation.task

model

string

Model name used

status

string

Task status

queued - Queued for processing
in_progress - Processing
completed - Successfully completed
failed - Failed

progress

integer

Task progress percentage (0-100)

created_at

integer

Task creation timestamp (Unix timestamp)

metadata

object

Task metadata
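Since a task moves through `queued` and `in_progress` before reaching `completed` or `failed`, a client typically polls until a terminal status appears. The sketch below shows one way to structure that loop. The status endpoint is not described in this section, so the loop takes a caller-supplied `fetch_status` callable (hypothetical) that returns the task object as a dict; the interval and timeout values are arbitrary assumptions.

```python
import time
from typing import Any, Callable, Dict

TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_task(
    fetch_status: Callable[[], Dict[str, Any]],
    interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the task reaches a terminal status or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        task = fetch_status()
        if task.get("status") in TERMINAL_STATUSES:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError("task did not finish before the timeout")
        sleep(interval)
```

The caller should still inspect the returned `status`, because a `failed` task ends the loop just as a `completed` one does.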

Usage Scenarios

Scenario 1: Text-to-Video

{
  "model": "wan2.6",
  "prompt": "A cute cat running on grass, sunny day",
  "aspect_ratio": "16:9",
  "resolution": "1080p",
  "duration": 5
}

Scenario 2: Image-to-Video

{
  "model": "wan2.6",
  "prompt": "The cat starts running playfully",
  "image_urls": ["https://example.com/cat.jpg"],
  "resolution": "1080p",
  "duration": 10
}

Scenario 3: Reference Video (r2v)

{
  "model": "wan2.6",
  "prompt": "The character waves and smiles at the camera",
  "metadata": {
    "reference_urls": ["https://cdn.example.com/ref-character.mp4"]
  },
  "shot_type": "single",
  "resolution": "1080p",
  "duration": 5
}

Scenario 4: Text-to-Video (Full Parameters)

{
  "model": "wan2.6",
  "prompt": "A golden retriever runs through a field of sunflowers",
  "negative_prompt": "blurry, low quality, deformed",
  "aspect_ratio": "16:9",
  "resolution": "1080p",
  "duration": 10,
  "seed": 12345,
  "prompt_extend": true,
  "shot_type": "multi",
  "watermark": false
}

Querying Task Results

Video generation is an asynchronous task. After submission, a task ID is returned in the id field of the response. Use the Get Task Status interface to query generation progress and results.

curl --request POST \
  --url https://toapis.com/v1/videos/generations \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "wan2.6",
    "prompt": "A cute cat running on grass",
    "aspect_ratio": "16:9",
    "resolution": "1080p",
    "duration": 5
  }'

{
  "id": "video_01J9HA7JPQ9A0Z6JZ3V8M9W6PZ",
  "object": "generation.task",
  "model": "wan2.6",
  "status": "queued",
  "progress": 0,
  "created_at": 1768380224,
  "metadata": {
    "aspect_ratio": "16:9"
  }
}
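For reference, the same submission can be expressed in Python using only the standard library. This is a sketch rather than an official client: the function names `build_request` and `submit` are hypothetical, and error handling is omitted.

```python
import json
import urllib.request
from typing import Any, Dict

API_URL = "https://toapis.com/v1/videos/generations"


def build_request(api_key: str, body: Dict[str, Any]) -> urllib.request.Request:
    """Prepare the POST request without sending it."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit(api_key: str, body: Dict[str, Any]) -> Dict[str, Any]:
    """Send the request and return the parsed task object."""
    with urllib.request.urlopen(build_request(api_key, body), timeout=30) as resp:
        return json.load(resp)
```

A call such as `submit(key, {"model": "wan2.6", "prompt": "A cute cat running on grass"})` would return the task object, whose `id` can then be passed to the status query.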
