Tongyi Wanxiang Wan2.1 - An open-source video generation model developed by Alibaba's Tongyi Lab.
## Definition of Wan2.1
Tongyi Wanxiang Wan2.1 is an open-source suite of video generation models developed by Alibaba's Tongyi Lab. It comprises four models: two text-to-video (T2V) models (1.3B and 14B parameters) and two image-to-video (I2V) models (both 14B parameters), supporting 480P and 720P output. The suite is designed for high-quality video generation, with capabilities such as complex motion handling, multilingual text effects, and efficient encoding.
## Unique Features of Wan2.1
Wan2.1 offers the following key features:
- **High-quality video generation**: Produces realistic visuals with adherence to physical laws.
- **Complex motion handling**: Supports smooth and consistent motion in dynamic scenes like sports.
- **Multilingual text effects**: Generates advanced Chinese and English text effects for creative applications.
- **Efficient encoding**: Uses a custom VAE and Diffusion Transformer (DiT) architecture, reducing memory usage by 29% and improving generation speed.
- **Physical law simulation**: Accurately simulates collisions, rebounds, and other physical interactions.
- **Long-context training**: Ensures high consistency between text prompts and generated videos.
## Benchmark Performance of Wan2.1
In the VBench evaluation, Wan2.1 achieved a total score of 86.22%, ranking first among competing models such as OpenAI's Sora, MiniMax, Luma, Gen-3, and Pika. These results reflect its strength in video generation quality, motion handling, and text-video alignment.
## Model Specifications and Resolutions
Wan2.1 includes the following models (an argument-selection sketch follows the list):
- **T2V-14B**: 14B parameters, supports 480P/720P resolutions, optimized for complex motion.
- **T2V-1.3B**: 1.3B parameters, supports 480P (720P output is less stable); runs on consumer-grade GPUs with as little as 8.19 GB of VRAM.
- **I2V-14B-720P**: 14B parameters, 720P resolution, for high-resolution image-to-video generation.
- **I2V-14B-480P**: 14B parameters, 480P resolution, for efficient image-to-video generation.
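For orientation, the sketch below maps each variant to plausible `generate.py` arguments. The task names, `--size` strings (including `832*480` for 480P), and checkpoint directory names are assumptions drawn from the run examples later on this page; check the official README for the exact values.

```python
# Illustrative only: map each Wan2.1 variant to plausible generate.py arguments.
# Task names, size strings, and checkpoint paths are assumptions based on the
# examples in this page; consult the official README for the exact values.
WAN21_VARIANTS = {
    "T2V-14B":      {"task": "t2v-14B", "size": "1280*720", "ckpt_dir": "./Wan2.1-T2V-14B"},
    "T2V-1.3B":     {"task": "t2v-1.3B", "size": "832*480", "ckpt_dir": "./Wan2.1-T2V-1.3B"},
    "I2V-14B-720P": {"task": "i2v-14B", "size": "1280*720", "ckpt_dir": "./Wan2.1-I2V-14B-720P"},
    "I2V-14B-480P": {"task": "i2v-14B", "size": "832*480", "ckpt_dir": "./Wan2.1-I2V-14B-480P"},
}

def build_command(variant: str, prompt: str, image: str | None = None) -> list[str]:
    """Assemble a generate.py command line for the chosen model variant."""
    cfg = WAN21_VARIANTS[variant]
    cmd = [
        "python", "generate.py",
        "--task", cfg["task"],
        "--size", cfg["size"],
        "--ckpt_dir", cfg["ckpt_dir"],
        "--prompt", prompt,
    ]
    if image is not None:  # the image-to-video variants also need an input image
        cmd += ["--image", image]
    return cmd

print(" ".join(build_command("T2V-1.3B", "A cat surfing a wave at sunset")))
```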
## Installation and Usage of Wan2.1
To install and use Wan2.1 (a combined download-and-run sketch follows these steps):
1. **Clone the repository**: `git clone https://github.com/Wan-Video/Wan2.1.git`.
2. **Install dependencies**: Run `pip install -r requirements.txt` (requires torch >= 2.4.0).
3. **Download models**: Obtain models from Hugging Face (e.g., [T2V-14B](https://huggingface.co/Wan-AI/Wan2.1-T2V-14B)) or ModelScope.
4. **Run examples**:
- Text-to-video: `python generate.py --task t2v-14B --size 1280*720 --ckpt_dir ./Wan2.1-T2V-14B --prompt "Your prompt here"`.
- Image-to-video: `python generate.py --task i2v-14B --size 1280*720 --ckpt_dir ./Wan2.1-I2V-14B-720P --image input_image.JPG --prompt "Your prompt here"`.
5. **Community support**: Join [Discord](https://discord.gg/AKNgpMK4Yj) or the WeChat group for assistance.
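As a minimal end-to-end sketch of steps 3 and 4, the snippet below downloads the T2V-14B checkpoint with `huggingface_hub` and then calls `generate.py` with the same arguments as the text-to-video example above. It assumes `huggingface_hub` is installed (`pip install huggingface_hub`) and that the script is run from the root of the cloned repository; the prompt is only a placeholder.

```python
# Minimal sketch of steps 3-4: download weights, then run text-to-video generation.
# Assumes huggingface_hub is installed and the script runs from the repo root.
import subprocess
from huggingface_hub import snapshot_download

# Step 3: fetch the T2V-14B weights into a local checkpoint directory.
ckpt_dir = snapshot_download(
    repo_id="Wan-AI/Wan2.1-T2V-14B",
    local_dir="./Wan2.1-T2V-14B",
)

# Step 4: invoke the text-to-video example with a placeholder prompt.
subprocess.run(
    [
        "python", "generate.py",
        "--task", "t2v-14B",
        "--size", "1280*720",
        "--ckpt_dir", ckpt_dir,
        "--prompt", "A lighthouse on a cliff during a thunderstorm",
    ],
    check=True,
)
```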
## Applications of Wan2.1
Wan2.1 is suitable for:
- **Content creation**: Generating short videos with artistic styles (e.g., oil painting, cyberpunk).
- **Advertising**: Designing dynamic ads with personalized text effects.
- **Education**: Creating immersive educational videos with dynamic demonstrations.
- **Film**: Producing cinematic scenes with complex motion and physics.
- **Gaming**: Generating virtual environments for game development.
## Access Points for Wan2.1
Users can access the following resources:
- **GitHub repository**: [Wan-Video/Wan2.1](https://github.com/Wan-Video/Wan2.1).
- **Model downloads**: [Hugging Face](https://huggingface.co/Wan-AI/Wan2.1-T2V-14B) or [ModelScope](https://www.modelscope.cn/models/Wan-AI/Wan2.1-T2V-14B).
- **Official platform**: [Wan Video](https://wan.video/).
- **Community support**: [Discord](https://discord.gg/AKNgpMK4Yj) or WeChat group.
### Citation sources
- [Tongyi Wanxiang Wan2.1](https://github.com/Wan-Video/Wan2.1) - Official URL
Updated: 2025-04-01