Janus-Pro - A multimodal understanding and generation framework excelling in text-to-image tasks.
## Janus-Pro Model Variants
Janus-Pro includes two primary variants:
- **Janus-Pro-1B**: Based on DeepSeek-LLM-1.5b-base
- **Janus-Pro-7B**: Based on DeepSeek-LLM-7b-base
Both variants support a sequence length of 4096 tokens and handle multimodal understanding as well as image generation.
## Benchmark Performance Comparison
According to the project's reported results, Janus-Pro surpasses both DALL-E 3 and Stable Diffusion in text-to-image generation:
- Higher scores on the **GenEval** and **DPG-Bench** benchmarks
- More stable text-to-image generation

It also shows strong multimodal understanding capabilities, such as converting math formulas to LaTeX.
## Visual Encoding Technology
Janus-Pro employs **SigLIP-L (ViT-L-16-SigLIP-384)** as its visual encoder for multimodal understanding:
- The encoder supports **384×384 resolution** image inputs
- Visual encoding is decoupled, so understanding and generation use separate pathways
- Image generation relies on a separate tokenizer with a 16× downsampling rate
## Model Availability
Models are available at:
- **Hugging Face**:
  - [Janus-Pro-1B](https://huggingface.co/deepseek-ai/Janus-Pro-1B)
  - [Janus-Pro-7B](https://huggingface.co/deepseek-ai/Janus-Pro-7B)
- **GitHub**: the [deepseek-ai/Janus repository](https://github.com/deepseek-ai/Janus) contains the installation guide; Python ≥3.8 is required.
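
As a rough illustration of fetching a checkpoint from the Hugging Face links above, the sketch below maps a variant name to its repository ID and downloads it with `snapshot_download` from the `huggingface_hub` package. The helper names `repo_id_for` and `download_checkpoint` are hypothetical and not part of the Janus codebase; the sketch also assumes `huggingface_hub` is installed.

```python
from typing import Optional

# Repository IDs taken from the Hugging Face links listed above.
JANUS_PRO_REPOS = {
    "1B": "deepseek-ai/Janus-Pro-1B",
    "7B": "deepseek-ai/Janus-Pro-7B",
}


def repo_id_for(variant: str) -> str:
    """Return the Hugging Face repo ID for a variant name such as '1B' or '7b'."""
    key = variant.strip().upper()
    if key not in JANUS_PRO_REPOS:
        raise ValueError(f"Unknown variant {variant!r}; expected one of {sorted(JANUS_PRO_REPOS)}")
    return JANUS_PRO_REPOS[key]


def download_checkpoint(variant: str, local_dir: Optional[str] = None) -> str:
    """Download all files of the chosen variant and return the local path.

    Hypothetical helper: it only wraps huggingface_hub.snapshot_download.
    """
    # Imported lazily so repo_id_for() works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id_for(variant), local_dir=local_dir)
```

Calling `download_checkpoint("1B")` would place the smaller variant in the local Hugging Face cache; loading the weights afterwards follows the installation guide in the GitHub repository.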
## Technical Features
Key innovations include:
1. **Decoupled visual encoding** for reduced task conflict
2. **Unified transformer architecture** across model sizes
3. **Specialized tokenizer** from FoundationVision/LlamaGen (16× downsampling)
4. **Optimized training strategy** and expanded training data
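
To make the 16× downsampling figure concrete, the short calculation below combines it with the 384×384 resolution and the 4096-token sequence length quoted in this document. It is a back-of-the-envelope sketch that assumes a square image, one token per downsampled grid cell, and no text tokens in the sequence; the function name `image_token_grid` is hypothetical.

```python
IMAGE_SIZE = 384     # image resolution quoted in this document
DOWNSAMPLE = 16      # tokenizer downsampling rate quoted in this document
CONTEXT_LEN = 4096   # sequence length quoted in this document


def image_token_grid(image_size: int, downsample: int) -> tuple:
    """Return (grid side, total tokens) for a square image.

    Assumption: each downsampled grid cell becomes exactly one token.
    """
    if image_size % downsample != 0:
        raise ValueError("image_size must be divisible by downsample")
    side = image_size // downsample
    return side, side * side


side, tokens_per_image = image_token_grid(IMAGE_SIZE, DOWNSAMPLE)
images_per_context = CONTEXT_LEN // tokens_per_image
leftover_tokens = CONTEXT_LEN - images_per_context * tokens_per_image

print(f"grid: {side}x{side}, tokens per image: {tokens_per_image}")
print(f"images per {CONTEXT_LEN}-token context: {images_per_context}, leftover: {leftover_tokens}")
```

Under these assumptions a single image occupies a 24×24 grid, or 576 tokens, so the image itself takes up a modest share of the 4096-token sequence and leaves room for the text prompt.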
## Online Demonstration Options
Demo options include:
- Online: [Janus-Pro-7B Gradio demo](https://huggingface.co/spaces/deepseek-ai/Janus-Pro-7B)
- Local: FastAPI-based testing via `python demo/fastapi_client.py`
## Licenses for Janus-Pro-7B
- **Code**: MIT License ([LICENSE-CODE](https://github.com/deepseek-ai/DeepSeek-LLM/blob/HEAD/LICENSE-CODE))
- **Models**: DeepSeek Model License ([LICENSE-MODEL](https://github.com/deepseek-ai/DeepSeek-LLM/blob/HEAD/LICENSE-MODEL)).
## Research Publications
Key references:
- Janus-Pro Tech Report: [janus_pro_tech_report.pdf](https://github.com/deepseek-ai/Janus/blob/main/janus_pro_tech_report.pdf)
- Janus (base model): [arXiv:2410.13848](https://arxiv.org/abs/2410.13848)
- JanusFlow: [arXiv:2411.07975](https://arxiv.org/abs/2411.07975).
### Citation sources:
- [Janus-Pro](https://hf-mirror.com/deepseek-ai/Janus-Pro-7B) - Official URL
Updated: 2025-04-01