
Janus-Pro - A multimodal understanding and generation framework excelling in text-to-image tasks.

## Janus-Pro Model Variants

Janus-Pro includes two primary variants:

- **Janus-Pro-1B**: Based on DeepSeek-LLM-1.5b-base
- **Janus-Pro-7B**: Based on DeepSeek-LLM-7b-base

Both support a sequence length of 4096 tokens and handle multimodal understanding and generation tasks.

## Benchmark Performance Comparison

Janus-Pro is reported to surpass both DALL-E 3 and Stable Diffusion in:

- **GenEval** and **DPG-Bench** benchmark tests
- Text-to-image generation stability

It also offers multimodal understanding capabilities, such as converting math formulas to LaTeX.

## Visual Encoding Technology

Janus-Pro employs **SigLIP-L (ViT-L-16-SigLIP-384)** as its visual encoder, which:

- Supports **384×384 resolution** image inputs
- Uses decoupled visual encoding to separate the understanding and generation roles
- Is paired, on the generation side, with an image tokenizer that uses 16× downsampling

## Model Availability

Models are available at:

- **Hugging Face**:
  - [Janus-Pro-1B](https://huggingface.co/deepseek-ai/Janus-Pro-1B)
  - [Janus-Pro-7B](https://huggingface.co/deepseek-ai/Janus-Pro-7B)
- **GitHub**: [Installation guide](https://github.com/deepseek-ai/Janus), which requires Python ≥3.8

## Technical Features

Key innovations include:

1. **Decoupled visual encoding** for reduced conflict between the understanding and generation tasks
2. **Unified transformer architecture** across model sizes
3. **Specialized tokenizer** from FoundationVision/LlamaGen (16× downsampling)
4. Optimized training strategies and expanded datasets

## Online Demonstration Options

Online demos include:

- [Janus-Pro-7B Gradio demo](https://huggingface.co/spaces/deepseek-ai/Janus-Pro-7B)
- FastAPI-based testing via `python demo/fastapi_client.py`

## Licenses for Janus-Pro-7B

- **Code**: MIT License ([LICENSE-CODE](https://github.com/deepseek-ai/DeepSeek-LLM/blob/HEAD/LICENSE-CODE))
- **Models**: DeepSeek Model License ([LICENSE-MODEL](https://github.com/deepseek-ai/DeepSeek-LLM/blob/HEAD/LICENSE-MODEL))
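
The 384×384 resolution and the 16× downsampling rate mentioned above can be combined into a quick estimate of how many image tokens one generated image occupies. The sketch below is only illustrative arithmetic: the helper name `token_grid` is hypothetical, and it assumes the downsampling factor applies to each spatial dimension of a square image.

```python
from typing import Tuple


def token_grid(image_size: int = 384, downsample: int = 16) -> Tuple[int, int]:
    """Return (tokens_per_side, total_tokens) for a square image.

    Hypothetical helper: assumes the downsampling factor is applied
    independently to the height and the width of the image.
    """
    if image_size <= 0 or downsample <= 0:
        raise ValueError("image_size and downsample must be positive")
    if image_size % downsample != 0:
        raise ValueError("image_size must be divisible by downsample")
    side = image_size // downsample  # 384 // 16 = 24 tokens per side
    return side, side * side         # 24 * 24 = 576 tokens in total


side, total = token_grid()
print(f"{side} x {side} grid, {total} image tokens")
# prints "24 x 24 grid, 576 image tokens"
```

Under that assumption, a single 384×384 image takes 576 tokens, which fits well within the 4096-token sequence length quoted for both variants.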
## Research Publications

Key references:

- Janus-Pro Tech Report: [janus_pro_tech_report.pdf](https://github.com/deepseek-ai/Janus/blob/main/janus_pro_tech_report.pdf)
- Janus (base model): [arXiv:2410.13848](https://arxiv.org/abs/2410.13848)
- JanusFlow: [arXiv:2411.07975](https://arxiv.org/abs/2411.07975)

### Citation sources:

- [Janus-Pro](https://hf-mirror.com/deepseek-ai/Janus-Pro-7B) - Official URL

Updated: 2025-04-01