Which hardware is used for training DeepSeek models?

Answers ( 1 )

    2025-04-01T11:50:18+00:00

    DeepSeek models have been trained on the following hardware:
    - **GPUs**: Nvidia A100 and H800.
    - **Cluster**: Fire-Flyer 2 (625 nodes with 8 GPUs each, for 5,000 PCIe A100 GPUs in total, later upgraded with NVLink).
    - **Interconnect**: 200 Gbps links between nodes for distributed training.
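
As a quick sanity check on those figures, here is a minimal Python sketch. It only does arithmetic on the numbers quoted above. The `transfer_seconds` helper and the 25 GB payload are hypothetical, added purely for illustration, and the result is an idealized estimate that ignores latency and protocol overhead.

```python
# Figures quoted in the answer above.
NODES = 625
TOTAL_GPUS = 5000
LINK_GBPS = 200  # link speed in gigabits per second

# GPUs per node implied by the cluster totals.
gpus_per_node = TOTAL_GPUS // NODES

# Convert the link speed from gigabits to gigabytes per second (8 bits per byte).
link_gigabytes_per_s = LINK_GBPS / 8


def transfer_seconds(payload_gigabytes: float, link_gbps: float = LINK_GBPS) -> float:
    """Ideal time to move a payload over one link (hypothetical helper).

    Ignores latency, protocol overhead, and contention, so real transfers
    will take longer.
    """
    return payload_gigabytes * 8 / link_gbps


print(f"GPUs per node: {gpus_per_node}")
print(f"Link throughput: {link_gigabytes_per_s} GB/s")
print(f"Ideal time to send 25 GB: {transfer_seconds(25)} s")
```

So the cluster numbers are consistent with 8 GPUs per node, and a 200 Gbps link moves at most 25 GB per second.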
