What are the technical details of DeepSeek-V3?
Question
Answers (1)
DeepSeek-V3 is a 671B-parameter Mixture-of-Experts (MoE) language model that activates only 37B parameters per token. It was trained using 2.788 million H800 GPU hours and outperforms other open-source models on many benchmarks.
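The gap between total and activated parameters comes from MoE routing: a router sends each token to only a few experts, so most weights sit idle on any given forward pass. A minimal sketch of top-k expert routing, with toy sizes (not DeepSeek-V3's real configuration, which also uses finer-grained experts and a shared expert):

```python
# Minimal sketch of Mixture-of-Experts (MoE) top-k routing, illustrating
# how a model can hold many parameters yet activate only a fraction per
# token. All sizes below are toy values, not DeepSeek-V3's real config.
import numpy as np

rng = np.random.default_rng(0)

d_model = 8        # toy hidden size
n_experts = 16     # toy expert count
top_k = 2          # experts activated per token

# Each expert is a tiny feed-forward layer (toy weights).
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]
# The router scores every token against every expert.
router = rng.standard_normal((d_model, n_experts))

def moe_forward(x: np.ndarray) -> np.ndarray:
    """Route each token (row of x) to its top_k experts, mix their outputs."""
    logits = x @ router                               # (tokens, n_experts)
    chosen = np.argsort(logits, axis=-1)[:, -top_k:]  # selected expert ids
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        # Softmax over only the selected experts' scores.
        scores = logits[t, chosen[t]]
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
        for w, e in zip(weights, chosen[t]):
            out[t] += w * (x[t] @ experts[e])         # only top_k experts run
    return out

tokens = rng.standard_normal((4, d_model))
y = moe_forward(tokens)
print(y.shape)  # per token, only 2 of the 16 experts did any work
```

With top_k = 2 of 16 experts, only about one eighth of the expert weights participate per token, which is the same principle behind 37B active out of 671B total.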