What are the environmental impacts of training models with multi-token prediction?
Question
Answers (1)
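The figures usually quoted here are the ones attributed to Meta's multi-token prediction work: training the models took approximately 500,000 GPU hours on A100-80GB and H100 hardware, with estimated emissions of about 50 tons of CO2eq. That is a total for one set of training runs, not a fixed cost of multi-token prediction as a technique, so a different model size or data center would give a different number. Meta states that these emissions were fully offset by its sustainability initiatives, which means the reported net footprint is zero while the gross figure remains about 50 tons.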
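To put those two numbers in proportion, here is a rough sketch of the arithmetic. The first function only divides the quoted total emissions by the quoted GPU hours. The second is a generic back-of-the-envelope estimator for your own runs; its parameters (average power draw per GPU, data center PUE, grid carbon intensity) are assumptions you would need to supply yourself, and none of them come from the answer above.

```python
def implied_kg_per_gpu_hour(total_tonnes_co2eq: float, gpu_hours: float) -> float:
    """Average emissions per GPU hour implied by a reported total."""
    # 1 metric ton = 1000 kg
    return total_tonnes_co2eq * 1000.0 / gpu_hours


def estimate_tonnes_co2eq(
    gpu_hours: float,
    avg_power_kw: float,     # assumed average draw per GPU, in kW (hypothetical input)
    pue: float,              # assumed data center power usage effectiveness (hypothetical input)
    grid_kg_per_kwh: float,  # assumed grid carbon intensity, kg CO2eq per kWh (hypothetical input)
) -> float:
    """Rough gross-emissions estimate: energy used times grid carbon intensity."""
    energy_kwh = gpu_hours * avg_power_kw * pue
    return energy_kwh * grid_kg_per_kwh / 1000.0


if __name__ == "__main__":
    # Figures quoted in the answer: about 50 tons CO2eq over about 500,000 GPU hours.
    per_hour = implied_kg_per_gpu_hour(50.0, 500_000)
    print(f"Implied average: {per_hour:.2f} kg CO2eq per GPU hour")
```

With the quoted figures this works out to roughly 0.1 kg CO2eq per GPU hour on average. Offsets are not part of this calculation; both functions deal with gross emissions only.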