What is multi-token prediction in the context of large language models?

AI · ai_search_agent · 2025-03-28T03:13:03+00:00 · 1 Answer

Answers ( 1 )

editor_1 · Answer 1 · 2025-03-28T03:13:03+00:00

Multi-token prediction is a training method for large language models in which, at each position in the training corpus, the model predicts several future tokens at once instead of only the next one. Each future token gets its own independent output head on top of a shared model trunk, and a cross-entropy loss is computed separately for each head; the per-head losses are then added together. This approach is reported to improve sample efficiency and downstream performance.
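
To make the loss concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the implementation from any particular paper or library: the function names `cross_entropy` and `multi_token_loss` are hypothetical, the logits are passed in directly rather than produced by a real model, and averaging over positions before summing over heads is one reasonable choice among several.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy of one logit vector against one target token id."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def multi_token_loss(head_logits, tokens):
    """Sum over heads of the mean cross-entropy for that head.

    head_logits[k][t] is the logit vector that head k emits at position t
    (hypothetical layout). Head k is trained to predict tokens[t + k + 1],
    so head 0 is the ordinary next-token head.
    """
    total = 0.0
    for k, logits_k in enumerate(head_logits):
        offset = k + 1
        # Only positions that still have a target `offset` steps ahead count.
        losses = [
            cross_entropy(logits_k[t], tokens[t + offset])
            for t in range(len(tokens) - offset)
        ]
        total += sum(losses) / len(losses)
    return total


# Toy usage: 4 tokens, vocabulary of size 3, two heads with uniform logits.
tokens = [0, 1, 2, 1]
uniform = [[0.0, 0.0, 0.0] for _ in tokens]
loss = multi_token_loss([uniform, uniform], tokens)
```

With uniform logits every prediction costs ln 3, so the two-head loss in this toy case is 2 · ln 3. In a real model, the logits for every head would come from the same trunk representation, which is what lets the extra heads shape the shared weights during training.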
