What are the recommended settings for using QwQ-32B?

Question

Answers ( 3 )

    2025-03-31T17:48:04+00:00

    The recommended settings for QwQ-32B are a temperature of 0.6, TopP of 0.95, MinP of 0, TopK between 20 and 40, and a presence_penalty between 0 and 2. Enable YaRN for inputs longer than 8,192 tokens. In multi-turn conversations, keep only the final responses in the history and leave out the thinking content. For math problems, prompt the model to reason step by step and to put the final answer inside `\boxed{}`.
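
    The multi-turn advice can be sketched as a small history filter. This is only an illustration: it assumes the reasoning is wrapped in `<think>...</think>` tags and that messages are dicts with `role` and `content` keys; `strip_thinking` and `clean_history` are hypothetical helper names, not part of any library.

```python
import re

# Assumption: reasoning is delimited by <think>...</think> tags.
THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", re.DOTALL)


def strip_thinking(text: str) -> str:
    """Return the text with any reasoning block removed."""
    # Some chat templates open the <think> tag in the prompt, so the
    # generated text may contain only the closing tag.
    if "</think>" in text and "<think>" not in text:
        text = text.split("</think>", 1)[1]
    return THINK_BLOCK.sub("", text).strip()


def clean_history(messages: list[dict]) -> list[dict]:
    """Copy the history, keeping only final outputs for assistant turns."""
    cleaned = []
    for message in messages:
        if message["role"] == "assistant":
            # Build a new dict so the caller's history is not mutated.
            message = {**message, "content": strip_thinking(message["content"])}
        cleaned.append(message)
    return cleaned
```

    Call `clean_history(history)` before appending the next user turn, so that earlier reasoning never re-enters the prompt.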

    2025-03-31T17:52:05+00:00

    For best results with QwQ-32B, follow these guidelines. For inputs longer than 8,192 tokens, enable YaRN by modifying the config.json file. Set the sampling parameters (temperature 0.6, TopP 0.95, and a presence_penalty between 0 and 2) to reduce repetition. In multi-turn dialogues, the history should contain only the final outputs, not the reasoning process. Use standardized output formats for tasks such as mathematical problem-solving and multiple-choice questions.
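
    As a rough sketch of the config.json change, the snippet below adds a `rope_scaling` entry to a loaded config. The specific values (`factor` 4.0, `original_max_position_embeddings` 32768, `type` "yarn") are an assumption based on my recollection of the model card, so check them against the card for your checkpoint. `with_yarn` and `enable_yarn_in_file` are hypothetical helper names.

```python
import json
from pathlib import Path

# Assumed values; verify against the model card before using them.
YARN_ROPE_SCALING = {
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
    "type": "yarn",
}


def with_yarn(config: dict) -> dict:
    """Return a copy of the config with a YaRN rope_scaling entry set."""
    updated = dict(config)
    updated["rope_scaling"] = dict(YARN_ROPE_SCALING)
    return updated


def enable_yarn_in_file(path: str) -> None:
    """Rewrite a config.json file in place with YaRN enabled."""
    config_path = Path(path)
    config = json.loads(config_path.read_text(encoding="utf-8"))
    config_path.write_text(
        json.dumps(with_yarn(config), indent=2) + "\n", encoding="utf-8"
    )
```

    Keep a backup of the original config.json, since the change only makes sense for long inputs and you may want to revert it for short ones.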

    2025-03-31T18:34:12+00:00

    The recommended settings for using QwQ-32B are:
    - **Temperature**: 0.6
    - **TopP**: 0.95
    - **MinP**: 0
    - **TopK**: 20-40
    - **Presence Penalty**: 0-2
    These settings help balance the quality and diversity of the generated outputs.
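
    One way to keep these values in one place is a small helper that builds the settings and rejects out-of-range choices. This is a sketch only: the snake_case key names are an assumption (many inference servers use them, but check what yours accepts), and `qwq_sampling_params` is a hypothetical function name.

```python
def qwq_sampling_params(top_k: int = 40, presence_penalty: float = 0.0) -> dict:
    """Build the settings listed above, validating the two ranged values."""
    if not 20 <= top_k <= 40:
        raise ValueError("top_k should be between 20 and 40")
    if not 0 <= presence_penalty <= 2:
        raise ValueError("presence_penalty should be between 0 and 2")
    return {
        "temperature": 0.6,
        "top_p": 0.95,
        "min_p": 0.0,
        "top_k": top_k,
        "presence_penalty": presence_penalty,
    }


if __name__ == "__main__":
    # Example: a mid-range top_k with a mild repetition penalty.
    print(qwq_sampling_params(top_k=30, presence_penalty=1.0))
```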
