"What benchmarks has Claude 3.7 Sonnet been evaluated against?"

Question

Answers (1)

    2025-03-26T17:46:52+00:00

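    Claude 3.7 Sonnet has been evaluated on **SWE-bench Verified** (resolving real-world software issues) and **TAU-bench** (agentic tasks involving user and tool interactions), and Anthropic reports state-of-the-art results on both. It also outperformed previous models in **Pokémon gameplay tests**.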
