LM Arena - An open platform for evaluating AI through human preference-based crowdsourced benchmarking.
## Purpose of LM Arena
The primary purpose of LM Arena is to evaluate AI models through crowdsourced human preferences: users compare large language models (LLMs) by voting on their responses, and those votes update a leaderboard.
## Model Comparison in LM Arena
LM Arena allows users to select two AI models, input a prompt, and generate responses from both models. Users can then compare the responses and vote on which model performed better, contributing to the leaderboard's updates.
## Ranking System in LM Arena
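LM Arena uses the Elo rating system, the same method used for international chess rankings, to update the leaderboard dynamically. Each user vote is treated as a match result between two models, and the winner's rating rises while the loser's falls, by an amount that depends on how unexpected the result was.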
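To make the rating mechanics concrete, here is a minimal sketch of a standard Elo update applied to a single vote. The function names, the K-factor of 32, and the 400-point scale are the conventional chess defaults, used here as assumptions; the source does not state which parameters LM Arena uses, and tie handling (a score of 0.5) is also an assumption.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that model A beats model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def update_elo(rating_a: float, rating_b: float, score_a: float,
               k: float = 32.0) -> tuple[float, float]:
    """Return the new (rating_a, rating_b) after one vote.

    score_a is 1.0 if A was preferred, 0.0 if B was preferred,
    and 0.5 for a tie (tie support is an assumption of this sketch).
    k is a hypothetical K-factor, not a documented LM Arena value.
    """
    exp_a = expected_score(rating_a, rating_b)
    exp_b = 1.0 - exp_a
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * ((1.0 - score_a) - exp_b)
    return new_a, new_b


if __name__ == "__main__":
    # Two equally rated models; the user votes for model A.
    print(update_elo(1000.0, 1000.0, 1.0))
```

With equal starting ratings the expected score is 0.5, so a win moves each rating by half the K-factor. An upset win over a much higher-rated model would move the ratings by more, and an expected win by less.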
## Key Features of LM Arena
The key features of LM Arena include free access for testing and comparing AI models, the ability to select and compare two models, a prompt input and response generation system, a voting mechanism for user feedback, and an Elo-based leaderboard for ranking models.
## How to Participate in LM Arena
Users can participate in LM Arena by visiting the platform, selecting two AI models to compare, inputting a prompt, reviewing the generated responses, voting on the better response, and contributing to the leaderboard updates.
## Historical Background of LM Arena
LM Arena is believed to be a rebranding or continuation of the Chatbot Arena project, previously associated with the Large Model Systems Organization (LMSYS Org). The platform now operates primarily under the [lmarena.ai](https://lmarena.ai/) domain, though some historical and organizational controversies remain.
## Data Collection in LM Arena
LM Arena collects data through anonymous pairwise battles, in which a user votes on the better of two responses to the same prompt. Over many battles, this produces a large dataset of human preferences, which is used to update the leaderboard.
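As an illustration of what such a preference dataset might look like, the sketch below stores each battle as a small record and computes a raw win rate per model. The `Battle` record, its field names, and the `win_rates` helper are hypothetical and chosen for illustration only; they do not reflect LM Arena's actual schema or scoring.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Battle:
    """One vote: two models answered the same prompt (hypothetical schema)."""
    model_a: str
    model_b: str
    winner: str  # "model_a" or "model_b"


def win_rates(battles: list[Battle]) -> dict[str, float]:
    """Fraction of battles each model won, over the battles it took part in."""
    wins: dict[str, int] = defaultdict(int)
    games: dict[str, int] = defaultdict(int)
    for battle in battles:
        if battle.winner not in ("model_a", "model_b"):
            raise ValueError(f"unexpected winner value: {battle.winner!r}")
        games[battle.model_a] += 1
        games[battle.model_b] += 1
        preferred = battle.model_a if battle.winner == "model_a" else battle.model_b
        wins[preferred] += 1
    return {model: wins[model] / games[model] for model in games}


if __name__ == "__main__":
    votes = [
        Battle("x", "y", "model_a"),
        Battle("x", "z", "model_a"),
        Battle("y", "z", "model_b"),
    ]
    print(win_rates(votes))
```

A raw win rate ignores the strength of each opponent, which is one reason a rating system such as Elo is used for the leaderboard rather than a simple tally.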
## Supported AI Models in LM Arena
LM Arena supports a variety of popular large language models (LLMs), including GPT-4, Llama 3, and others, allowing users to compare both commercial and open-source models.
## Benefits of the Elo Rating System in LM Arena
Because ratings are recalculated as user votes arrive, the Elo-based leaderboard stays dynamic rather than fixed. And because those votes come from users' own prompts, the ranking emphasizes the practical utility of models in real-world scenarios.
## Key URLs for LM Arena
The key URLs associated with LM Arena include the main platform at [lmarena.ai](https://lmarena.ai/), the LMSYS Org blog at [lmsys.org/blog](https://lmsys.org/blog/2023-05-03-arena/), and the Hugging Face space for the leaderboard at [Hugging Face](https://huggingface.co/spaces/lmarena-ai/chatbot-arena-leaderboard).
### Citation sources:
- [LM Arena](https://lmarena.ai) - Official URL
Updated: 2025-03-31