
# AirRAG: A retrieval-augmented generation method leveraging tree search to enhance reasoning in large language models

## Definition of AirRAG

AirRAG is a retrieval-augmented generation (RAG) method developed by Alibaba. It aims to enhance the reasoning capabilities of large language models (LLMs) on complex, knowledge-intensive tasks by leveraging tree search. The method expands the solution space using five core reasoning actions, Monte Carlo Tree Search (MCTS), and self-consistency verification.

## Core Reasoning Actions in AirRAG

The five core reasoning actions in AirRAG are:

1. **System Analysis (SAY)**: Analyzes the problem structure and decomposes or plans the solution, reflecting systematic and holistic thinking.
2. **Direct Answer (DA)**: Uses the LLM's parametric knowledge to provide an answer without external information.
3. **Retrieval Answer (RA)**: Retrieves information from an external knowledge base to support reasoning.
4. **Query Transformation (QT)**: Transforms the query (e.g., rewriting, back-prompting, follow-up questions, multi-query) to optimize retrieval.
5. **Summary Answer (SA)**: Combines intermediate steps, answers, and the initial question to generate a final answer.

## Use of Monte Carlo Tree Search in AirRAG

AirRAG uses Monte Carlo Tree Search (MCTS) to generate, expand, and backtrack the reasoning process. It employs the Upper Confidence Bound applied to trees (UCT) rule for node selection:

`UCT(s, p) = Q(s, a) / N(s) + w * sqrt(log N(p) / N(s))`

where `Q(s, a)` is the accumulated reward of node `s`, `N(s)` is its visit count, `N(p)` is the visit count of its parent `p`, and the weight `w` balances exploration against exploitation.

## Self-Consistency Verification in AirRAG

Self-consistency verification in AirRAG selects the optimal answer node using Jaccard similarity, `jcdScore = (1/N) * Σ (|Ai ∩ Aj| / |Ai ∪ Aj|)`, and text-embedding similarity, `embScore = (1/N) * Σ cos(Ei, Ej)`, where `Ai` is the token set of candidate answer `i` and `Ei` is its embedding vector. This ensures the consistency and reliability of the selected answer.
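As a rough illustration of the two formulas above, here is a minimal Python sketch of UCT-based child selection and the Jaccard consistency score. The `Node` structure and all function names are our own simplifications for exposition, not the paper's actual implementation:

```python
import math
from dataclasses import dataclass, field

# Hypothetical minimal tree node; AirRAG's real nodes also carry
# reasoning-action state, retrieved context, etc.
@dataclass
class Node:
    reward: float = 0.0                       # cumulative reward Q(s, a)
    visits: int = 0                           # visit count N(s)
    children: list = field(default_factory=list)

def uct_score(child: Node, parent: Node, w: float = 1.4) -> float:
    """UCT(s, p) = Q(s, a) / N(s) + w * sqrt(log N(p) / N(s))."""
    if child.visits == 0:
        return float("inf")                   # always try unvisited children first
    exploit = child.reward / child.visits
    explore = w * math.sqrt(math.log(parent.visits) / child.visits)
    return exploit + explore

def select_child(parent: Node, w: float = 1.4) -> Node:
    """MCTS selection step: pick the child maximizing UCT."""
    return max(parent.children, key=lambda c: uct_score(c, parent, w))

def jaccard(a: set, b: set) -> float:
    """|A ∩ B| / |A ∪ B| over token sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

def jcd_score(answer: set, others: list) -> float:
    """jcdScore: mean Jaccard similarity of one answer to the other candidates."""
    return sum(jaccard(answer, o) for o in others) / len(others)
```

A rarely visited child with a decent average reward gets a large exploration bonus, so `select_child` prefers it over a heavily visited sibling; `jcd_score` then favors the final answer that agrees most with the other sampled answers. The embedding-based `embScore` follows the same averaging pattern with cosine similarity in place of Jaccard.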
## Performance of AirRAG in Specific Tasks

AirRAG demonstrates superior performance in complex, knowledge-intensive tasks such as multi-hop question answering (e.g., HotpotQA, MuSiQue, 2WikiMultiHopQA). It outperforms baseline methods especially in scenarios requiring extended reasoning, such as increasing the number of documents from 5 to 50 or expanding the context length from 8k to 128k tokens.

## Key Features of AirRAG

The key features of AirRAG include:

- Five core reasoning actions: System Analysis, Direct Answer, Retrieval Answer, Query Transformation, and Summary Answer.
- Use of Monte Carlo Tree Search (MCTS) for generating, expanding, and backtracking the reasoning process.
- Self-consistency verification using Jaccard similarity and text embeddings.
- A flexible architecture that allows integration of additional advanced methods.
- Superior performance on knowledge-intensive tasks, with accuracy improvements from approximately 65% to 85% in certain scenarios.

## Datasets Used for Evaluating AirRAG

AirRAG was evaluated on several datasets, including HotpotQA, MuSiQue, 2WikiMultiHopQA, Natural Questions (NQ), TriviaQA, PopQA, and WebQA. Each dataset included 1,000 test samples selected with a random seed of 0.

## Main Performance Results of AirRAG

| Method | F1 | Accuracy |
| --- | --- | --- |
| AirRAG-Blender | 69.9 | 67.1 |
| AirRAG | 68.7 | 64.2 |
| AirRAG-Lite | 66.0 | 61.3 |
| IterDRAG* | 57.1 | 51.4 |
| Vanilla RAG | 55.7 | 51.0 |

These results indicate that AirRAG outperforms baseline methods in complex question-answering tasks.

## URL for AirRAG Research Paper

The detailed research paper on AirRAG can be accessed at the following URL: [arXiv Paper](https://arxiv.org/pdf/2501.10053).

### Citation sources

- [AirRAG](https://arxiv.org/pdf/2501.10053) - Official URL

Updated: 2025-03-28