# TheoremExplainAgent

An AI-powered educational tool that converts complex theorems into animated videos using a dual-agent architecture.
## Purpose of TheoremExplainAgent
TheoremExplainAgent automates the conversion of complex mathematical and scientific theorems into educational animated videos. It aims to deepen understanding of STEM subjects through multimodal explanations that combine narrated text with animation; the generated videos also help expose reasoning flaws in large language models (LLMs) that text-only evaluation can miss.
## Dual-Agent Architecture of TheoremExplainAgent
TheoremExplainAgent uses a dual-agent architecture:
1. **Planning Agent**: Responsible for creating story plans and narrations.
2. **Coding Agent**: Generates Python animation scripts using Manim (a mathematical animation library).
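The two-agent pipeline can be sketched as follows. This is an illustrative mock-up only: the function names, data shapes, and generated strings are assumptions for exposition, not the project's actual API (a real implementation would prompt an LLM at each step).

```python
from dataclasses import dataclass

@dataclass
class StoryPlan:
    """Hypothetical output of the planning agent: scene outline plus narration."""
    scenes: list[str]
    narration: list[str]

def planning_agent(theorem: str) -> StoryPlan:
    # Stand-in for an LLM call that drafts the story plan and narration.
    scenes = [
        f"State the theorem: {theorem}",
        "Illustrate with a concrete example",
        "Sketch the proof",
    ]
    narration = [
        f"Today we explain {theorem}.",
        "Consider a concrete case.",
        "Here is why it holds.",
    ]
    return StoryPlan(scenes, narration)

def coding_agent(plan: StoryPlan) -> str:
    # Stand-in for an LLM call that turns the plan into a Manim script;
    # here each scene just becomes a comment inside construct().
    body = "\n".join(f"        # {scene}" for scene in plan.scenes)
    return (
        "from manim import Scene\n\n"
        "class Explanation(Scene):\n"
        "    def construct(self):\n"
        + body
    )

script = coding_agent(planning_agent("the Pythagorean theorem"))
```

In the real system, the planning agent's output constrains the coding agent, so narration and animation stay synchronized scene by scene.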
## Subject Coverage of TheoremExplainAgent
TheoremExplainAgent covers multiple STEM disciplines, including:
- Mathematics
- Physics
- Computer Science
- Chemistry
## TheoremExplainBench Evaluation Framework
TheoremExplainBench (TEB) is a benchmark dataset developed alongside TheoremExplainAgent to evaluate LLM performance in theorem explanation. It contains:
- 240 theorems (80 easy, 80 medium, 80 hard)
- Coverage of 68 subfields
- Data sourced from OpenStax and LibreTexts
It assesses five dimensions: accuracy, depth, logical flow, visual relevance, and element layout.
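As a rough illustration of how the five dimensions might combine into a single score, the sketch below averages them with equal weight. Equal weighting is an assumption (the benchmark's actual aggregation may differ), and the accuracy, depth, and visual-relevance values in the example are placeholders; only the logical-flow (0.89) and element-layout (0.61) figures come from the numbers reported in this document.

```python
# Illustrative aggregation of TEB's five evaluation dimensions.
# Equal weighting is an assumption, not the benchmark's documented formula.
DIMENSIONS = ("accuracy", "depth", "logical_flow", "visual_relevance", "element_layout")

def overall_score(scores: dict[str, float]) -> float:
    missing = set(DIMENSIONS) - scores.keys()
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    return sum(scores[d] for d in DIMENSIONS) / len(DIMENSIONS)

example = {
    "accuracy": 0.85,          # placeholder value
    "depth": 0.70,             # placeholder value
    "logical_flow": 0.89,      # from this document
    "visual_relevance": 0.80,  # placeholder value
    "element_layout": 0.61,    # from this document
}
```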
## Performance of o3-mini Agent
The o3-mini agent achieves:
- 93.8% success rate in generating complete videos
- 0.77 overall evaluation score, matching human-made Manim videos (also 0.77)
It far outperforms other tested models in success rate, such as GPT-4o (55.0%) and Claude 3.5 Sonnet (2.1%).
## Animation Technology in TheoremExplainAgent
TheoremExplainAgent uses:
- **Manim**: A Python-based mathematical animation library
- Python scripts generated by the coding agent to create precise visualizations of theorem proofs
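For reference, a script of the kind the coding agent generates might look like the sketch below. This is hand-written for illustration, not actual system output, and it is kept as a string here so it can be inspected without Manim installed; saved to a file, it would be rendered with a command like `manim -ql script.py TheoremScene`.

```python
# A hand-written sketch of a Manim script in the style the coding agent emits.
# Stored as a string so it can be shown without requiring Manim to be installed.
MANIM_SCRIPT = '''\
from manim import Scene, MathTex, Write

class TheoremScene(Scene):
    def construct(self):
        # Typeset the theorem statement as LaTeX
        statement = MathTex("a^2 + b^2 = c^2")
        # Animate the equation being drawn on screen
        self.play(Write(statement))
        self.wait(2)
'''
```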
## Accessing TheoremExplainAgent Resources
Key resources are available at:
- [Project Website](https://tiger-ai-lab.github.io/TheoremExplainAgent/)
- [GitHub Repository](https://github.com/TIGER-AI-Lab/TheoremExplainAgent) (MIT licensed, partially open-sourced)
- [arXiv Paper](https://arxiv.org/abs/2502.19400) (Published Feb 26, 2025)
- [TheoremExplainBench Dataset](https://huggingface.co/datasets/TIGER-Lab/TheoremExplainBench) on Hugging Face
## Multimodal Explanations in TheoremExplainAgent
TheoremExplainAgent combines:
1. **Textual explanations**: Generated by the planning agent (narrations and story plans)
2. **Visual animations**: Created by the coding agent using Manim
This combination proves more effective than text-only methods, especially in exposing reasoning errors in LLMs.
## Comparison with Human-Made Content
Evaluation shows:
- Human-made Manim videos score 0.77 overall
- o3-mini agent scores 0.77 (matching human performance)
- Other LLMs like GPT-4o score 0.78 but have lower success rates (55.0%)
The system exceeds the human-made videos in logical flow (0.89 vs. 0.70) but trails in element layout (0.61 vs. 0.73).
## Citation Sources
- [TheoremExplainAgent](https://tiger-ai-lab.github.io/TheoremExplainAgent) - Official project website

Updated: 2025-04-01