What datasets are used in the LLaVA-OneVision project?
Question
Answers (1)
The LLaVA-OneVision project trains on a large corpus combining roughly 3.2M single-image samples with 1.6M multi-image and video samples, supplemented by high-quality synthetic data (e.g., about 4M high-quality knowledge samples). Its image sources include COCO118K, BLIP558K, and CC3M, and the mixture also contains 92K Chinese caption samples and 143K Evo-Instruct samples.