What information is typically included in each sample of the Babillage Dataset?
Question
Answers (1)
Each sample in the Babillage Dataset typically includes:
- sample_id (unique identifier)
- image_id (present for the CoOCR-VQA and CoCOCO subsets)
- Question Audio (duration and content)
- Question Transcript
- Question Alignment (time alignment sequence)
- Answer Audio (duration and content)
- Answer Transcript
- Answer Alignment (time alignment sequence)
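
To make the field list above concrete, here is a minimal sketch of how one sample could be modelled in Python. The class name `BabillageSample`, the attribute names and types, and the `(word, start_seconds, end_seconds)` alignment layout are assumptions made for illustration; the actual dataset may name and store these fields differently. Audio is represented only by its duration here.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical alignment entry: (word, start_seconds, end_seconds).
AlignmentEntry = Tuple[str, float, float]


@dataclass
class BabillageSample:
    """Illustrative container mirroring the fields listed in the answer."""
    sample_id: str
    question_audio_duration: float  # seconds (assumed unit)
    question_transcript: str
    question_alignment: List[AlignmentEntry]
    answer_audio_duration: float  # seconds (assumed unit)
    answer_transcript: str
    answer_alignment: List[AlignmentEntry]
    image_id: Optional[str] = None  # only some subsets carry an image id


def alignment_is_consistent(
    transcript: str, alignment: List[AlignmentEntry], duration: float
) -> bool:
    """Check that aligned words match the transcript and that timestamps
    are ordered and fall within the audio duration."""
    if [word for word, _, _ in alignment] != transcript.split():
        return False
    previous_end = 0.0
    for _, start, end in alignment:
        if not (previous_end <= start <= end <= duration):
            return False
        previous_end = end
    return True


# Made-up example values, not taken from the dataset.
sample = BabillageSample(
    sample_id="example-0",
    image_id="img-0",
    question_audio_duration=1.5,
    question_transcript="what is this",
    question_alignment=[("what", 0.0, 0.4), ("is", 0.4, 0.7), ("this", 0.7, 1.2)],
    answer_audio_duration=1.0,
    answer_transcript="a cat",
    answer_alignment=[("a", 0.1, 0.3), ("cat", 0.3, 0.8)],
)
```

A check like `alignment_is_consistent` is a cheap sanity test to run when loading samples, since a mismatch between transcript and alignment usually points to a parsing problem rather than a problem with the data itself.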