What is Florence-2-large?
Question
Answers ( 4 )
Florence-2-large is a vision-language model developed by Microsoft. It handles a range of computer vision and vision-language tasks through a prompt-based approach: a short text prompt tells the model which task to run. The model uses a sequence-to-sequence learning paradigm and is trained on the FLD-5B dataset, which contains 126 million images and 5.4 billion visual annotations. It supports captioning, object detection, visual grounding, segmentation, and OCR, and relies on multi-task learning so that a single model covers all of them.
Florence-2-large supports captioning, object detection, visual grounding, segmentation, and OCR. A simple text prompt selects the task, so the same model can serve a wide range of computer vision applications.
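To make the prompt-based interface concrete, here is a minimal sketch of how a task name could be mapped to a prompt string. The task tokens (such as `<CAPTION>` and `<OD>`) follow the ones listed on the public Hugging Face model card for Florence-2-large; the dictionary keys, the `build_prompt` helper, and the set of tasks that take extra text are illustrative names chosen here, not part of any official API.

```python
# Task tokens as listed on the Florence-2-large model card (not exhaustive).
# The short keys on the left are hypothetical labels chosen for this sketch.
TASK_PROMPTS = {
    "caption": "<CAPTION>",
    "detailed_caption": "<DETAILED_CAPTION>",
    "object_detection": "<OD>",
    "phrase_grounding": "<CAPTION_TO_PHRASE_GROUNDING>",
    "referring_segmentation": "<REFERRING_EXPRESSION_SEGMENTATION>",
    "ocr": "<OCR>",
    "ocr_with_region": "<OCR_WITH_REGION>",
}

# Tasks assumed here to need extra text after the task token
# (for example, the phrase to ground in the image).
TASKS_WITH_TEXT = {"phrase_grounding", "referring_segmentation"}


def build_prompt(task, text_input=None):
    """Return the prompt string for a task: the task token plus optional text."""
    if task not in TASK_PROMPTS:
        raise ValueError(f"unknown task: {task!r}")
    token = TASK_PROMPTS[task]
    if task in TASKS_WITH_TEXT:
        if not text_input:
            raise ValueError(f"task {task!r} needs a text input")
        return token + text_input
    if text_input:
        raise ValueError(f"task {task!r} does not take a text input")
    return token


if __name__ == "__main__":
    print(build_prompt("object_detection"))
    print(build_prompt("phrase_grounding", "a green car"))
```

The resulting string would then be passed, together with the image, to the model's processor. That step is omitted here because it requires downloading the model weights.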
Florence-2-large is trained on the FLD-5B dataset, which contains 126 million images and 5.4 billion visual annotations. Training at this scale lets the model work with detailed visual information such as object locations, mask contours, and attributes.
Florence-2-large uses a sequence-to-sequence architecture: every task is framed as generating an output sequence from an image and a text prompt. This gives one model the flexibility to handle many vision and vision-language tasks, and it performs well in both zero-shot and fine-tuned settings, which makes it a competitive vision foundation model.
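One consequence of the sequence-to-sequence design is that region outputs such as bounding boxes also come back as text. The sketch below assumes an output format in which each label is followed by four location tokens of the form `<loc_N>`, with coordinates quantized into 1000 bins. That format, the `decode_boxes` helper, and the simple rescaling used are assumptions made for illustration. In practice the model's own processor handles this post-processing, and its exact rounding may differ.

```python
import re

# Assumption: coordinates are quantized into 1000 bins per axis.
NUM_BINS = 1000

# A label (any text without '<') followed by exactly four <loc_N> tokens.
BOX_PATTERN = re.compile(r"([^<]+)((?:<loc_\d+>){4})")
LOC_PATTERN = re.compile(r"<loc_(\d+)>")


def decode_boxes(text, width, height, bins=NUM_BINS):
    """Turn 'label<loc_x1><loc_y1><loc_x2><loc_y2>...' into pixel-space boxes."""
    results = []
    for label, locs in BOX_PATTERN.findall(text):
        x1, y1, x2, y2 = (int(v) for v in LOC_PATTERN.findall(locs))
        # Rescale bin indices to pixel coordinates of the original image.
        box = (
            x1 * width / bins,
            y1 * height / bins,
            x2 * width / bins,
            y2 * height / bins,
        )
        results.append((label.strip(), box))
    return results


if __name__ == "__main__":
    # Hypothetical generated text for a 1000x500 image.
    generated = "car<loc_100><loc_200><loc_500><loc_800>"
    print(decode_boxes(generated, width=1000, height=500))
```

Treating boxes as tokens is what lets detection, grounding, and captioning share one decoder, since all of them reduce to text generation.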