sunsmarterjie / ChatterBoxLinks
[AAAI2025] ChatterBox: Multi-round Multimodal Referring and Grounding, Multimodal, Multi-round dialogues
☆58Updated 6 months ago
Alternatives and similar repositories for ChatterBox
Users that are interested in ChatterBox are comparing it to the libraries listed below
Sorting:
- ☆123Updated last year
- [NeurIPS 2024] MoVA: Adapting Mixture of Vision Experts to Multimodal Context☆168Updated last year
- ✨✨Beyond LLaVA-HD: Diving into High-Resolution Large Multimodal Models☆162Updated 10 months ago
- Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment☆61Updated 3 months ago
- MM-Instruct: Generated Visual Instructions for Large Multimodal Model Alignment☆35Updated last year
- [EMNLP-2025 Oral] ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration☆60Updated 2 months ago
- [ICLR2025] Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want☆89Updated 5 months ago
- [SCIS 2024] The official implementation of the paper "MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Di…