DCDmllm / Cheetah
☆352 Updated last year
Alternatives and similar repositories for Cheetah
Users that are interested in Cheetah are comparing it to the libraries listed below
- ☆398 Updated 10 months ago
- (AAAI 2024) BLIVA: A Simple Multimodal LLM for Better Handling of Text-rich Visual Questions ☆258 Updated last year
- [ECCV 2024] Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems? ☆170 Updated 5 months ago
- An open-source implementation for training LLaVA-NeXT. ☆423 Updated last year
- [ICLR 2025] Mathematical Visual Instruction Tuning for Multi-modal Large Language Models ☆152 Updated 10 months ago
- [ICLR 2025] MLLM for On-Demand Spatial-Temporal Understanding at Arbitrary Resolution ☆327 Updated 3 months ago
- The code for "TokenPacker: Efficient Visual Projector for Multimodal LLM", IJCV 2025 ☆270 Updated 5 months ago
- Official Repository of ChartX & ChartVLM: A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning ☆242 Updated last year
- Chain-of-Spot: Interactive Reasoning Improves Large Vision-language Models ☆98 Updated last year
- WorldGPT: Empowering LLM as Multimodal World Model