HJYao00 / Mulberry
☆281 · Updated this week
Alternatives and similar repositories for Mulberry:
Users who are interested in Mulberry compare it to the repositories listed below.
- Reverse Chain-of-Thought Problem Generation for Geometric Reasoning in Large Multimodal Models ☆165 · Updated 2 months ago
- The repository for the paper titled "Leopard: A Vision Language Model For Text-Rich Multi-Image Tasks" ☆151 · Updated last month
- The Official Repo of ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code (https://a… ☆286 · Updated 2 months ago
- [ECCV 2024] Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems? ☆151 · Updated 4 months ago
- (AAAI 2024) BLIVA: A Simple Multimodal LLM for Better Handling of Text-rich Visual Questions ☆251 · Updated 9 months ago
- [NeurIPS 2024] Matryoshka Query Transformer for Large Vision-Language Models ☆91 · Updated 6 months ago
- Align Anything: Training All-modality Model with Feedback ☆938 · Updated this week
- The official implementation of Self-Play Preference Optimization (SPPO) ☆471 · Updated last week
- Explore concepts like Self-Correct, Self-Refine, Self-Improve, Self-Contradict, Self-Play, and Self-Knowledge, alongside o1-like reasonin… ☆154 · Updated last month
- A curated list of resources on graph-based retrieval-augmented generation (GraphRAG) for customized large language models. ☆300 · Updated this week
- Unified KV Cache Compression Methods for Auto-Regressive Models ☆854 · Updated 3 weeks ago
- AvaTaR: Optimizing LLM Agents for Tool Usage via Contrastive Reasoning (NeurIPS 2024) ☆172 · Updated 2 weeks ago
- DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Models ☆126 · Updated 2 weeks ago
- MPLSandbox is an out-of-the-box multi-programming language sandbox designed to provide unified and comprehensive feedback from compiler a… ☆167 · Updated 2 months ago
- [ICLR 2025] Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models ☆27 · Updated last week
- Benchmarking LLMs via Uncertainty Quantification ☆205 · Updated last year
- [NeurIPS'24] Training-Free Adaptive Diffusion with Bounded Difference Approximation Strategy ☆61 · Updated last week
- Official Repository of ChartX & ChartVLM: A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning ☆207 · Updated 4 months ago
- Improving Generalist Model with Domain-Specific Experts ☆79 · Updated 3 weeks ago
- The official code for "Olympus: A Universal Task Router for Computer Vision Tasks" ☆46 · Updated last month
- Chain-of-Spot: Interactive Reasoning Improves Large Vision-language Models ☆89 · Updated 10 months ago
- An open-source implementation for training LLaVA-NeXT. ☆375 · Updated 3 months ago
- The official implementation of the ICML 2024 paper "MemoryLLM: Towards Self-Updatable Large Language Models" ☆110 · Updated 3 weeks ago
- [NeurIPS 2024] AWT: Transferring Vision-Language Models via Augmentation, Weighting, and Transportation ☆90 · Updated 3 months ago
- Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition ☆271 · Updated 3 weeks ago
- A curated list of awesome leaderboard-oriented resources for foundation models ☆251 · Updated 3 weeks ago
- A Contamination-free Multi-task Language Understanding Benchmark ☆113 · Updated 3 weeks ago
- (ECCV 2024) Empowering Multimodal Large Language Model as a Powerful Data Generator ☆104 · Updated 3 months ago
- [ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization ☆531 · Updated 7 months ago