[ICLR 2025] Mathematical Visual Instruction Tuning for Multi-modal Large Language Models
☆155Dec 5, 2024Updated last year
Alternatives and similar repositories for MAVIS
Users that are interested in MAVIS are comparing it to the libraries listed below.
- [ECCV 2024] Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?☆181Apr 28, 2025Updated last year
- MultiMath: Bridging Visual and Mathematical Reasoning for Large Language Models☆32Jan 22, 2025Updated last year
- Code for Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models☆91Jun 28, 2024Updated last year
- DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception☆159Dec 6, 2024Updated last year
- Official repository for "TrustGeoGen: Formal-Verified Data Engine for Trustworthy Multi-modal Geometric Problem Solving"☆23Sep 1, 2025Updated 9 months ago
- [MM 2025] CMM-Math: A Chinese Multimodal Math Dataset To Evaluate and Enhance the Mathematics Reasoning of Large Multimodal Models☆55Oct 20, 2024Updated last year
- CrossLMM: Decoupling Long Video Sequences from LMMs via Dual Cross-Attention Mechanisms☆25Dec 21, 2025Updated 5 months ago
- Official code implementation of Slow Perception: Let's Perceive Geometric Figures Step-by-step☆160Jul 28, 2025Updated 10 months ago
- Reverse Chain-of-Thought Problem Generation for Geometric Reasoning in Large Multimodal Models☆205Nov 4, 2024Updated last year
- The Most Faithful Implementation of Segment Anything (SAM) in 3D☆356Sep 11, 2024Updated last year
- Paper collections of multi-modal LLM for Math/STEM/Code.☆143May 17, 2026Updated 3 weeks ago
- ☆20May 14, 2024Updated 2 years ago
- [NeurIPS 2024] Needle In A Multimodal Haystack (MM-NIAH): A comprehensive benchmark designed to systematically evaluate the capability of…☆125Nov 25, 2024Updated last year
- Official github repo of G-LLaVA☆150Feb 20, 2025Updated last year
- ☆157Oct 31, 2024Updated last year
- [NeurIPS 2025] MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning☆106Sep 19, 2025Updated 8 months ago
- A Self-Training Framework for Vision-Language Reasoning☆89Jan 23, 2025Updated last year
- MathVista: data, code, and evaluation for Mathematical Reasoning in Visual Contexts☆363Sep 29, 2025Updated 8 months ago
- ☆17Jan 9, 2025Updated last year
- One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks☆4,201Jun 5, 2026Updated last week
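- This project contains all the source code for the VJMAP3D (唯杰地图) examples. VJMAP3D is a 3D visualization engine framework built on threejs. With the rich features VJMAP3D provides, you can create striking 3D visualization applications in the browser. The framework can serve as a standalone 3D engine for data visualization, product display, digital…☆46Mar 11, 2026Updated 3 months ago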
- Deep Reinforcement Learning Algorithms for solving Atari 2600 Games☆143Mar 23, 2023Updated 3 years ago
- Collaborative caching for HTTP video streaming☆38Aug 13, 2023Updated 2 years ago
- 「ECCV 2024」 PanoVOS: Bridging Non-panoramic and Panoramic Views with Transformer for Video Segmentation☆22Jul 2, 2024Updated last year
- Mavlink based attacker for GPS,Actuator or other sensors. SITL Environment is based on PX4,Gazebo,ROS And QGC.☆19Aug 16, 2025Updated 9 months ago
- [ICLR'25] Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training☆49Jan 25, 2025Updated last year
- [NeurIPS-24] This is the official implementation of the paper "DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effect…☆89Jun 17, 2024Updated last year
- [ICML 2024] SPP: Sparsity-Preserved Parameter-Efficient Fine-Tuning for Large Language Models☆22May 28, 2024Updated 2 years ago
- [Neurips'24 Spotlight] Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought …☆445Dec 22, 2024Updated last year
- [NeurIPS 2024] MATH-Vision dataset and code to measure multimodal mathematical reasoning capabilities.☆138May 16, 2025Updated last year
- [ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization☆586Jun 7, 2024Updated 2 years ago
- ☆43Dec 21, 2023Updated 2 years ago
- Open-source evaluation toolkit of large multi-modality models (LMMs), support 220+ LMMs, 80+ benchmarks☆4,218Updated this week
- OGtwelve's util pack: contains many different utilities that might be used in real-life development situations☆111Dec 30, 2023Updated 2 years ago
- The implementation of the geometric solver PGPSNet☆30Jan 30, 2025Updated last year
- Cambrian-1 is a family of multimodal LLMs with a vision-centric design.☆2,003Nov 7, 2025Updated 7 months ago
- ☆153Jul 28, 2022Updated 3 years ago
- A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models☆30Nov 25, 2024Updated last year
- Welcome to the 'Open-Alteryx-Macro' project. This project is aimed at providing an open-source solution for managing and updating Alteryx…☆156May 25, 2024Updated 2 years ago