[ICLR 2025] Mathematical Visual Instruction Tuning for Multi-modal Large Language Models
☆153Dec 5, 2024Updated last year
Alternatives and similar repositories for MAVIS
Users who are interested in MAVIS are comparing it to the libraries listed below.
- [ECCV 2024] Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?☆177Apr 28, 2025Updated last year
- MultiMath: Bridging Visual and Mathematical Reasoning for Large Language Models☆32Jan 22, 2025Updated last year
- Code for Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models☆91Jun 28, 2024Updated last year
- DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception☆159Dec 6, 2024Updated last year
- Official repository for "TrustGeoGen: Formal-Verified Data Engine for Trustworthy Multi-modal Geometric Problem Solving"☆23Sep 1, 2025Updated 8 months ago
- [MM 2025] CMM-Math: A Chinese Multimodal Math Dataset To Evaluate and Enhance the Mathematics Reasoning of Large Multimodal Models☆55Oct 20, 2024Updated last year
- CrossLMM: Decoupling Long Video Sequences from LMMs via Dual Cross-Attention Mechanisms☆25Dec 21, 2025Updated 4 months ago
- Reverse Chain-of-Thought Problem Generation for Geometric Reasoning in Large Multimodal Models☆186Nov 4, 2024Updated last year
- Official code implementation of Slow Perception:Let's Perceive Geometric Figures Step-by-step☆160Jul 28, 2025Updated 9 months ago
- The Most Faithful Implementation of Segment Anything (SAM) in 3D☆356Sep 11, 2024Updated last year
- Paper collections of multi-modal LLM for Math/STEM/Code.☆139Nov 17, 2025Updated 5 months ago
- ☆20May 14, 2024Updated last year
- [NeurIPS 2024] Needle In A Multimodal Haystack (MM-NIAH): A comprehensive benchmark designed to systematically evaluate the capability of…☆125Nov 25, 2024Updated last year
- Official github repo of G-LLaVA☆148Feb 20, 2025Updated last year
- ☆157Oct 31, 2024Updated last year
- [NeurIPS 2025] MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning☆106Sep 19, 2025Updated 7 months ago
- A Self-Training Framework for Vision-Language Reasoning☆90Jan 23, 2025Updated last year
- MathVista: data, code, and evaluation for Mathematical Reasoning in Visual Contexts☆358Sep 29, 2025Updated 7 months ago
- ☆17Jan 9, 2025Updated last year
- One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks☆4,085Updated this week
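- This project contains all the source code for the VJMAP3D (唯杰地图) examples. VJMAP3D is a 3D visualization engine framework built on threejs. With the rich features VJMAP3D provides, you can create impressive 3D visualization applications in the browser. The framework can serve as a standalone 3D engine for data visualization, product display, digital…☆46Mar 11, 2026Updated last month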
- Deep Reinforcement Learning Algorithms for solving Atari 2600 Games☆143Mar 23, 2023Updated 3 years ago
- Collaborative caching for HTTP video streaming☆38Aug 13, 2023Updated 2 years ago
- 「ECCV 2024」 PanoVOS: Bridging Non-panoramic and Panoramic Views with Transformer for Video Segmentation☆22Jul 2, 2024Updated last year
- Mavlink based attacker for GPS,Actuator or other sensors. SITL Environment is based on PX4,Gazebo,ROS And QGC.☆19Aug 16, 2025Updated 8 months ago
- [ICLR'25] Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training☆48Jan 25, 2025Updated last year
- [NeurIPS-24] This is the official implementation of the paper "DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effect…☆85Jun 17, 2024Updated last year
- [ICML 2024] SPP: Sparsity-Preserved Parameter-Efficient Fine-Tuning for Large Language Models☆22May 28, 2024Updated last year
- [Neurips'24 Spotlight] Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought …☆443Dec 22, 2024Updated last year
- [NeurIPS 2024] MATH-Vision dataset and code to measure multimodal mathematical reasoning capabilities.☆135May 16, 2025Updated 11 months ago
- [ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization☆587Jun 7, 2024Updated last year
- ☆43Dec 21, 2023Updated 2 years ago
- Open-source evaluation toolkit of large multi-modality models (LMMs), support 220+ LMMs, 80+ benchmarks☆4,096Updated this week
- OGtwelve's util pack: contains many different util might used in real life develop situation☆111Dec 30, 2023Updated 2 years ago
- The implement of geometric solver PGPSNet☆30Jan 30, 2025Updated last year
- Cambrian-1 is a family of multimodal LLMs with a vision-centric design.☆1,996Nov 7, 2025Updated 5 months ago
- ☆153Jul 28, 2022Updated 3 years ago
- A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models☆30Nov 25, 2024Updated last year
- Welcome to the 'Open-Alteryx-Macro' project. This project is aimed at providing an open-source solution for managing and updating Alteryx…☆156May 25, 2024Updated last year