Code for the Molmo2 Vision-Language Model
☆539 · Mar 18, 2026 · Updated 2 months ago
Alternatives and similar repositories for molmo2
Users who are interested in molmo2 are comparing it to the libraries listed below.
- [ICLR 2026] Scene Graph Driven Data Synthesis for Visual Generation Training ☆67 · Apr 22, 2026 · Updated 3 weeks ago
- OmniStream: Mastering Perception, Reconstruction and Action in Continuous Streams ☆99 · Mar 15, 2026 · Updated 2 months ago
- [CVPR 2026] Variation-aware Vision Token Dropping for Faster Large Vision-Language Models ☆30 · Mar 18, 2026 · Updated 2 months ago
- CVPR 2022: Learning from Untrimmed Videos: Self-Supervised Video Representation Learning with Hierarchical Consistency ☆18 · Aug 10, 2022 · Updated 3 years ago
- More reliable Video Understanding Evaluation ☆15 · Sep 23, 2025 · Updated 7 months ago
- [NeurIPS 2024] A task generation and model evaluation system for multimodal language models. ☆72 · Nov 27, 2024 · Updated last year
- An instruction data generation system for multimodal language models. ☆37 · Jan 31, 2025 · Updated last year
- ☆12 · Dec 6, 2024 · Updated last year
- Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence ☆351 · May 13, 2026 · Updated last week
- ☆16 · Sep 11, 2025 · Updated 8 months ago
- [WACV 2026] MomentMix Augmentation with Length-Aware DETR for Temporally Robust Moment Retrieval ☆14 · Sep 18, 2025 · Updated 8 months ago
- ☆36 · Aug 25, 2025 · Updated 8 months ago
- SceneCompleter: Dense 3D Scene Completion for Generative Novel View Synthesis ☆36 · Jun 13, 2025 · Updated 11 months ago
- Streaming Video Instruction Tuning ☆74 · Feb 25, 2026 · Updated 2 months ago
- Fully Open Framework for Democratized Multimodal Training ☆830 · Updated this week
- LLaVA-Next for STVG ☆19 · Dec 5, 2025 · Updated 5 months ago
- [CVPR 2025] PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models ☆54 · Jun 12, 2025 · Updated 11 months ago
- [ICCV 2025] Official code for Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation ☆60 · Sep 12, 2025 · Updated 8 months ago
- WeThink: Toward General-purpose Vision-Language Reasoning via Reinforcement Learning ☆36 · Jun 10, 2025 · Updated 11 months ago
- ☆47 · Jun 24, 2025 · Updated 10 months ago
- [MTI-LLM@NeurIPS 2025] Official implementation of "PyVision: Agentic Vision with Dynamic Tooling." ☆161 · Jul 22, 2025 · Updated 9 months ago
- [CVPR 2025] Official PyTorch code of "Enhancing Video-LLM Reasoning via Agent-of-Thoughts Distillation". ☆56 · May 25, 2025 · Updated 11 months ago
- [CVPR 2025] LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding ☆84 · Jul 4, 2025 · Updated 10 months ago
- ☆25 · Aug 9, 2025 · Updated 9 months ago
- Quick Long Video Understanding [TMLR2025] ☆78 · Oct 27, 2025 · Updated 6 months ago
- PyTorch implementation of NEPA ☆333 · Feb 9, 2026 · Updated 3 months ago
- VCR-Bench: A Comprehensive Evaluation Framework for Video Chain-of-Thought Reasoning ☆37 · May 9, 2026 · Updated last week
- (CVPR 2026) Long-RVOS: A Comprehensive Benchmark for Long-term Referring Video Object Segmentation ☆36 · Feb 28, 2026 · Updated 2 months ago
- Multimodal RewardBench ☆68 · Feb 21, 2025 · Updated last year
- ☆55 · Apr 13, 2026 · Updated last month
- [CVPR 2026] An accurate and dense-annotated synthetic dataset for training SOTA detectors / segmentors / Grounding-VLMs. ☆46 · Feb 23, 2026 · Updated 2 months ago
- ICM-Assistant: Instruction-tuning Multimodal Large Language Models for Rule-based Explainable Image Content Moderation. AAAI, 2025 ☆15 · Aug 25, 2025 · Updated 8 months ago
- [ICLR 2026] The official repository for paper "ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning" ☆184 · May 1, 2026 · Updated 2 weeks ago
- [ICML 2025] LaCache: Ladder-Shaped KV Caching for Efficient Long-Context Modeling of Large Language Models ☆18 · Nov 4, 2025 · Updated 6 months ago
- ☆18 · Oct 28, 2025 · Updated 6 months ago
- [CVPR 2026] TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs ☆132 · Apr 27, 2026 · Updated 3 weeks ago
- Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity ☆22 · Aug 28, 2025 · Updated 8 months ago
- ☆23 · Mar 17, 2026 · Updated 2 months ago
- TOPReward: Token Probabilities as Hidden Zero-Shot Rewards for Robotics ☆65 · Mar 6, 2026 · Updated 2 months ago