Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence
☆319 · Mar 2, 2026 · Updated last month
Alternatives and similar repositories for OneVision-Encoder
Users that are interested in OneVision-Encoder are comparing it to the libraries listed below.
- The official repo for the DanQing dataset. ☆33 · Mar 25, 2026 · Updated 2 weeks ago
- Syphus: Automatic Instruction-Response Generation Pipeline ☆14 · Dec 14, 2023 · Updated 2 years ago
- V-SWIFT: Training a Small VideoMAE Model on a Single Machine in a Day ☆29 · Feb 5, 2025 · Updated last year
- ☆23 · Jan 12, 2024 · Updated 2 years ago
- [CVPR 2026] OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe ☆156 · Mar 30, 2026 · Updated last week
- OVMR: Open-Vocabulary Recognition with Multi-Modal References (CVPR24) ☆36 · Jun 16, 2025 · Updated 9 months ago
- Code for the paper "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns" ☆18 · Mar 15, 2024 · Updated 2 years ago
- ☆22 · Feb 13, 2026 · Updated last month
- ☆18 · Jul 10, 2024 · Updated last year
- [NeurIPS 2025] Encoder-Decoder Diffusion Language Models for Efficient Training and Inference ☆36 · Oct 29, 2025 · Updated 5 months ago
- Official code for MotionBench (CVPR 2025) ☆71 · Mar 3, 2025 · Updated last year
- [CVPR 2025] EgoLife: Towards Egocentric Life Assistant ☆409 · Mar 19, 2025 · Updated last year
- Thinking with Videos from Open-Source Priors. We reproduce chain-of-frames visual reasoning by fine-tuning open-source video models. Give… ☆219 · Oct 12, 2025 · Updated 5 months ago
- Code for the Molmo2 Vision-Language Model ☆487 · Mar 18, 2026 · Updated 3 weeks ago
- [ICLR 2026] MMDuet2: Enhancing Proactive Interaction of Video MLLMs with Multi-Turn Reinforcement Learning ☆25 · Jan 14, 2026 · Updated 2 months ago
- ☆24 · Feb 17, 2026 · Updated last month
- [CVPR 2026] An official implementation of "Think Visually, Reason Textually: Vision-Language Synergy in ARC" ☆40 · Nov 26, 2025 · Updated 4 months ago
- [ICLR 2026] pi-Flow: Policy-Based Few-Step Generation via Imitation Distillation ☆283 · Feb 23, 2026 · Updated last month
- A collection of awesome think with videos papers. ☆96 · Dec 1, 2025 · Updated 4 months ago
- ☆217 · Dec 19, 2025 · Updated 3 months ago
- Official implementation of CVPR 2026 paper SpaceDrive: Infusing Spatial Awareness into VLM-based Autonomous Driving. ☆55 · Mar 30, 2026 · Updated last week
- Code for FreeTraj, a tuning-free method for trajectory-controllable video generation ☆111 · Sep 19, 2025 · Updated 6 months ago
- (CVPR 2026) Long-RVOS: A Comprehensive Benchmark for Long-term Referring Video Object Segmentation ☆33 · Feb 28, 2026 · Updated last month
- Offline implementation of UniREditBench: A Unified Reasoning-based Image Editing Benchmark. ☆55 · Mar 31, 2026 · Updated last week
- PyTorch implementation of "Efficiently Reconstructing Dynamic Scenes One 🎯 D4RT at a Time" ☆52 · Jan 27, 2026 · Updated 2 months ago
- [CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts ☆17 · Apr 2, 2025 · Updated last year
- Official code for "Rethinking Chain-of-Thought Reasoning for Videos" ☆20 · Dec 14, 2025 · Updated 3 months ago
- Ego-R1: Chain-of-Tool-Thought for Ultra-Long Egocentric Video Reasoning ☆143 · Aug 21, 2025 · Updated 7 months ago
- Training Autoregressive Image Generation models via Reinforcement Learning ☆51 · Nov 26, 2025 · Updated 4 months ago
- Vision Large Language Models trained on M3IT instruction tuning dataset ☆17 · Aug 16, 2023 · Updated 2 years ago
- OmniStream: Mastering Perception, Reconstruction and Action in Continuous Streams ☆76 · Mar 15, 2026 · Updated 3 weeks ago
- Agent-RRM: Exploring Reasoning Reward Model for Agents ☆58 · Mar 17, 2026 · Updated 3 weeks ago
- ☆56 · Updated this week
- MLP version of SuperGaussians. ☆16 · Mar 31, 2026 · Updated last week
- Code repository for "DUNE: Distilling a Universal Encoder from Heterogeneous 2D and 3D Teachers" ☆84 · Oct 28, 2025 · Updated 5 months ago
- [ICML 2025] Streamline Without Sacrifice - Squeeze out Computation Redundancy in LMM ☆20 · May 22, 2025 · Updated 10 months ago
- We introduce BabyVision, a benchmark revealing the infancy of AI vision. ☆205 · Jan 13, 2026 · Updated 2 months ago
- [CVPR 2026] LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling ☆215 · Mar 27, 2026 · Updated 2 weeks ago
- Unlocking Iterative Reasoning for Any Image Editor ☆105 · Jan 18, 2026 · Updated 2 months ago