InfiniteVL: Synergizing Linear and Sparse Attention for Highly-Efficient, Unlimited-Input Vision-Language Models
☆108 Apr 20, 2026 Updated last month
Alternatives and similar repositories for InfiniteVL
Users that are interested in InfiniteVL are comparing it to the libraries listed below.
- [ICCV 2025] GroundingSuite: Measuring Complex Multi-Granular Pixel Grounding ☆77 Jun 26, 2025 Updated 10 months ago
- ☆61 May 13, 2025 Updated last year
- [ArXiv 2025] DiffusionVL: Translating Any Autoregressive Models into Diffusion Vision Language Models ☆143 Dec 25, 2025 Updated 4 months ago
- Official PyTorch implementation Source code for Weakly Supervised Video Scene Graph Generation via Natural Language Supervision, accepted… ☆24 Jun 13, 2025 Updated 11 months ago
- ASID-Caption: Attribute-Structured and Quality-Verified Audiovisual Instruction Dataset and Training Pipeline for Fine-Grained Video Unde… ☆64 Mar 3, 2026 Updated 2 months ago
- ☆20 Oct 12, 2025 Updated 7 months ago
- The official repository of the first version of ACE-Brain foundation model. ☆76 Mar 13, 2026 Updated 2 months ago
- [NeurIPS 2025] | DIPO: Dual-State Images Controlled Articulated Object Generation Powered by Diverse Data ☆49 Dec 12, 2025 Updated 5 months ago
- ☆60 Jul 9, 2024 Updated last year
- [ICCV 2025 Oral] Official implementation of Learning Streaming Video Representation via Multitask Training. ☆91 Dec 24, 2025 Updated 4 months ago
- [ACL '26 Findings] V-MAGE: A Game Evaluation Framework for Assessing Visual-Centric Capabilities in MLLMs ☆27 Apr 28, 2026 Updated 3 weeks ago
- Official repository of the paper "High-Quality Mask Tuning Matters for Open-Vocabulary Segmentation" ☆46 Mar 25, 2025 Updated last year
- Official code of "ViTGaze: Gaze Following with Interaction Features in Vision Transformers" ☆63 Mar 3, 2025 Updated last year
- Code of the Grounded MUIE model, REAMO ☆10 Dec 3, 2024 Updated last year
- ☆35 Jun 3, 2025 Updated 11 months ago
- [CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts ☆17 Apr 2, 2025 Updated last year
- Official Implementation for ACM MM2024 paper "VrdONE: One-stage Video Visual Relation Detection". ☆12 Nov 13, 2024 Updated last year
- [CVPR 2025] GaussTR: Foundation Model-Aligned Gaussian Transformer for Self-Supervised 3D Spatial Understanding ☆213 Jan 5, 2026 Updated 4 months ago
- A framework aiming to bridge fast robot prototyping, predefined motion primitives, heterogeneous teleoperation, data collection, and flex… ☆26 Apr 4, 2026 Updated last month
- [ICLR 2026] The official implementation of the paper “Anchored Supervised Fine-Tuning” ☆39 May 8, 2026 Updated 2 weeks ago
- Official code for the paper: "A Closer Look at Self-training for Zero-Label Semantic Segmentation" https://arxiv.org/abs/2104.11692 ☆25 Aug 22, 2021 Updated 4 years ago
- ☆55 Sep 21, 2025 Updated 8 months ago
- [CVPR 2025] Official repository of the paper "Mask-Adapter: The Devil is in the Masks for Open-Vocabulary Segmentation" ☆132 Oct 23, 2025 Updated 6 months ago
- [ICML 2026] Transform Trained Transformer for Accelerating Native 4K Video Generation ☆39 Dec 16, 2025 Updated 5 months ago
- Internal utility libraries for Pkl ☆16 May 14, 2026 Updated last week
- ☆17 Dec 13, 2023 Updated 2 years ago
- [CVPR 2025] iSegMan: Interactive Segment-and-Manipulate 3D Gaussians 🔥🔥🔥 ☆23 Mar 12, 2025 Updated last year
- [ACMMM 2025] ComplexBench-Edit: Benchmarking Complex Instruction-Driven Image Editing via Compositional Dependencies ☆22 Jun 20, 2025 Updated 11 months ago
- [ICLR 2026] MMDuet2: Enhancing Proactive Interaction of Video MLLMs with Multi-Turn Reinforcement Learning ☆35 Jan 14, 2026 Updated 4 months ago
- ☆14 Aug 1, 2025 Updated 9 months ago
- Multilingual and Multiculture Benchmark and LLM ☆36 Updated this week
- [CVPR 2026] OmniZip: Audio-Guided Dynamic Token Compression for Fast Omnimodal Large Language Models ☆81 Apr 20, 2026 Updated last month
- Official implementation of "PyVision-RL: Forging Open Agentic Vision Models via RL." ☆66 Feb 25, 2026 Updated 2 months ago
- An official implementation of Random Policy Valuation is Enough for LLM Reasoning with Verifiable Rewards ☆36 Oct 3, 2025 Updated 7 months ago
- [ICML 2026] InnoEval: On Research Idea Evaluation as a Knowledge-Grounded, Multi-Perspective Reasoning Problem ☆22 Apr 7, 2026 Updated last month
- Rethinking the Trust Region in LLM Reinforcement Learning ☆54 Mar 2, 2026 Updated 2 months ago
- Cambrian-S: Towards Spatial Supersensing in Video ☆543 Apr 3, 2026 Updated last month
- ☆22 Dec 3, 2025 Updated 5 months ago
- ☆37 Apr 21, 2026 Updated last month