InfiniteVL: Synergizing Linear and Sparse Attention for Highly-Efficient, Unlimited-Input Vision-Language Models
☆110Apr 20, 2026Updated 2 months ago
Alternatives and similar repositories for InfiniteVL
Users who are interested in InfiniteVL are comparing it to the libraries listed below.
- [ICCV 2025] GroundingSuite: Measuring Complex Multi-Granular Pixel Grounding☆77Jun 26, 2025Updated last year
- ☆62May 13, 2025Updated last year
- [ECCV 2026] DiffusionVL: Translating Any Autoregressive Models into Diffusion Vision Language Models☆152Dec 25, 2025Updated 6 months ago
- Official PyTorch implementation of Weakly Supervised Video Scene Graph Generation via Natural Language Supervision, accepted…☆24Jun 13, 2025Updated last year
- ASID-Caption: Attribute-Structured and Quality-Verified Audiovisual Instruction Dataset and Training Pipeline for Fine-Grained Video Unde…☆67Mar 3, 2026Updated 3 months ago
- ☆20Oct 12, 2025Updated 8 months ago
- Official implementation of "Re-Aligning Language to Visual Objects with an Agentic Workflow"☆34Apr 20, 2025Updated last year
- [NeurIPS 2025] | DIPO: Dual-State Images Controlled Articulated Object Generation Powered by Diverse Data☆50Dec 12, 2025Updated 6 months ago
- ☆61Jul 9, 2024Updated last year
- The official repository of the first version of ACE-Brain foundation model.☆80Mar 13, 2026Updated 3 months ago
- The training codes of Jasper-Token-Compression-600M☆20Nov 19, 2025Updated 7 months ago
- [ACL '26 Findings] V-MAGE: A Game Evaluation Framework for Assessing Visual-Centric Capabilities in MLLMs☆27Apr 28, 2026Updated 2 months ago
- Official repository of the paper "High-Quality Mask Tuning Matters for Open-Vocabulary Segmentation"☆48Mar 25, 2025Updated last year
- [ICLR2023] Video Scene Graph Generation from Single-Frame Weak Supervision☆12Sep 17, 2023Updated 2 years ago
- Official code of "ViTGaze: Gaze Following with Interaction Features in Vision Transformers"☆62Mar 3, 2025Updated last year
- Official implementation of T-PAMI25 paper "M²Diffuser: Diffusion-based Trajectory Optimization for Mobile Manipulation in 3D Scenes"☆121Jun 17, 2025Updated last year
- Code of the Grounded MUIE model, REAMO☆11Dec 3, 2024Updated last year
- [ECCV 2024] OpenPSG: Open-set Panoptic Scene Graph Generation via Large Multimodal Models☆50Jan 8, 2025Updated last year
- ☆35Jun 3, 2025Updated last year
- [CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts☆18Apr 2, 2025Updated last year
- ☆33Dec 31, 2025Updated 6 months ago
- [NeurIPS 2025] RAD: Training an End-to-End Driving Policy via Large-Scale 3DGS-based Reinforcement Learning☆259Apr 17, 2026Updated 2 months ago
- Official Implementation for ACM MM2024 paper "VrdONE: One-stage Video Visual Relation Detection".☆12Nov 13, 2024Updated last year
- [ICLR 2026] The official implementation of the paper “Anchored Supervised Fine-Tuning”☆45Jun 19, 2026Updated last week
- The code implementation for TTCS: Test-Time Curriculum Synthesis for Self-Evolving.☆50Apr 22, 2026Updated 2 months ago
- ☆55Sep 21, 2025Updated 9 months ago
- [CVPR 2025] Official repository of the paper "Mask-Adapter: The Devil is in the Masks for Open-Vocabulary Segmentation"☆135Oct 23, 2025Updated 8 months ago
- [NeurIPS 2023] CircuitFormer: Circuit as Set of Points☆38Nov 22, 2023Updated 2 years ago
- Internal utility libraries for Pkl☆17Jun 25, 2026Updated last week
- Baidu Qianfan Deep Research☆35Jun 8, 2026Updated 3 weeks ago
- [CVPR 2025] iSegMan: Interactive Segment-and-Manipulate 3D Gaussians 🔥🔥🔥☆23Mar 12, 2025Updated last year
- [ICLR 2026] MMDuet2: Enhancing Proactive Interaction of Video MLLMs with Multi-Turn Reinforcement Learning☆38Jan 14, 2026Updated 5 months ago
- Revisiting End-to-End Speech-to-Text Translation From Scratch☆13Feb 21, 2023Updated 3 years ago
- Official implementation of "PyVision-RL: Forging Open Agentic Vision Models via RL."☆69Feb 25, 2026Updated 4 months ago
- DEAL: Difficulty-awarE Active Learning for Semantic Segmentation☆21Nov 12, 2021Updated 4 years ago
- Open-source navigation stack for GENISOM-AI robots. Enables intelligent SLAM, path planning and autonomous movement. Free community versi…☆51Apr 23, 2026Updated 2 months ago
- ☆144May 21, 2026Updated last month
- Vision Relation Transformer for Unbiased Scene Graph Generation (ICCV 2023)☆22Mar 23, 2026Updated 3 months ago
- [CVPR 2026] OmniZip: Audio-Guided Dynamic Token Compression for Fast Omnimodal Large Language Models☆93Apr 20, 2026Updated 2 months ago