[ICCV 2025] Implementation of the paper "Q-Frame: Query-aware Frame Selection and Multi-Resolution Adaptation for Video-LLMs"
☆74Oct 25, 2025Updated 6 months ago
Alternatives and similar repositories for q-frame
Users who are interested in q-frame are comparing it to the repositories listed below.
- [SIGCOMM 2023] PacketGame: Multi-Stream Packet Gating for Concurrent Video Inference at Scale☆15Jul 1, 2023Updated 2 years ago
- [ICCV 2025] A Benchmark for Multi-Step Reasoning in Long Narrative Videos☆27Aug 8, 2025Updated 9 months ago
- Source Code for Captionomaly: A Deep Learning Toolbox for Anomaly Captioning in Surveillance Videos☆13Jun 26, 2023Updated 2 years ago
- [CVPR 2026 Highlight] VideoITG: Multimodal Video Understanding with Instructed Temporal Grounding☆116Apr 17, 2026Updated 3 weeks ago
- Official implementation of "TailorKV: A Hybrid Framework for Long-Context Inference via Tailored KV Cache Optimization" (Findings of ACL …☆21Jul 25, 2025Updated 9 months ago
- A Track-Wise Ensemble Event Independent Network for 3D Polyphonic Sound Event Localization and Detection☆22Nov 14, 2024Updated last year
- [ICCV 2025] GroundingSuite: Measuring Complex Multi-Granular Pixel Grounding☆76Jun 26, 2025Updated 10 months ago
- ☆14Oct 10, 2024Updated last year
- JoVA: Unified Multimodal Learning for Joint Video-Audio Generation☆33Dec 22, 2025Updated 4 months ago
- ☆23Jan 8, 2024Updated 2 years ago
- ☆31Nov 1, 2023Updated 2 years ago
- [AAAI'26] Official implementation of CMMCoT: Enhancing Complex Multi-Image Comprehension via Multi-Modal Chain-of-Thought and Memory Augm…☆11Dec 5, 2025Updated 5 months ago
- [ICCV 2025] Object-centric Video Question Answering with Visual Grounding and Referring☆25Aug 8, 2025Updated 9 months ago
- Inception-I3D, Non Local finetune, hmdb51_flow☆15Oct 15, 2019Updated 6 years ago
- ☆15Sep 28, 2023Updated 2 years ago
- OmniMamba: Efficient and Unified Multimodal Understanding and Generation via State Space Models☆123Apr 25, 2025Updated last year
- ☆10Aug 1, 2021Updated 4 years ago
- Plato is a system for viewport adaptation based bitrate adaptive VR video streaming.☆15May 1, 2018Updated 8 years ago
- An implementation of MSSRM method☆10Mar 23, 2023Updated 3 years ago
- ☆11Jan 18, 2024Updated 2 years ago
- ☆29Feb 18, 2022Updated 4 years ago
- [CVPR 2023] Better “CMOS” Produces Clearer Images: Learning Space-Variant Blur Estimation for Blind Image Super-Resolution☆10Mar 19, 2024Updated 2 years ago
- Full-frequency dynamic convolution: a physical frequency-dependent convolution for sound event detection☆27Aug 22, 2024Updated last year
- We introduce CausalVQA, a benchmark dataset for video question answering (VQA) composed of question-answer pairs that probe models’ under…☆58Aug 18, 2025Updated 8 months ago
- [ICML'25 Spotlight] Catch Your Emotion: Sharpening Emotion Perception in Multimodal Large Language Models☆52Jan 21, 2026Updated 3 months ago
- [MICCAI 2022] Toward Clinically Assisted Colorectal Polyp Recognition via Structured Cross-modal Representation Consistency☆12Nov 8, 2024Updated last year
- LLaVA-VLA: A Simple Yet Powerful Vision-Language-Action Model [ICRA 2026]☆187Mar 12, 2026Updated last month
- Open-source audio embedding models, submitted to the HEAR 2021 challenge☆11Feb 15, 2026Updated 2 months ago
- Multi-Object Tracker for the H.264 and MPEG-4 Compressed Domain.☆23Jul 6, 2023Updated 2 years ago
- Transferring Genshin PVs into a freehand style with Diffusion Model.☆10Jun 5, 2024Updated last year
- [TPAMI 2023] Object Affinity Learning: Towards Annotation-free Instance Segmentation☆14Sep 14, 2023Updated 2 years ago
- The official repo of the paper titled DeH4R: A Decoupled and Hybrid Method for Road Network Graph Extraction.☆23Apr 10, 2026Updated 3 weeks ago
- ☆18Apr 10, 2025Updated last year
- [ACL 2025] ⚖️ Temporally-aware MLLM for Biomedical Radiology Analysis and Report Generation. Flexible toolkit with MLLM backbone support,…☆29Mar 18, 2026Updated last month
- Panoramic Out-of-Distribution Segmentation☆15Dec 21, 2025Updated 4 months ago
- [AAAI 2026] SIFThinker: Spatially-Aware Image Focus for Visual Reasoning☆22Dec 2, 2025Updated 5 months ago
- ☆18Oct 22, 2024Updated last year
- This repo contains the code for our paper Towards Open-Ended Visual Recognition with Large Language Model☆101Jul 15, 2024Updated last year
- A simple command line tool to calculate WER for ASR.☆14Oct 14, 2024Updated last year