md-mohaiminul / BIMBA
☆29Jul 25, 2025Updated 6 months ago
Alternatives and similar repositories for BIMBA
Users that are interested in BIMBA are comparing it to the libraries listed below
- ☆27Jul 18, 2025Updated 6 months ago
- ☆30Mar 2, 2023Updated 2 years ago
- official repo for paper "[CLS] Token Tells Everything Needed for Training-free Efficient MLLMs"☆22Apr 23, 2025Updated 9 months ago
- Question-Aware Gaussian Experts for Audio-Visual Question Answering -- Official Pytorch Implementation (CVPR'25, Highlight)☆26Jun 6, 2025Updated 8 months ago
- Module 1 - Autodifferentiation☆21Sep 8, 2024Updated last year
- The official PyTorch implementation of the IEEE/CVF Computer Vision and Pattern Recognition (CVPR) '24 paper PREGO: online mistake detect…☆31Jun 9, 2025Updated 8 months ago
- OmniZip: Audio-Guided Dynamic Token Compression for Fast Omnimodal Large Language Models☆54Feb 1, 2026Updated 2 weeks ago
- ☆29Feb 7, 2024Updated 2 years ago
- This is the official implementation of ReVisionLLM: Recursive Vision-Language Model for Temporal Grounding in Hour-Long Videos☆43Nov 5, 2025Updated 3 months ago
- [AAAI 2026] Global Compression Commander: Plug-and-Play Inference Acceleration for High-Resolution Large Vision-Language Models☆38Jan 27, 2026Updated 2 weeks ago
- A lightweight flexible Video-MLLM developed by TencentQQ Multimedia Research Team.☆74Oct 14, 2024Updated last year
- [ICCV 2025] Multi-Granular Spatio-Temporal Token Merging for Training-Free Acceleration of Video LLMs☆56Feb 2, 2026Updated last week
- Official implementation of paper ReTaKe: Reducing Temporal and Knowledge Redundancy for Long Video Understanding☆39Mar 16, 2025Updated 10 months ago
- ☆38Jul 24, 2023Updated 2 years ago
- [ICCV 2025 Oral] Official implementation of Learning Streaming Video Representation via Multitask Training.☆82Dec 24, 2025Updated last month
- ☆37Sep 16, 2024Updated last year
- 🏠[ICME 2023] Low-complexity Deep Video Compression with A Distributed Coding Architecture☆36May 29, 2023Updated 2 years ago
- Official implementation of the paper "LTrack: Generalizing Multiple Object Tracking to Unseen Domains by Introducing Natural Language Rep…☆12Jul 26, 2023Updated 2 years ago
- Code for "RADSeg Unleashing Parameter and Compute Efficient Zero-Shot Open-Vocabulary Segmentation Using Agglomerative Models"☆28Jan 27, 2026Updated 2 weeks ago
- Communication Relay by creating a WiFi Mesh Network using ROS, and using that network for Data Telemetry, with Telemetry radios ( Ubiquit…☆11Dec 18, 2018Updated 7 years ago
- Implementation for "StyleGAN-Canvas: Augmenting StyleGAN3 for Real-Time Human-AI Co-Creation"☆11May 24, 2023Updated 2 years ago
- ☆13Jul 3, 2024Updated last year
- ☆17Oct 1, 2021Updated 4 years ago
- Project focused on enhancing the quality of low-fidelity endoscopy images using Generative Adversarial Networks (GANs) implemented in PyT…☆17Jun 5, 2025Updated 8 months ago
- [IPCAI'24 Best Paper] Advancing Surgical VQA with Scene Graph Knowledge☆47May 23, 2025Updated 8 months ago
- Quick Long Video Understanding [TMLR2025]☆75Oct 27, 2025Updated 3 months ago
- ECCV24 "ReMamber: Referring Image Segmentation with Mamba Twister" official repository.☆44Jul 11, 2024Updated last year
- Official Repository for "Learning to Visually Localize Sound Sources from Mixtures without Prior Source Knowledge" (CVPR 2024)☆13Sep 1, 2024Updated last year
- [CVPR 2024] Selective, Interpretable and Motion Consistent Privacy Attribute Obfuscation for Action Recognition☆12Mar 20, 2024Updated last year
- An official codebase for "NormLens: Reading Books is Great, But Not if You Are Driving! Visually Grounded Reasoning about Defeasible Comm…☆10May 9, 2024Updated last year
- Unofficial implementation of SORT, A simple online and real-time tracking algorithm for 2D multiple objects tracking in video sequences, …☆12Jul 1, 2021Updated 4 years ago
- Virtual character locomotion system. See article “Motion Graphs”, Lucas Kovar, 2002☆12Mar 1, 2012Updated 13 years ago
- Computational Neuroscience stuff☆13Aug 12, 2019Updated 6 years ago
- ☆11May 27, 2022Updated 3 years ago
- EgoToM is an egocentric theory-of-mind benchmark built on Ego4D videos, containing multi-choice questions that evaluate multimodal large …☆13Apr 1, 2025Updated 10 months ago
- Implementation of SoundStream from the paper: "SoundStream: An End-to-End Neural Audio Codec"☆13Jan 27, 2025Updated last year
- Reinforcement Training of Robot☆11Dec 1, 2019Updated 6 years ago
- https://avocado-captioner.github.io/☆29Oct 16, 2025Updated 3 months ago
- [ICIP2023] Code for the paper 'Action Anticipation with Goal Consistency'☆12Apr 5, 2024Updated last year