[CVPR2025] Hybrid-Level Instruction Injection for Video Token Compression in Multi-modal Large Language Models
☆19 · Apr 30, 2025 · Updated 10 months ago
Alternatives and similar repositories for HICom
Users who are interested in HICom are comparing it to the repositories listed below.
- What Is a Good Caption? A Comprehensive Visual Caption Benchmark for Evaluating Both Correctness and Thoroughness · ☆26 · May 16, 2025 · Updated 9 months ago
- The first spoken long-text dataset derived from live streams, designed to reflect the redundancy-rich and conversational nature of real-w… · ☆12 · Jun 28, 2025 · Updated 8 months ago
- Multi-Stage Vision Token Dropping: Towards Efficient Multimodal Large Language Model · ☆37 · Jan 8, 2025 · Updated last year
- [ICLR 2025] Breaking Mental Set to Improve Reasoning through Diverse Multi-Agent Debate · ☆17 · Apr 22, 2025 · Updated 10 months ago
- ☆12 · Nov 26, 2024 · Updated last year
- Official Implementation of DART (DART: Diffusion-Inspired Speculative Decoding for Fast LLM Inference). · ☆44 · Feb 8, 2026 · Updated 3 weeks ago
- ECG analysis to classify anterior myocardial infarction cases. · ☆10 · May 17, 2017 · Updated 8 years ago
- [ICCV2025] The official code of "DreamRelation: Relation-Centric Video Customization" · ☆27 · Feb 4, 2026 · Updated last month
- A Google Chrome Extension that replaces the official New Tab page with a beautiful to-do list. · ☆12 · Mar 7, 2018 · Updated 7 years ago
- [IJCAI-2024] The official code of Self-Supervised Pre-training with Symmetric Superimposition Modeling for Scene Text Recognition · ☆10 · Aug 10, 2025 · Updated 6 months ago
- ☆10 · Nov 17, 2022 · Updated 3 years ago
- Disable YubiKey output on macOS without a modifier key pressed · ☆10 · Aug 10, 2022 · Updated 3 years ago
- [NeurIPS 2025] This is the official repository for "RAD: Towards Trustworthy Retrieval-Augmented Multi-modal Clinical Diagnosis" · ☆26 · Nov 21, 2025 · Updated 3 months ago
- [CVPR'25] Official code of paper "Mimic In-Context Learning for Multimodal Tasks" · ☆24 · Jun 8, 2025 · Updated 8 months ago
- ☆38 · Feb 20, 2026 · Updated last week
- Information fusion for real-time national air transportation system prognostics under uncertainty. · ☆13 · May 18, 2022 · Updated 3 years ago
- Notes on "Software Project Organization and Management" · ☆10 · Jun 18, 2024 · Updated last year
- A few TensorFlow techniques I'm saving for future reference. · ☆13 · Oct 4, 2016 · Updated 9 years ago
- ☆32 · Nov 16, 2025 · Updated 3 months ago
- This repository contains code used for our Multi Sentence Inference NAACL'22 paper. · ☆12 · Mar 6, 2023 · Updated 2 years ago
- [Neurocomputing] EmoVerse: Enhancing Multimodal Large Language Models for Affective Computing via Multitask Learning · ☆16 · Jul 6, 2025 · Updated 8 months ago
- Python scripts to download course videos off CDEEP · ☆12 · Oct 20, 2015 · Updated 10 years ago
- ☆11 · Jul 31, 2022 · Updated 3 years ago
- The official implementation of "Routing Experts: Learning to Route Dynamic Experts in Existing Multi-modal Large Language Models" · ☆17 · Mar 24, 2025 · Updated 11 months ago
- PyTorch implementation of YOLOv3 · ☆11 · Aug 30, 2018 · Updated 7 years ago
- Code for paper: Unraveling the Shift of Visual Information Flow in MLLMs: From Phased Interaction to Efficient Inference · ☆13 · Jun 7, 2025 · Updated 8 months ago
- ☆13 · Oct 25, 2024 · Updated last year
- ☆13 · Apr 23, 2025 · Updated 10 months ago
- Visual Reaction: Learning to Play Catch with Your Drone · ☆13 · Jul 23, 2023 · Updated 2 years ago
- [CVPR25] IAR · ☆17 · Jun 13, 2025 · Updated 8 months ago
- Code release for VTW (AAAI 2025 Oral) · ☆64 · Nov 4, 2025 · Updated 4 months ago
- ☆66 · Jan 23, 2026 · Updated last month
- Code for Retrieval-Augmented Perception (ICML 2025) · ☆68 · Aug 10, 2025 · Updated 6 months ago
- [CVPR 2025] Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention · ☆61 · Jul 16, 2024 · Updated last year
- ORES: Open-vocabulary Responsible Visual Synthesis · ☆14 · Dec 12, 2023 · Updated 2 years ago
- ☆16 · Feb 24, 2025 · Updated last year
- ☆16 · Sep 25, 2025 · Updated 5 months ago
- ☆11 · Jan 14, 2021 · Updated 5 years ago
- Collection of evaluation code for natural language generation. · ☆12 · Jan 6, 2021 · Updated 5 years ago