CLCS-SUSTech / FACE
☆12Updated 6 months ago
Alternatives and similar repositories for FACE
Users who are interested in FACE are comparing it to the libraries listed below.
- Code base for "Detecting Subtle Differences between Human and Model Languages Using Spectrum of Relative Likelihood"☆13Updated 2 months ago
- TS-LLaVA: Constructing Visual Tokens through Thumbnail-and-Sampling for Training-Free Video Large Language Models☆15Updated 5 months ago
- [EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models.☆74Updated 6 months ago
- TimeChat-online: 80% Visual Tokens are Naturally Redundant in Streaming Videos☆42Updated 2 weeks ago
- ☆37Updated 10 months ago
- [NeurIPS 2024] Mitigating Object Hallucination via Concentric Causal Attention☆56Updated 5 months ago
- Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective (ACL 2024)☆50Updated 7 months ago
- 🚀 Global Compression Commander: Plug-and-Play Inference Acceleration for High-Resolution Large Vision-Language Models☆27Updated 2 weeks ago
- Repository for the paper: Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models☆27Updated last year
- Code for Reducing Hallucinations in Vision-Language Models via Latent Space Steering☆57Updated 6 months ago
- NegCLIP.☆32Updated 2 years ago
- Official Code for ACL 2023 Outstanding Paper: World-to-Words: Grounded Open Vocabulary Acquisition through Fast Mapping in Vision-Languag…☆32Updated last year
- XL-VLMs: General Repository for eXplainable Large Vision Language Models☆24Updated 2 weeks ago
- MoCLE (First MLLM with MoE for instruction customization and generalization!) (https://arxiv.org/abs/2312.12379)☆38Updated last year
- [NeurIPS 2023] Bootstrapping Vision-Language Learning with Decoupled Language Pre-training☆25Updated last year
- Code for Learning the Unlearned: Mitigating Feature Suppression in Contrastive Learning☆16Updated 10 months ago
- Official implementation of "Connect, Collapse, Corrupt: Learning Cross-Modal Tasks with Uni-Modal Data" (ICLR 2024)☆31Updated 7 months ago
- [ICLR 2025] MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation☆82Updated 5 months ago
- [Arxiv] Aligning Modalities in Vision Large Language Models via Preference Fine-tuning☆86Updated last year
- [NeurIPS 2024] Official Code for the Paper "Multimodal Task Vectors Enable Many-Shot Multimodal In-Context Learning"☆21Updated 2 months ago
- Multi-Stage Vision Token Dropping: Towards Efficient Multimodal Large Language Model☆31Updated 5 months ago
- (ICLR2025 Spotlight) DEEM: Official implementation of Diffusion models serve as the eyes of large language models for image perception.☆34Updated 2 months ago
- This repo contains evaluation code for the paper "AV-Odyssey: Can Your Multimodal LLMs Really Understand Audio-Visual Information?"☆25Updated 5 months ago
- ☆21Updated 7 months ago
- Training A Small Emotional Vision Language Model for Visual Art Comprehension☆16Updated 10 months ago
- [CVPR 2024] Retrieval-Augmented Image Captioning with External Visual-Name Memory for Open-World Comprehension☆51Updated last year
- [WACV 2025] Official Pytorch code for "Background-aware Moment Detection for Video Moment Retrieval"☆14Updated 3 months ago
- [CVPR 2024] How to Configure Good In-Context Sequence for Visual Question Answering☆18Updated last week
- ☆18Updated 10 months ago
- This is the official repo for Debiasing Large Visual Language Models, including a Post-Hoc debias method and Visual Debias Decoding strat…☆78Updated 3 months ago