MightXiong / FedMIT
☆11 Updated 5 months ago
Alternatives and similar repositories for FedMIT
Users who are interested in FedMIT are comparing it to the repositories listed below.
- [NeurIPS 2022 Spotlight] Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations ☆139 Updated last year
- Code for paper "LLMs Can Evolve Continually on Modality for X-Modal Reasoning" NeurIPS2024 ☆37 Updated 8 months ago
- Can I Trust Your Answer? Visually Grounded Video Question Answering (CVPR'24, Highlight) ☆79 Updated last year
- [ICCV2023] - CTP: Towards Vision-Language Continual Pretraining via Compatible Momentum Contrast and Topology Preservation ☆36 Updated 10 months ago
- This is the first released survey paper on hallucinations of large vision-language models (LVLMs). To keep track of this field and contin… ☆76 Updated last year
- Paper Reading of IMCC groups. ☆17 Updated 2 months ago
- [CVPR 2022] Pseudo-Q: Generating Pseudo Language Queries for Visual Grounding ☆150 Updated last year
- The official GitHub page for ''Evaluating Object Hallucination in Large Vision-Language Models'' ☆219 Updated 2 weeks ago
- Official implementation of HawkEye: Training Video-Text LLMs for Grounding Text in Videos ☆42 Updated last year
- ☆94 Updated last year
- [ICCV 2023] Prompt-aligned Gradient for Prompt Tuning ☆166 Updated 2 years ago
- [ICCV 2023] Code for "Not All Features Matter: Enhancing Few-shot CLIP with Adaptive Prior Refinement" ☆149 Updated last year
- Instruction Tuning in Continual Learning paradigm ☆58 Updated 7 months ago
- ☆349 Updated last year
- Official repository of the "Fine-grained Key-Value Memory Enhanced Predictor for Video Representation Learning" (ACM MM 2023) ☆23 Updated last year
- ☆13 Updated 2 years ago
- [NeurIPS'22] This is an official implementation for "Scaling & Shifting Your Features: A New Baseline for Efficient Model Tuning". ☆185 Updated last year
- [ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning ☆285 Updated last year
- [NeurIPS 2024] Visual Perception by Large Language Model’s Weights ☆45 Updated 5 months ago
- Official github repo for ICCV2023 paper 'Multi-event Video-Text Retrieval' ☆18 Updated last year
- [CVPR 2025] LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant ☆142 Updated last month
- ☆189 Updated last year
- [TMM 2023] Self-paced Curriculum Adapting of CLIP for Visual Grounding. ☆130 Updated 3 weeks ago
- Collection of Composed Image Retrieval (CIR) papers. ☆254 Updated 2 weeks ago
- [NeurIPS2023] Exploring Diverse In-Context Configurations for Image Captioning ☆41 Updated 9 months ago
- [ICML 2024] Official implementation for "HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding" ☆96 Updated 9 months ago
- Latest open-source "Thinking with images" (O3/O4-mini) papers, covering training-free, SFT-based, and RL-enhanced methods for "fine-grain… ☆87 Updated 2 weeks ago
- VQACL: A Novel Visual Question Answering Continual Learning Setting (CVPR'23) ☆40 Updated last year
- [CVPR 2024] LION: Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge ☆150 Updated this week
- ☆138 Updated 6 months ago