xiaojieli0903 / FGKVMemPred_video
Official repository of the "Fine-grained Key-Value Memory Enhanced Predictor for Video Representation Learning" (ACM MM 2023)
☆21Updated 3 months ago
Related projects ⓘ
Alternatives and complementary repositories for FGKVMemPred_video
- Official repository of the “Mask Again: Masked Knowledge Distillation for Masked Video Modeling” (ACM MM 2023)☆24Updated 3 months ago
- [ECCV 2024] Official repository of "GenView: Enhancing View Quality with Pretrained Generative Model for Self-Supervised Learning".☆23Updated this week
- CorDA: Context-Oriented Decomposition Adaptation of Large Language Models for task-aware parameter-efficient fine-tuning(NeurIPS 2024)☆33Updated last week
- ☆13Updated last month
- ☆11Updated 10 months ago
- Can I Trust Your Answer? Visually Grounded Video Question Answering (CVPR'24, Highlight)☆58Updated 4 months ago
- [ICML 2024] Official implementation for "HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding"☆68Updated 5 months ago
- [NeurIPS 2022 Spotlight] Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations☆121Updated 7 months ago
- This is a summary of research on noisy correspondence. There may be omissions. If anything is missing please get in touch with us. Our em…☆43Updated last month
- [ICME 2024 Oral] DARA: Domain- and Relation-aware Adapters Make Parameter-efficient Tuning for Visual Grounding☆14Updated 2 weeks ago
- This repo holds the official code and data for "Beyond Literal Descriptions: Understanding and Locating Open-World Objects Aligned with H…☆17Updated 5 months ago
- ☆35Updated 7 months ago
- Official repository for "Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting" [CVPR 2023]☆107Updated last year
- [ICCV2023] - CTP: Towards Vision-Language Continual Pretraining via Compatible Momentum Contrast and Topology Preservation☆28Updated last month
- ☆64Updated 2 weeks ago
- CPL: Weakly Supervised Temporal Sentence Grounding with Gaussian-based Contrastive Proposal Learning☆57Updated 7 months ago
- Official github repo for ICCV2023 paper 'Multi-event Video-Text Retrieval'☆18Updated 8 months ago
- Source code of our CVPR2024 paper TeachCLIP for Text-to-Video Retrieval☆17Updated last week
- HallE-Control: Controlling Object Hallucination in LMMs☆28Updated 7 months ago
- [CVPR 2024] Retrieval-Augmented Image Captioning with External Visual-Name Memory for Open-World Comprehension☆34Updated 7 months ago
- Official repository of ”Mamba-FSCIL: Dynamic Adaptation with Selective State Space Model for Few-Shot Class-Incremental Learning"☆20Updated 2 months ago
- Official implementation of paper 'Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in Multimodal …☆27Updated last week
- A Versatile Video-LLM for Long and Short Video Understanding with Superior Temporal Localization Ability☆16Updated this week
- [NeurIPS'22] This is an official implementation for "Scaling & Shifting Your Features: A New Baseline for Efficient Model Tuning".☆172Updated last year
- [ECCV2024] Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models☆15Updated 3 months ago
- Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline (CVPR 2023)☆58Updated 8 months ago
- ☆33Updated 10 months ago
- [Preprint] TRACE: Temporal Grounding Video LLM via Casual Event Modeling☆36Updated this week
- This is the first released survey paper on hallucinations of large vision-language models (LVLMs). To keep track of this field and contin…☆46Updated 3 months ago