xiaojieli0903 / FGKVMemPred_video
Official repository of the "Fine-grained Key-Value Memory Enhanced Predictor for Video Representation Learning" (ACM MM 2023)
☆23Updated 6 months ago
Alternatives and similar repositories for FGKVMemPred_video:
Users who are interested in FGKVMemPred_video are comparing it to the repositories listed below
- Official repository of the “Mask Again: Masked Knowledge Distillation for Masked Video Modeling” (ACM MM 2023)☆26Updated 6 months ago
- Official code of "Continuous Knowledge-Preserving Decomposition for Few-Shot Continual Learning"☆20Updated this week
- [ECCV 2024] Official repository of "GenView: Enhancing View Quality with Pretrained Generative Model for Self-Supervised Learning".☆27Updated last month
- CorDA: Context-Oriented Decomposition Adaptation of Large Language Models for task-aware parameter-efficient fine-tuning (NeurIPS 2024)☆41Updated last week
- Official repository of "Mamba-FSCIL: Dynamic Adaptation with Selective State Space Model for Few-Shot Class-Incremental Learning"☆31Updated 5 months ago
- [ICML 2024] Official implementation for "HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding"☆78Updated last month
- This is the first released survey paper on hallucinations of large vision-language models (LVLMs). To keep track of this field and contin…☆58Updated 5 months ago
- Weakly Supervised Gaussian Contrastive Grounding with Large Multimodal Models for Video Question Answering [ACM MM'24]☆9Updated 6 months ago
- mPLUG-HalOwl: Multimodal Hallucination Evaluation and Mitigating☆87Updated 11 months ago
- Can I Trust Your Answer? Visually Grounded Video Question Answering (CVPR'24, Highlight)☆63Updated 6 months ago
- [ICCV2023] - CTP: Towards Vision-Language Continual Pretraining via Compatible Momentum Contrast and Topology Preservation☆31Updated 3 months ago
- [NeurIPS 2022 Spotlight] Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations☆126Updated 9 months ago
- [Preprint] TRACE: Temporal Grounding Video LLM via Causal Event Modeling☆58Updated 2 weeks ago
- The official GitHub page for "Evaluating Object Hallucination in Large Vision-Language Models"☆194Updated 9 months ago
- This repo holds the official code and data for "Beyond Literal Descriptions: Understanding and Locating Open-World Objects Aligned with H…☆17Updated 8 months ago
- Papers about Hallucination in Multi-Modal Large Language Models (MLLMs)☆75Updated 2 months ago
- [CVPR2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory"☆66Updated 3 months ago
- A Prompted Visual Hallucination Evaluation Dataset, featuring over 100,000 data points and four advanced evaluation modes. The dataset in…☆11Updated last month
- Official code for CVPR 2024 paper, "SC-Tune: Unleashing Self-Consistent Referential Comprehension in Large Vision Language Models"☆16Updated 9 months ago
- [ICLR 2024] Analyzing and Mitigating Object Hallucination in Large Vision-Language Models☆140Updated 8 months ago
- Official repository for "Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting" [CVPR 2023]☆113Updated last year