aimagelab / PMA-Net
With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning. ICCV 2023
☆16Updated 7 months ago
Alternatives and similar repositories for PMA-Net:
Users that are interested in PMA-Net are comparing it to the libraries listed below
- [CVPR 2024] Retrieval-Augmented Image Captioning with External Visual-Name Memory for Open-World Comprehension☆44Updated 9 months ago
- ☆34Updated last year
- ☆28Updated last year
- (CVPR2024) MeaCap: Memory-Augmented Zero-shot Image Captioning☆43Updated 5 months ago
- Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation. CVPR 2023☆58Updated 3 months ago
- VQACL: A Novel Visual Question Answering Continual Learning Setting (CVPR'23)☆33Updated 10 months ago
- [EMNLP 2024] IFCap: Image-like Retrieval and Frequency-based Entity Filtering for Zero-shot Captioning☆11Updated 2 months ago
- [ICCV 2021] Official implementation of the paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering"☆66Updated 3 years ago
- (ACL'2023) MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning☆35Updated 5 months ago
- [ECCV 2022] GEB+: A Benchmark for Generic Event Boundary Captioning, Grounding and Retrieval☆16Updated 2 years ago
- [CVPR 2024] Context-Guided Spatio-Temporal Video Grounding☆45Updated 7 months ago
- Can I Trust Your Answer? Visually Grounded Video Question Answering (CVPR'24, Highlight)☆63Updated 6 months ago
- NegCLIP.☆30Updated last year
- [ICLR2024] The official implementation of paper "UniAdapter: Unified Parameter-Efficient Transfer Learning for Cross-modal Modeling", by …☆72Updated last year
- This is the official implementation of RGNet: A Unified Retrieval and Grounding Network for Long Videos☆12Updated 5 months ago
- Official PyTorch code of "Grounded Question-Answering in Long Egocentric Videos", accepted by CVPR 2024.☆56Updated 4 months ago
- Composed Video Retrieval☆49Updated 8 months ago
- COLA: Evaluate how well your vision-language model can Compose Objects Localized with Attributes!☆24Updated 2 months ago
- Source code of our CVPR2024 paper TeachCLIP for Text-to-Video Retrieval☆26Updated 2 weeks ago
- Video Graph Transformer for Video Question Answering (ECCV'22)☆46Updated last year
- [ECCV 2024] EgoCVR: An Egocentric Benchmark for Fine-Grained Composed Video Retrieval☆33Updated 5 months ago
- 【ICLR 2024, Spotlight】Sentence-level Prompts Benefit Composed Image Retrieval☆74Updated 9 months ago
- This is the official repository for the paper "Visually-Prompted Language Model for Fine-Grained Scene Graph Generation in an Open World"…☆46Updated 10 months ago
- PyTorch code for "Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training"☆31Updated 10 months ago
- Code implementation of paper "MUSE: Mamba is Efficient Multi-scale Learner for Text-video Retrieval"☆16Updated 4 months ago
- [CVPR2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory"☆66Updated 3 months ago
- Repository for the paper: Teaching VLMs to Localize Specific Objects from In-context Examples☆20Updated 2 months ago
- Implementation for CVPR 2022 paper " Injecting Semantic Concepts into End-to-End Image Captionin".☆42Updated 2 years ago
- Official PyTorch implementation of our CVPR 2022 paper: Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual Context for …☆60Updated 2 years ago
- [CVPR 2024] Do you remember? Dense Video Captioning with Cross-Modal Memory Retrieval☆48Updated 7 months ago