feizc / DeeCap
Dynamic Early Exit for Image Captioning
☆17 · Updated 2 years ago
Alternatives and similar repositories for DeeCap:
Users who are interested in DeeCap are comparing it to the libraries listed below
- Official Code for "Knowing what it is: Semantic-enhanced Dual Attention Transformer" (TMM2022) ☆19 · Updated 2 years ago
- Official pytorch implementation of paper "ProtoCLIP: Prototypical Contrastive Language Image Pretraining" (IEEE TNNLS) ☆52 · Updated last year
- [CVPR 2022] This repository is for the paper "DIFNet: Boosting Visual Information Flow for Image Captioning". ☆20 · Updated 2 years ago
- [ECCV'22 Poster] Explicit Image Caption Editing ☆22 · Updated 2 years ago
- Lightweight Transformer for Multi-modal Tasks ☆15 · Updated 2 years ago
- [arXiv] Cross-Modal Adapter for Text-Video Retrieval ☆55 · Updated 2 years ago
- [ICCV 2021] Official implementation of the paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering" ☆66 · Updated 3 years ago
- Microsoft COCO Caption Evaluation Tool - Python 3 ☆33 · Updated 5 years ago
- ☆44 · Updated 2 years ago
- [ICCV2023] Official code for "VL-PET: Vision-and-Language Parameter-Efficient Tuning via Granularity Control" ☆53 · Updated last year
- Winner solution to Generic Event Boundary Captioning task in LOVEU Challenge (CVPR 2023 workshop) ☆30 · Updated last year
- Implementation of our IJCAI2022 oral paper, ER-SAN: Enhanced-Adaptive Relation Self-Attention Network for Image Captioning. ☆22 · Updated last year
- Official implementation of our EMNLP 2022 paper "CPL: Counterfactual Prompt Learning for Vision and Language Models" ☆33 · Updated 2 years ago
- Official PyTorch implementation of Clover: Towards A Unified Video-Language Alignment and Fusion Model (CVPR2023) ☆41 · Updated 2 years ago
- This repository contains 2 tools: - A py3 Lib for NLP & image-caption metrics - Code for a two-tailed t-test with paired samples. It wil… ☆18 · Updated 4 years ago
- ☆19 · Updated 2 years ago
- ☆22 · Updated 3 years ago
- Seeing What You Miss: Vision-Language Pre-training with Semantic Completion Learning ☆20 · Updated last year
- [ACL 2021] Learning Relation Alignment for Calibrated Cross-modal Retrieval ☆30 · Updated last year
- The Pytorch implementation for "Video-Text Pre-training with Learned Regions" ☆42 · Updated 2 years ago
- ☆36 · Updated 2 years ago
- Towards a Unified View on Visual Parameter-Efficient Transfer Learning ☆26 · Updated 2 years ago
- ☆9 · Updated 2 years ago
- Human-like Controllable Image Captioning with Verb-specific Semantic Roles. ☆36 · Updated 3 years ago
- ☆23 · Updated 2 years ago
- [CVPR23] A cascaded diffusion captioning model with a novel semantic-conditional diffusion process that upgrades conventional diffusion m… ☆60 · Updated 9 months ago
- ☆17 · Updated 2 years ago
- [CVPR2022 Oral] The official code for "TransRank: Self-supervised Video Representation Learning via Ranking-based Transformation Recognit… ☆18 · Updated 2 years ago
- Ask&Confirm: Active Detail Enriching for Cross-Modal Retrieval with Partial Query (ICCV2021) ☆20 · Updated 3 years ago
- ☆24 · Updated 2 years ago