JerryYLi / valhalla-nmt
Code repository for CVPR 2022 paper "VALHALLA: Visual Hallucination for Machine Translation"
☆27Updated last year
Related projects:
- A collection of models for image<->text generation in ACM MM 2021.☆64Updated 2 years ago
- Official GitHub repo for the Findings of EMNLP 2021 paper "An animated picture says at least a thousand words: Selecting Gif-based Replie…☆32Updated 2 years ago
- Repository for Multilingual-VQA task created during HuggingFace JAX/Flax community week.☆34Updated 3 years ago
- Source code for the GPT-2 story generation models in the EMNLP 2020 paper "STORIUM: A Dataset and Evaluation Platform for Human-in-the-Lo…☆38Updated 8 months ago
- Audio Visual Scene-Aware Dialog (AVSD) Challenge at the 10th Dialog System Technology Challenge (DSTC)☆27Updated 2 years ago
- PyTorch version of VidLanKD: Improving Language Understanding via Video-Distilled Knowledge Transfer (NeurIPS 2021)☆56Updated last year
- [KBS] PCAE: A Framework of Plug-in Conditional Auto-Encoder for Controllable Text Generation (PyTorch implementation)☆23Updated last year
- TuRnIng POint Dataset☆46Updated 4 years ago
- Mr. Right: Multimodal Retrieval on Representation of ImaGe witH Text☆18Updated 2 years ago
- Official code repository for the EMNLP 2021 paper☆26Updated 2 years ago
- In-the-wild Question Answering☆15Updated last year
- PyTorch code for "Perceiver-VL: Efficient Vision-and-Language Modeling with Iterative Latent Attention" (WACV 2023)☆32Updated last year
- DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generation Models (ICCV 2023)☆136Updated 9 months ago
- The official code of our paper at EMNLP 2022: Back to the Future: Bidirectional Information Decoupling Network for Multi-turn Dialogue Mo…☆15Updated last year
- SlideVQA: A Dataset for Document Visual Question Answering on Multiple Images (AAAI2023)☆73Updated 11 months ago
- Implementation of "Visualize Before You Write: Imagination-Guided Open-Ended Text Generation".☆17Updated last year
- PyTorch code for EMNLP 2020 paper "X-LXMERT: Paint, Caption and Answer Questions with Multi-Modal Transformers"☆50Updated 3 years ago
- NAACL 2022: MCSE: Multimodal Contrastive Learning of Sentence Embeddings☆52Updated 3 months ago
- Can LLMs generate code-mixed sentences through zero-shot prompting?☆11Updated last year
- GitHub repository for Plot and Rework: Modeling Storylines for Visual Storytelling (ACL-IJCNLP 2021 Findings)☆20Updated 2 years ago
- The code repository for EMNLP 2021 paper "Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization".☆54Updated 2 years ago
- Code for DVD: A Diagnostic Dataset for Multi-step Reasoning in Video Grounded Dialogue☆13Updated 2 years ago