Aman-4-Real / MMTG
[ACM MM 2022]: Multi-Modal Experience Inspired AI Creation
☆19Updated 5 months ago
Related projects
Alternatives and complementary repositories for MMTG
- ACL'2024 (Findings): TextBind: Multi-turn Interleaved Multimodal Instruction-following in the Wild☆48Updated last year
- Paper, dataset and code list for multimodal dialogue.☆19Updated 2 months ago
- Data for evaluating GPT-4V☆11Updated last year
- ☆17Updated 7 months ago
- This repo contains code and instructions for baselines in the VLUE benchmark.☆41Updated 2 years ago
- DSTC10 Track1 - MOD: Internet Meme Incorporated Open-domain Dialog☆49Updated last year
- Official repository for the A-OKVQA dataset☆63Updated 6 months ago
- NAACL 2022: MCSE: Multimodal Contrastive Learning of Sentence Embeddings☆53Updated 4 months ago
- VideoNIAH: A Flexible Synthetic Method for Benchmarking Video MLLMs☆25Updated 2 weeks ago
- ☆17Updated 3 months ago
- PyTorch implementation for ACL 2021 paper "Maria: A Visual Experience Powered Conversational Agent".☆25Updated 3 years ago
- Code for our EMNLP-2022 paper: "Towards Robust Visual Question Answering: Making the Most of Biased Samples via Contrastive Learning"☆12Updated last year
- The official code for paper "EasyGen: Easing Multimodal Generation with a Bidirectional Conditional Diffusion Model and LLMs"☆72Updated 8 months ago
- Attaching human-like eyes to the large language model. The codes of IEEE TMM paper "LMEye: An Interactive Perception Network for Large La…☆48Updated 3 months ago
- This repository contains code to evaluate various multimodal large language models using different instructions across multiple multimoda…☆24Updated 6 months ago
- ☆32Updated last year
- This repository includes the official implementation of our paper "Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness …☆19Updated last year
- CVPR 2021 Official Pytorch Code for UC2: Universal Cross-lingual Cross-modal Vision-and-Language Pre-training☆34Updated 3 years ago
- Vision Large Language Models trained on M3IT instruction tuning dataset☆17Updated last year
- Danmuku dataset☆10Updated last year
- Visual and Embodied Concepts evaluation benchmark☆21Updated last year
- ☆100Updated 2 years ago
- Code for "Small Models are Valuable Plug-ins for Large Language Models"☆121Updated last year
- Code for paper "Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning"☆62Updated 9 months ago
- A Synthetic, Scalable and Systematic Evaluation Suite for Large Language Models☆32Updated 4 months ago
- EMNLP2023 - InfoSeek: A New VQA Benchmark Focused on Visual Info-Seeking Questions☆16Updated 5 months ago
- Code for EMNLP 2022 paper “Distilled Dual-Encoder Model for Vision-Language Understanding”☆29Updated last year
- Improving Language Understanding from Screenshots. Paper: https://arxiv.org/abs/2402.14073☆26Updated 4 months ago
- ☆22Updated 3 months ago
- Controllable image captioning model with unsupervised modes☆21Updated last year