yangbang18 / ZeroNLG
(TPAMI'2024) ZeroNLG: Aligning and Autoencoding Domains for Zero-Shot Multimodal and Multilingual Natural Language Generation
☆20Updated 6 months ago
Alternatives and similar repositories for ZeroNLG:
Users who are interested in ZeroNLG are comparing it to the repositories listed below:
- PyTorch Implementation of the Model from "MIRASOL3B: A MULTIMODAL AUTOREGRESSIVE MODEL FOR TIME-ALIGNED AND CONTEXTUAL MODALITIES"☆26Updated 3 weeks ago
- [ICLR 2025] MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation☆34Updated 2 months ago
- The released data for paper "Measuring and Improving Chain-of-Thought Reasoning in Vision-Language Models".☆32Updated last year
- Code and data for ACL 2024 paper on 'Cross-Modal Projection in Multimodal LLMs Doesn't Really Project Visual Attributes to Textual Space'☆11Updated 7 months ago
- 🦩 Visual Instruction Tuning with Polite Flamingo - training multi-modal LLMs to be both clever and polite! (AAAI-24 Oral)☆63Updated last year
- AAAI 2024: DenoSent: A Denoising Objective for Self-Supervised Sentence Representation Learning☆15Updated 9 months ago
- Preference Learning for LLaVA☆37Updated 3 months ago
- [2024-ACL]: TextBind: Multi-turn Interleaved Multimodal Instruction-following in the Wild☆47Updated last year
- Code for our Paper "All in an Aggregated Image for In-Image Learning"☆29Updated 10 months ago
- NAACL 2022: MCSE: Multimodal Contrastive Learning of Sentence Embeddings☆55Updated 8 months ago
- Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models☆43Updated 8 months ago
- Data for evaluating GPT-4V☆11Updated last year
- A curated list of vision-and-language pre-training (VLP). :-)☆57Updated 2 years ago
- [ICLR 23] Contrastive Alignment of Vision to Language Through Parameter-Efficient Transfer Learning☆38Updated last year
- Code for paper: Unified Text-to-Image Generation and Retrieval☆13Updated 7 months ago
- Nearest Neighbor Normalization (EMNLP 2024)☆18Updated 3 months ago
- Code and data for "Instruct Once, Chat Consistently in Multiple Rounds: An Efficient Tuning Framework for Dialogue" (ACL 2024)☆22Updated 6 months ago
- Official code and dataset for our NAACL 2024 paper: DialogCC: An Automated Pipeline for Creating High-Quality Multi-modal Dialogue Datase…☆12Updated 7 months ago
- This repository contains code to evaluate various multimodal large language models using different instructions across multiple multimoda…☆26Updated 9 months ago
- Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective (ACL 2024)☆42Updated 3 months ago
- Source code and data used in the papers ViQuAE (Lerner et al., SIGIR'22), Multimodal ICT (Lerner et al., ECIR'23) and Cross-modal Retriev…☆31Updated 2 months ago
- Mr. Right: Multimodal Retrieval on Representation of ImaGe witH Text☆24Updated 2 years ago