UCSC-VLAA / Sight-Beyond-Text
[TMLR 2024] Official implementation of "Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness and Ethics"
☆19 · Updated last year
Alternatives and similar repositories for Sight-Beyond-Text:
Users who are interested in Sight-Beyond-Text are comparing it to the repositories listed below.
- [NLPCC'23] ZeroGen: Zero-shot Multimodal Controllable Text Generation with Multiple Oracles PyTorch Implementation ☆12 · Updated last year
- The released data for paper "Measuring and Improving Chain-of-Thought Reasoning in Vision-Language Models". ☆32 · Updated last year
- ☆22 · Updated 2 years ago
- [EMNLP'23 Oral] ReSee: Responding through Seeing Fine-grained Visual Knowledge in Open-domain Dialogue PyTorch Implementation ☆12 · Updated last year
- Official Code for ACL 2023 Outstanding Paper: World-to-Words: Grounded Open Vocabulary Acquisition through Fast Mapping in Vision-Languag… ☆30 · Updated last year
- ☆31 · Updated last year
- ☆34 · Updated last year
- Code for paper: Unified Text-to-Image Generation and Retrieval ☆13 · Updated 7 months ago
- ☆26 · Updated 2 years ago
- ☆54 · Updated 10 months ago
- This repo contains code for "VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by VIdeo SpatioTemporal Augmentation" ☆11 · Updated last month
- Source code for the paper "Prefix Language Models are Unified Modal Learners" ☆43 · Updated last year
- Preference Learning for LLaVA ☆37 · Updated 3 months ago
- Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching" ☆35 · Updated 6 months ago
- [Arxiv] Aligning Modalities in Vision Large Language Models via Preference Fine-tuning ☆79 · Updated 9 months ago
- Official implementation of our paper "Finetuned Multimodal Language Models are High-Quality Image-Text Data Filters". ☆44 · Updated last month
- [DMLR 2024] Benchmarking Robustness of Multimodal Image-Text Models under Distribution Shift ☆35 · Updated last year
- Official Repository of Personalized Visual Instruct Tuning ☆26 · Updated 3 months ago
- Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models ☆43 · Updated 8 months ago
- ☆27 · Updated last year
- ☆19 · Updated last year
- We introduce a new approach, Token Reduction using CLIP Metric (TRIM), aimed at improving the efficiency of MLLMs without sacrificing their… ☆12 · Updated 2 months ago
- The official code for paper "EasyGen: Easing Multimodal Generation with a Bidirectional Conditional Diffusion Model and LLMs" ☆73 · Updated 3 months ago
- ☆10 · Updated 3 months ago
- ☆38 · Updated 3 months ago
- Code for 'Why is Winoground Hard? Investigating Failures in Visuolinguistic Compositionality', EMNLP 2022 ☆30 · Updated last year
- [ICLR 23] Contrastive Alignment of Vision to Language Through Parameter-Efficient Transfer Learning ☆38 · Updated last year
- ☆17 · Updated 7 months ago
- VaLM: Visually-augmented Language Modeling. ICLR 2023. ☆56 · Updated last year