UCSC-VLAA / Sight-Beyond-Text
[TMLR 2024] Official implementation of "Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness and Ethics"
☆19 · Updated last year
Alternatives and similar repositories for Sight-Beyond-Text:
Users who are interested in Sight-Beyond-Text are comparing it to the repositories listed below.
- [NLPCC'23] ZeroGen: Zero-shot Multimodal Controllable Text Generation with Multiple Oracles PyTorch Implementation ☆12 · Updated last year
- The released data for paper "Measuring and Improving Chain-of-Thought Reasoning in Vision-Language Models". ☆32 · Updated last year
- [EMNLP'23 Oral] ReSee: Responding through Seeing Fine-grained Visual Knowledge in Open-domain Dialogue PyTorch Implementation ☆12 · Updated last year
- DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding ☆38 · Updated this week
- A Comprehensive Benchmark for Robust Multi-image Understanding ☆10 · Updated 6 months ago
- [ICCV23] Official implementation of eP-ALM: Efficient Perceptual Augmentation of Language Models. ☆27 · Updated last year
- (ICLR2025 Spotlight) DEEM: Official implementation of Diffusion models serve as the eyes of large language models for image perception. ☆27 · Updated 3 weeks ago
- Code for "Are “Hierarchical” Visual Representations Hierarchical?" in NeurIPS Workshop for Symmetry and Geometry in Neural Representation… ☆20 · Updated last year
- Source code for the paper "Prefix Language Models are Unified Modal Learners" ☆43 · Updated last year
- Official code for "pi-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation", ICML 2023. ☆32 · Updated last year
- Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models ☆43 · Updated 9 months ago
- Preference Learning for LLaVA ☆41 · Updated 4 months ago
- Official Code Release for "Diagnosing and Rectifying Vision Models using Language" (ICLR 2023) ☆33 · Updated last year
- Implementation and dataset for paper "Can MLLMs Perform Text-to-Image In-Context Learning?" ☆34 · Updated 2 weeks ago
- Code for 'Why is Winoground Hard? Investigating Failures in Visuolinguistic Compositionality', EMNLP 2022 ☆30 · Updated last year
- [ICML 2024] Fool Your (Vision and) Language Model With Embarrassingly Simple Permutations ☆14 · Updated last year
- Code for paper: Unified Text-to-Image Generation and Retrieval ☆14 · Updated 8 months ago
- Code for ACL 2023 Oral Paper: ManagerTower: Aggregating the Insights of Uni-Modal Experts for Vision-Language Representation Learning ☆11 · Updated 3 months ago
- Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching" ☆35 · Updated 7 months ago
- [ICLR 2025] Official Benchmark Toolkits for "Visual Haystacks: A Vision-Centric Needle-In-A-Haystack Benchmark" ☆26 · Updated last month
- Official Repository of Personalized Visual Instruct Tuning ☆28 · Updated 3 weeks ago