OpenCausaLab / CELLOLinks
☆21Updated 7 months ago
Alternatives and similar repositories for CELLO
Users that are interested in CELLO are comparing it to the libraries listed below
Sorting:
- The official repository of "SmartAgent: Chain-of-User-Thought for Embodied Personalized Agent in Cyber World".☆27Updated 2 months ago
- [ICLR 2025] Official codebase for the ICLR 2025 paper "Multimodal Situational Safety"☆16Updated 3 months ago
- Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models."☆42Updated 7 months ago
- Code for "CREAM: Consistency Regularized Self-Rewarding Language Models", ICLR 2025.☆22Updated 3 months ago
- Generalizing from SIMPLE to HARD Visual Reasoning: Can We Mitigate Modality Imbalance in VLMs?☆13Updated this week
- Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts"☆15Updated 2 months ago
- ☆12Updated 3 weeks ago
- Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models☆36Updated 2 weeks ago
- [TMLR 2024] Official implementation of "Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness and Ethics"☆19Updated last year
- ☆54Updated last year
- G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning☆44Updated 2 weeks ago
- Official Code for ACL 2023 Outstanding Paper: World-to-Words: Grounded Open Vocabulary Acquisition through Fast Mapping in Vision-Languag…☆32Updated last year
- [ICML 2024] Language Models Represent Beliefs of Self and Others☆32Updated 8 months ago
- [ICLR 2025] Mitigating Modality Prior-Induced Hallucinations in Multimodal Large Language Models via Deciphering Attention Causality☆31Updated last month
- (ICLR2025 Spotlight) DEEM: Official implementation of Diffusion models serve as the eyes of large language models for image perception.☆34Updated 2 months ago
- ☆16Updated 4 months ago
- Github repository for "Why Is Spatial Reasoning Hard for VLMs? An Attention Mechanism Perspective on Focus Areas" (ICML 2025)☆30Updated last month
- Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective (ACL 2024)☆50Updated 7 months ago
- Optimizing Anytime Reasoning via Budget Relative Policy Optimization☆36Updated last week
- Enhancing Large Vision Language Models with Self-Training on Image Comprehension.☆68Updated last year
- Codes for ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding☆32Updated last month
- ☆18Updated 9 months ago
- ☆19Updated 3 weeks ago
- DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding☆58Updated 2 months ago
- ☆31Updated last year
- Mosaic IT: Enhancing Instruction Tuning with Data Mosaics☆18Updated 3 months ago
- ☆16Updated 4 months ago
- Code and data for ACL 2024 paper on 'Cross-Modal Projection in Multimodal LLMs Doesn't Really Project Visual Attributes to Textual Space'☆15Updated 10 months ago
- The benchmark and datasets of the ICML 2024 paper "VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems in Visual C…☆15Updated last year
- JailDAM: Jailbreak Detection with Adaptive Memory for Vision-Language Model☆11Updated last week