Raibows / CREAM
Code for "CREAM: Consistency Regularized Self-Rewarding Language Models", ICLR 2025.
☆18Updated last month
Alternatives and similar repositories for CREAM:
Users that are interested in CREAM are comparing it to the libraries listed below
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning☆55Updated last month
- ☆21Updated 8 months ago
- [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight)☆54Updated 5 months ago
- A Self-Training Framework for Vision-Language Reasoning☆71Updated 2 months ago
- [NeurIPS'24] Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models☆56Updated 3 months ago
- code for ACL24 "MELoRA: Mini-Ensemble Low-Rank Adapter for Parameter-Efficient Fine-Tuning"☆19Updated last month
- TokenSkip: Controllable Chain-of-Thought Compression in LLMs☆98Updated 2 weeks ago
- SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Model https://arxiv.org/pdf/2411.02433☆24Updated 3 months ago
- Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models."☆42Updated 5 months ago
- ☆43Updated 5 months ago
- [ICLR 2025] MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation☆41Updated 3 months ago
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction☆62Updated last month
- ☆83Updated 2 weeks ago
- (ICLR2025 Spotlight) DEEM: Official implementation of Diffusion models serve as the eyes of large language models for image perception.☆27Updated 2 weeks ago
- Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective (ACL 2024)☆44Updated 5 months ago
- ☆81Updated 2 months ago
- PyTorch implementation of StableMask (ICML'24)☆12Updated 9 months ago
- official implementation of ICLR'2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and…☆35Updated 2 weeks ago
- A Survey on the Honesty of Large Language Models☆56Updated 3 months ago
- We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLMs.☆60Updated 5 months ago
- M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning☆55Updated 3 months ago
- Official implementation for "MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?"☆44Updated last month
- This repo contains the code for the paper "Understanding and Mitigating Hallucinations in Large Vision-Language Models via Modular Attrib…☆12Updated 2 weeks ago
- ☆70Updated 2 months ago
- [ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibrati…☆33Updated 8 months ago
- Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision".☆52Updated 3 months ago
- ☆16Updated 2 months ago
- [NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs.☆104Updated last week
- ☆25Updated 10 months ago
- Official implementation of paper 'Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in Multimodal …☆46Updated last month