tongxuluo / LeaPLinks
Code, Data and Model for Paper "Learning from Peers in Reasoning Models"
☆22Updated 3 weeks ago
Alternatives and similar repositories for LeaP
Users that are interested in LeaP are comparing it to the libraries listed below
Sorting:
- ☆45Updated last month
- [arxiv: 2505.02156] Adaptive Thinking via Mode Policy Optimization for Social Language Agents☆30Updated 2 weeks ago
- Code for "CREAM: Consistency Regularized Self-Rewarding Language Models", ICLR 2025.☆22Updated 3 months ago
- Laser: Learn to Reason Efficiently with Adaptive Length-based Reward Shaping☆41Updated 2 weeks ago
- Can Atomic Step Decomposition Enhance the Self-structured Reasoning of Multimodal Large Models?☆23Updated 2 months ago
- A Survey on the Honesty of Large Language Models☆57Updated 6 months ago
- Official Code and data for ACL 2024 finding, "An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models"☆19Updated 6 months ago
- [ICLR 2025] Released code for paper "Spurious Forgetting in Continual Learning of Language Models"☆46Updated 3 weeks ago
- ☆74Updated last year
- ACL'2025: SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs. and preprint: SoftCoT++: Test-Time Scaling with Soft Chain-of…☆25Updated last week
- Official repository of the video reasoning benchmark MMR-V. Can Your MLLMs "Think with Video"?☆26Updated this week
- RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment☆16Updated 5 months ago
- (ICLR2025 Spotlight) DEEM: Official implementation of Diffusion models serve as the eyes of large language models for image perception.☆34Updated 2 months ago
- Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning".☆73Updated this week
- Official repo for "AlignGPT: Multi-modal Large Language Models with Adaptive Alignment Capability"☆32Updated 10 months ago
- [EMNLP'2023 Findings] MoqaGPT, for zero-shot multimodal question answering with LLMs☆12Updated 5 months ago
- The official code repository for PRMBench.☆73Updated 3 months ago
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning☆69Updated 3 months ago
- code for Learning the Unlearned: Mitigating Feature Suppression in Contrastive Learning☆16Updated 10 months ago
- Code for ACL 2024 accepted paper titled "SAPT: A Shared Attention Framework for Parameter-Efficient Continual Learning of Large Language …☆35Updated 4 months ago
- MoCLE (First MLLM with MoE for instruction customization and generalization!) (https://arxiv.org/abs/2312.12379)☆38Updated last year
- EMPO, A Fully Unsupervised RLVR Method☆30Updated this week
- [ACL 2024] Multi-modal preference alignment remedies regression of visual instruction tuning on language model☆46Updated 6 months ago
- ☆19Updated last month
- ☆46Updated 7 months ago
- [ICLR 2025] Mitigating Modality Prior-Induced Hallucinations in Multimodal Large Language Models via Deciphering Attention Causality☆31Updated last month
- code for ACL24 "MELoRA: Mini-Ensemble Low-Rank Adapter for Parameter-Efficient Fine-Tuning"☆19Updated 3 months ago
- Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models."☆42Updated 7 months ago
- ☆22Updated 11 months ago
- Official code implementation for the ACL 2025 paper: 'CoT-based Synthesizer: Enhancing LLM Performance through Answer Synthesis'☆27Updated 2 weeks ago