SLIT-AI / ADPA
[ICLR2025 Spotlight] Advantage-Guided Distillation for Preference Alignment in Small Language Models
☆19 Updated 3 months ago
Alternatives and similar repositories for ADPA
Users who are interested in ADPA are comparing it to the repositories listed below.
- ☆19 Updated 3 weeks ago
- (ICLR2025 Spotlight) DEEM: Official implementation of Diffusion models serve as the eyes of large language models for image perception. ☆34 Updated 2 months ago
- Representation Surgery for Multi-Task Model Merging. ICML, 2024. ☆45 Updated 7 months ago
- ☆23 Updated 3 months ago
- Official code for "pi-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation", ICML 2023. ☆33 Updated last year
- [ICLR 2025] Code&Data for the paper "Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization" ☆13 Updated 11 months ago
- ☆27 Updated last year
- Optimizing Anytime Reasoning via Budget Relative Policy Optimization ☆36 Updated last week
- Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning ☆22 Updated 8 months ago
- What Makes a Reward Model a Good Teacher? An Optimization Perspective ☆31 Updated last month
- EMPO, A Fully Unsupervised RLVR Method ☆30 Updated this week
- Analyzing and Reducing Catastrophic Forgetting in Parameter Efficient Tuning ☆31 Updated 6 months ago
- [ICML 2023] "Robust Weight Signatures: Gaining Robustness as Easy as Patching Weights?" by Ruisi Cai, Zhenyu Zhang, Zhangyang Wang ☆16 Updated 2 years ago
- ☆18 Updated 9 months ago
- V1: Toward Multimodal Reasoning by Designing Auxiliary Task ☆34 Updated last month
- ☆44 Updated 2 years ago
- [NeurIPS 2024] Official code of $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$ ☆45 Updated 7 months ago
- [NeurIPS 2023] Make Your Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning ☆30 Updated 2 years ago
- ☆15 Updated 9 months ago
- SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Model https://arxiv.org/pdf/2411.02433 ☆25 Updated 6 months ago
- [NeurIPS 2023] Official repository for "Distilling Out-of-Distribution Robustness from Vision-Language Foundation Models" ☆12 Updated 11 months ago
- Mosaic IT: Enhancing Instruction Tuning with Data Mosaics ☆18 Updated 3 months ago
- Less is More: Task-aware Layer-wise Distillation for Language Model Compression (ICML2023) ☆35 Updated last year
- Code for paper "Merging Multi-Task Models via Weight-Ensembling Mixture of Experts" ☆24 Updated last year
- ☆16 Updated last month
- Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models." ☆42 Updated 7 months ago
- ☆12 Updated 4 months ago
- [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight) ☆85 Updated 7 months ago
- A Sober Look at Language Model Reasoning ☆63 Updated last week
- ☆27 Updated last year