junkangwu / alpha-DPO
☆12 · Updated 3 weeks ago
Related projects
Alternatives and complementary repositories for alpha-DPO
- [NeurIPS 2024] Official code of $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$ ☆29 · Updated 2 weeks ago
- Code for "Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective" ☆29 · Updated 6 months ago
- [ACL 2024] Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning ☆30 · Updated 3 months ago
- This is the official implementation of ScaleBiO: Scalable Bilevel Optimization for LLM Data Reweighting ☆14 · Updated 3 months ago
- Long Context Extension and Generalization in LLMs ☆39 · Updated last month
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint" ☆32 · Updated 10 months ago
- ☆15 · Updated 3 months ago
- ☆14 · Updated 8 months ago
- ☆33 · Updated 3 weeks ago
- Code for paper - On Diversified Preferences of Large Language Model Alignment ☆14 · Updated 3 months ago
- [ATTRIB @ NeurIPS 2024] When Attention Sink Emerges in Language Models: An Empirical View ☆27 · Updated 3 weeks ago
- [NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623 ☆67 · Updated last month
- Official implementation for "MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?" ☆37 · Updated 4 months ago
- The repository for our paper: Neighboring Perturbations of Knowledge Editing on Large Language Models ☆14 · Updated 6 months ago
- This is the official repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data" ☆17 · Updated 8 months ago
- [ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal… ☆44 · Updated last year
- ☆41 · Updated last year
- A Closer Look into Mixture-of-Experts in Large Language Models ☆39 · Updated 3 months ago
- [NAACL 2024 Findings] Evaluation suite for the systematic evaluation of instruction selection methods. ☆23 · Updated last year
- Official implementation of paper "General Preference Modeling with Preference Representations for Aligning Language Models" (https://arxi… ☆18 · Updated 2 weeks ago
- Models, data, and codes for the paper: MetaAligner: Towards Generalizable Multi-Objective Alignment of Language Models ☆16 · Updated last month
- [SIGIR'24] Generative Retrieval as Multi-Vector Dense Retrieval ☆25 · Updated 3 weeks ago
- [ICLR'24 spotlight] Tool-Augmented Reward Modeling ☆36 · Updated 8 months ago
- [NeurIPS'24] Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models ☆33 · Updated 3 months ago
- [ACL 2023 Findings] What In-Context Learning “Learns” In-Context: Disentangling Task Recognition and Task Learning ☆21 · Updated last year
- EMNLP 2024: Model Editing Harms General Abilities of Large Language Models: Regularization to the Rescue ☆32 · Updated 3 weeks ago
- ☆12 · Updated 2 months ago
- This is an official implementation of the Reward rAnked Fine-Tuning Algorithm (RAFT), also known as iterative best-of-n fine-tuning or re… ☆14 · Updated last month
- ☆15 · Updated 4 months ago
- Official code repository for PromptMix: A Class Boundary Augmentation Method for Large Language Model Distillation, EMNLP 2023 ☆11 · Updated 10 months ago