junkangwu / alpha-DPO
☆12 Updated last month
Related projects
Alternatives and complementary repositories for alpha-DPO
- [NeurIPS 2024] Official code of $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$ ☆31 Updated 3 weeks ago
- Code for "Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective" ☆30 Updated 6 months ago
- This is the official implementation of ScaleBiO: Scalable Bilevel Optimization for LLM Data Reweighting ☆14 Updated 3 months ago
- Long Context Extension and Generalization in LLMs ☆39 Updated 2 months ago
- [ACL 2024] Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning ☆30 Updated 3 months ago
- ☆15 Updated 3 months ago
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint" ☆33 Updated 10 months ago
- ☆13 Updated 9 months ago
- Official code of the paper Large Language Models Are Implicitly Topic Models: Explaining and Finding Good Demonstrations for In-Context Le… ☆68 Updated 8 months ago
- Official implementation for "MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?" ☆38 Updated this week
- Official implementation of Privacy Implications of Retrieval-Based Language Models (EMNLP 2023). https://arxiv.org/abs/2305.14888 ☆36 Updated 5 months ago
- This is the official repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data" ☆17 Updated 8 months ago
- Code for paper - On Diversified Preferences of Large Language Model Alignment ☆14 Updated 3 months ago
- Domain-specific preference (DSP) data and customized RM fine-tuning. ☆24 Updated 8 months ago
- ☆39 Updated last month
- A Closer Look into Mixture-of-Experts in Large Language Models ☆40 Updated 3 months ago
- ☆31 Updated 3 weeks ago
- Code for the EACL 2024 main-conference paper "Generative Dense Retrieval: Memory Can Be a Burden" ☆19 Updated 10 months ago
- ☆26 Updated last year
- Directional Preference Alignment ☆50 Updated last month
- [ATTRIB @ NeurIPS 2024] When Attention Sink Emerges in Language Models: An Empirical View ☆29 Updated last month
- ☆12 Updated 2 months ago
- Is In-Context Learning Sufficient for Instruction Following in LLMs? ☆25 Updated 5 months ago
- ☆19 Updated 2 weeks ago
- The repository for our paper: Neighboring Perturbations of Knowledge Editing on Large Language Models ☆15 Updated 6 months ago
- Code for "Neural Retrievers are Biased Towards LLM-Generated Content" ☆13 Updated last month
- The official GitHub page for paper "NegativePrompt: Leveraging Psychology for Large Language Models Enhancement via Negative Emotional St… ☆19 Updated 6 months ago
- The code of paper "Toward Optimal LLM Alignments Using Two-Player Games". ☆14 Updated 5 months ago
- ☆17 Updated 4 months ago