quanshr / DMoERM
[ACL2024 Findings] DMoERM: Recipes of Mixture-of-Experts for Effective Reward Modeling
☆18 Updated last year
Alternatives and similar repositories for DMoERM
Users who are interested in DMoERM are comparing it to the libraries listed below.
- [AAAI 2025] Augmenting Math Word Problems via Iterative Question Composing (https://arxiv.org/abs/2401.09003) ☆20 Updated 9 months ago
- Codebase for Instruction Following without Instruction Tuning ☆35 Updated last year
- This is the official repo of "QuickLLaMA: Query-aware Inference Acceleration for Large Language Models" ☆55 Updated last year
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models… ☆37 Updated last year
- ☆18 Updated 2 months ago
- Official implementation of Bootstrapping Language Models via DPO Implicit Rewards ☆44 Updated 5 months ago
- [ICML 2025] Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment (https://arxiv.org/abs/2410.02197) ☆29 Updated last month
- ☆22 Updated last year
- Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval" ☆27 Updated last year
- Aioli: A unified optimization framework for language model data mixing ☆27 Updated 8 months ago
- Unofficial Implementation of Chain-of-Thought Reasoning Without Prompting ☆33 Updated last year
- This is the official repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data" ☆17 Updated last year
- [NAACL 2025] A Closer Look into Mixture-of-Experts in Large Language Models