RUCKBReasoning / CodeRM
The code of arXiv paper: "Dynamic Scaling of Unit Tests for Code Reward Modeling"
☆14Updated last month
Alternatives and similar repositories for CodeRM:
Users who are interested in CodeRM are comparing it to the libraries listed below
- ☆15Updated 6 months ago
- RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment☆14Updated 2 months ago
- ☆20Updated 7 months ago
- This is the official repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data"☆17Updated 11 months ago
- Code for paper: Long cOntext aliGnment via efficient preference Optimization☆14Updated this week
- Codebase for Instruction Following without Instruction Tuning☆33Updated 4 months ago
- AbstainQA, ACL 2024☆25Updated 4 months ago
- Code for "Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective"☆32Updated 9 months ago
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint"☆34Updated last year
- [NAACL 2024] A Synthetic, Scalable and Systematic Evaluation Suite for Large Language Models☆32Updated 8 months ago
- ☆13Updated last year
- [ICLR 2025] Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization☆11Updated 3 weeks ago
- [NAACL 2025] Source code for MMEvalPro, a more trustworthy and efficient benchmark for evaluating LMMs☆23Updated 4 months ago
- SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Models https://arxiv.org/pdf/2411.02433☆21Updated 2 months ago
- PyTorch implementation of StableMask (ICML'24)☆12Updated 7 months ago
- [ICLR 24 Oral] RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style☆17Updated last week
- Public code repo for paper "Aligning LLMs with Individual Preferences via Interaction"☆18Updated 4 months ago
- ☆16Updated 3 months ago
- The official GitHub page for paper "NegativePrompt: Leveraging Psychology for Large Language Models Enhancement via Negative Emotional St…☆20Updated 9 months ago
- ☆16Updated 3 months ago
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models…☆31Updated last year
- [ICLR 2025] InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales☆72Updated 2 weeks ago
- ACL24☆9Updated 8 months ago
- Official Repository of Are Your LLMs Capable of Stable Reasoning?☆18Updated this week
- Official implementation of the paper "MMInA: Benchmarking Multihop Multimodal Internet Agents"☆41Updated last week
- ☆34Updated 3 months ago
- Code for paper: "LASeR: Learning to Adaptively Select Reward Models with Multi-Arm Bandits"☆13Updated 4 months ago
- Official implementation of paper "Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment" (https://arxiv.or…☆21Updated this week