fzp0424 / MT-R1-Zero
Code for paper "MT-R1-Zero: Advancing LLM-based Machine Translation via R1-Zero-like Reinforcement Learning"
☆26Updated last week
Alternatives and similar repositories for MT-R1-Zero:
Users interested in MT-R1-Zero are comparing it to the repositories listed below
- Official implementation of the paper "From Complex to Simple: Enhancing Multi-Constraint Complex Instruction Following Ability of Large L…☆48Updated 10 months ago
- [ICLR'24 spotlight] Tool-Augmented Reward Modeling☆47Updated 4 months ago
- ☆46Updated 10 months ago
- Source code of "Reasons to Reject? Aligning Language Models with Judgments"☆58Updated last year
- The code and data for the paper JiuZhang3.0☆43Updated 11 months ago
- ☆98Updated 6 months ago
- Research without Re-search: Maximal Update Parametrization Yields Accurate Loss Prediction across Scales☆32Updated last year
- ☆36Updated 7 months ago
- ☆37Updated 2 weeks ago
- We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLMs.☆61Updated 6 months ago
- Feeling confused about super alignment? Here is a reading list☆42Updated last year
- ☆29Updated 6 months ago
- [ICLR 2025] 🧬 RegMix: Data Mixture as Regression for Language Model Pre-training (Spotlight)☆129Updated 2 months ago
- Implementations of the online merging optimizers proposed in "Online Merging Optimizers for Boosting Rewards and Mitigating Tax in Alignment"☆75Updated 10 months ago
- Official implementation of "Training on the Benchmark Is Not All You Need".☆31Updated 3 months ago
- [ICLR 2024] CLEX: Continuous Length Extrapolation for Large Language Models☆76Updated last year
- a-m-team's exploration in large language modeling☆49Updated 3 weeks ago
- ☆45Updated 7 months ago
- Code for the paper "RankingGPT: Empowering Large Language Models in Text Ranking with Progressive Enhancement"☆31Updated last year
- ☆81Updated last year
- Intuitive Fine-Tuning: Towards Simplifying Alignment into a Single Process☆27Updated 8 months ago
- [NeurIPS'24] Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models☆58Updated 4 months ago
- Towards Systematic Measurement for Long Text Quality☆34Updated 7 months ago
- A collection of instruction data and scripts for machine translation.☆20Updated last year
- MUFFIN: Curating Multi-Faceted Instructions for Improving Instruction-Following☆16Updated 5 months ago
- Official GitHub repo for AutoDetect, an automated weakness detection framework for LLMs.☆42Updated 10 months ago
- [ICML'2024] Can AI Assistants Know What They Don't Know?☆80Updated last year
- Llama-3-SynE: A Significantly Enhanced Version of Llama-3 with Advanced Scientific Reasoning and Chinese Language Capabilities | Continued pre-training to improve …☆32Updated 4 months ago
- ☆94Updated last year
- Automatic prompt optimization framework for multi-step agent tasks.☆29Updated 5 months ago