[EMNLP'25] Code for paper "MT-R1-Zero: Advancing LLM-based Machine Translation via R1-Zero-like Reinforcement Learning"
☆66Apr 15, 2025Updated 10 months ago
Alternatives and similar repositories for MT-R1-Zero
Users that are interested in MT-R1-Zero are comparing it to the libraries listed below
Sorting:
- [ACL'25] Repo for paper "M-MAD: Multidimensional Multi-Agent Debate for Advanced Machine Translation Evaluation"☆21Feb 19, 2025Updated last year
- [EMNLP 2025 Industry] Datasets and Recipes for Video Temporal Grounding via Reinforcement Learning☆36Oct 22, 2025Updated 4 months ago
- ☆46Jun 11, 2025Updated 8 months ago
- [ICML'25] "Rethinking Addressing in Language Models via Contextualized Equivariant Positional Encoding" by Jiajun Zhu, Peihao Wang, Ruisi…☆14Jun 6, 2025Updated 8 months ago
- RePo: Language Models with Context Re-Positioning☆71Dec 24, 2025Updated 2 months ago
- ☆46Sep 27, 2025Updated 5 months ago
- CS194-196 Course Project☆14Feb 20, 2025Updated last year
- Deep Reasoning Translation (DRT) Project☆240Sep 1, 2025Updated 6 months ago
- MegaRAG: Multimodal Graph-based RAG☆36Sep 16, 2025Updated 5 months ago
- A Recipe for Building LLM Reasoners to Solve Complex Instructions☆29Oct 9, 2025Updated 4 months ago
- Code for reproducing our paper "Low Rank Adapting Models for Sparse Autoencoder Features"☆17Mar 31, 2025Updated 11 months ago
- ☆25Jun 18, 2025Updated 8 months ago
- DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails☆31Feb 26, 2025Updated last year
- [ACL 2025 Findings] Text2World: Benchmarking Large Language Models for Symbolic World Model Generation☆28Feb 25, 2025Updated last year
- [EMNLP 2024] Tree of Problems: Improving structured problem solving with compositionality☆19Mar 4, 2025Updated last year
- ☆25Jun 10, 2025Updated 8 months ago
- The code for "MoPE: Mixture of Prefix Experts for Zero-Shot Dialogue State Tracking"☆19Jan 25, 2025Updated last year
- ☆18Jul 13, 2025Updated 7 months ago
- Chinese-native image generation while compatible with SD eco-system, 1st-gen, AAAI2025☆13Jun 25, 2024Updated last year
- [ACL 2025 Findings] Official implementation of the paper "Unveiling the Key Factors for Distilling Chain-of-Thought Reasoning".☆20Feb 26, 2025Updated last year
- [CVPR2025] VideoICL: Confidence-based Iterative In-context Learning for Out-of-Distribution Video Understanding☆24Mar 24, 2025Updated 11 months ago
- implementation of dualformer☆24Mar 1, 2025Updated last year
- [ICML 2025] Official code of "AlphaDPO: Adaptive Reward Margin for Direct Preference Optimization"☆30Jan 10, 2026Updated last month
- ☆33Jul 15, 2025Updated 7 months ago
- ☆21Aug 30, 2025Updated 6 months ago
- ☆17Aug 1, 2025Updated 7 months ago
- Test-time Scaling for VAR models☆31Sep 19, 2025Updated 5 months ago
- Extensive Self-Contrast Enables Feedback-Free Language Model Alignment☆21Apr 2, 2024Updated last year
- This repository is the official implementation of the TRAC optimizer in Fast TRAC: A Parameter-Free Optimizer for Lifelong Reinforcement …☆33May 2, 2025Updated 10 months ago
- Finetuning Generative Large Language Models with Discrimination Instructions for Knowledge Graph Completion, ISWC 2024☆33May 17, 2025Updated 9 months ago
- Feedback-Driven Tool-Use Improvements in Large Language Models via Automated Build Environments☆48Jan 8, 2026Updated last month
- Official Implementation of Flash-Searcher: Fast and Effective Web Agents via DAG-Based Parallel Execution☆69Dec 8, 2025Updated 2 months ago
- M2-Reasoning: Empowering MLLMs with Unified General and Spatial Reasoning☆46Jul 17, 2025Updated 7 months ago
- Scaling In-context Learning from Few-shot to 1,024-shot on Tabular ML☆59Dec 12, 2025Updated 2 months ago
- ☆47Oct 2, 2025Updated 5 months ago
- The first spoken long-text dataset derived from live streams, designed to reflect the redundancy-rich and conversational nature of real-w…☆12Jun 28, 2025Updated 8 months ago
- MathFusion: Enhancing Mathematical Problem-solving of LLM through Instruction Fusion (ACL 2025)☆35Jul 16, 2025Updated 7 months ago
- ☆28May 24, 2025Updated 9 months ago
- [ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework☆71Jun 1, 2025Updated 9 months ago