[R]einforcement [L]earning from [M]odel-rewarded [T]hinking - code for the paper "Language Models That Think, Chat Better"
☆129Oct 27, 2025Updated 8 months ago
Alternatives and similar repositories for RLMT
Users that are interested in RLMT are comparing it to the libraries listed below.
- A book about Ph.D. student and research career planning☆29Oct 21, 2025Updated 8 months ago
- ☆25Jun 1, 2026Updated 3 weeks ago
- ☆38May 29, 2026Updated last month
- ☆22Oct 22, 2024Updated last year
- code for paper "Accessing higher dimensions for unsupervised word translation"☆23Jun 26, 2023Updated 3 years ago
- Your efficient and accurate answer verification system for RL training.☆42Jun 23, 2025Updated last year
- ☆32Oct 2, 2025Updated 8 months ago
- Official repository for "BLEUBERI: BLEU is a surprisingly effective reward for instruction following"☆32Jun 5, 2025Updated last year
- Code for "Preference Tuning For Toxicity Mitigation Generalizes Across Languages." Paper accepted at Findings of EMNLP 2024☆18Mar 25, 2025Updated last year
- Automatic labeling with YOLOv8, using a metric-learning ReID algorithm to achieve cross-camera face tracking☆10May 15, 2024Updated 2 years ago
- ☆33Feb 10, 2025Updated last year
- ☆19Nov 12, 2024Updated last year
- A simple implementation of ReasonGenRM.☆19Apr 21, 2025Updated last year
- Official Code For EMNLP2025 Findings: DLPO: Towards a Robust, Efficient, and Generalizable Prompt Optimization Framework from a Deep-Le…☆10Dec 25, 2025Updated 6 months ago
- [ICLR 2025] Benchmarking Agentic Workflow Generation☆153Feb 19, 2025Updated last year
- ☆10Dec 16, 2023Updated 2 years ago
- The repository contains code for Adaptive Data Optimization☆36Dec 9, 2024Updated last year
- New word discovery for the novel Battle Through the Heavens (斗破苍穹)☆13May 12, 2022Updated 4 years ago
- QRHead: Query-Focused Retrieval Heads Improve Long-Context Reasoning and Re-ranking☆40Jan 20, 2026Updated 5 months ago
- ☆358Jul 29, 2025Updated 11 months ago
- Resources for the Enigmata Project.☆82Aug 13, 2025Updated 10 months ago
- 🏆 The 1st Place Solution for AICity2022 Challenge Track2: Natural Language-Based Vehicle Retrieval.☆12Jul 25, 2022Updated 3 years ago
- [ICLR 2025 Oral] Knowledge Entropy Decay during Language Model Pretraining Hinders New Knowledge Acquisition☆17Nov 25, 2024Updated last year
- General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25]☆228Nov 27, 2025Updated 7 months ago
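- Automatic course evaluation script for Harbin Institute of Technology, Weihai☆12Feb 4, 2024Updated 2 years ago
- Retrieval and fine-grained reranking using Qwen3's Embedding and Reranker models☆23Jun 22, 2025Updated last year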
- Bayesian scaling laws for in-context learning.☆16Mar 12, 2025Updated last year
- The code used to train and run inference with MMDocIR☆34May 29, 2025Updated last year
- Evergreen, contamination-free, real-world, domain-specific AI evaluation framework☆139Jan 11, 2026Updated 5 months ago
- A collection of research papers on low-precision training methods☆69May 10, 2025Updated last year
- AdaRFT: Efficient Reinforcement Finetuning via Adaptive Curriculum Learning☆56Jun 13, 2025Updated last year
- ☆11May 29, 2024Updated 2 years ago
- ☆13Feb 17, 2025Updated last year
- Code for paper "ProgGen: Generating Named Entity Recognition Datasets Step-by-step with Self-Reflexive Large Language Models"☆17Mar 29, 2024Updated 2 years ago
- Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence☆67Nov 11, 2025Updated 7 months ago
- A lightweight script for processing HTML page to markdown format with support for code blocks☆81Apr 14, 2024Updated 2 years ago
- ☆17Oct 16, 2023Updated 2 years ago
- [ICLR 2025] 🧬 RegMix: Data Mixture as Regression for Language Model Pre-training (Spotlight)☆193Feb 17, 2025Updated last year
- TPLink IPC Control☆20Jul 24, 2024Updated last year