[R]einforcement [L]earning from [M]odel-rewarded [T]hinking - code for the paper "Language Models That Think, Chat Better"
☆127Oct 27, 2025Updated 5 months ago
Alternatives and similar repositories for RLMT
Users that are interested in RLMT are comparing it to the libraries listed below.
- A book about Ph.D. student and research career planning☆29Oct 21, 2025Updated 5 months ago
- ☆21Mar 26, 2025Updated last year
- ☆33Oct 22, 2025Updated 5 months ago
- ☆22Oct 22, 2024Updated last year
- code for paper "Accessing higher dimensions for unsupervised word translation"☆22Jun 26, 2023Updated 2 years ago
- Your efficient and accurate answer verification system for RL training.☆41Jun 23, 2025Updated 9 months ago
- [NeurIPS'24] Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models☆67Dec 10, 2024Updated last year
- Official repository for "BLEUBERI: BLEU is a surprisingly effective reward for instruction following"☆32Jun 5, 2025Updated 10 months ago
- Cross-camera face tracking using yolov8 for automatic annotation and a metric learning ReID algorithm☆10May 15, 2024Updated last year
- JudgeLRM: Large Reasoning Models as a Judge☆41Mar 30, 2026Updated last week
- ☆31Feb 10, 2025Updated last year
- ☆19Nov 12, 2024Updated last year
- [ICLR 2025] Benchmarking Agentic Workflow Generation☆147Feb 19, 2025Updated last year
- A simple implementation of ReasonGenRM.☆19Apr 21, 2025Updated 11 months ago
- The repository contains code for Adaptive Data Optimization☆34Dec 9, 2024Updated last year
- New word discovery for the novel 斗破苍穹 (Battle Through the Heavens)☆13May 12, 2022Updated 3 years ago
- QRHead: Query-Focused Retrieval Heads Improve Long-Context Reasoning and Re-ranking☆38Jan 20, 2026Updated 2 months ago
- ☆358Jul 29, 2025Updated 8 months ago
- Resources for the Enigmata Project.☆81Aug 13, 2025Updated 7 months ago
- 🏆 The 1st Place Solution for AICity2022 Challenge Track2: Natural Language-Based Vehicle Retrieval.☆12Jul 25, 2022Updated 3 years ago
- General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25]☆224Nov 27, 2025Updated 4 months ago
- [ICLR 2025 Oral] Knowledge Entropy Decay during Language Model Pretraining Hinders New Knowledge Acquisition☆18Nov 25, 2024Updated last year
- ☆13Aug 15, 2025Updated 7 months ago
- Automatic course evaluation script for Harbin Institute of Technology, Weihai☆12Feb 4, 2024Updated 2 years ago
- Bayesian scaling laws for in-context learning.☆15Mar 12, 2025Updated last year
- The code used to train and run inference with MMDocIR☆33May 29, 2025Updated 10 months ago
- Evergreen, contamination-free, real-world, domain-specific AI evaluation framework☆133Jan 11, 2026Updated 2 months ago
- A collection of research papers on low-precision training methods☆65May 10, 2025Updated 10 months ago
- AdaRFT: Efficient Reinforcement Finetuning via Adaptive Curriculum Learning☆55Jun 13, 2025Updated 9 months ago
- ☆14Oct 19, 2025Updated 5 months ago
- ☆11May 29, 2024Updated last year
- yolov8 inference on hisi3536a☆11Dec 15, 2023Updated 2 years ago
- [COLM'25] A Controlled Study on Long Context Extension and Generalization in LLMs☆64Mar 9, 2026Updated last month
- Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence☆60Nov 11, 2025Updated 4 months ago
- Code for paper "ProgGen: Generating Named Entity Recognition Datasets Step-by-step with Self-Reflexive Large Language Models"☆17Mar 29, 2024Updated 2 years ago
- Official implementation of CEED-VLA: Consistency Vision-Language-Action Model with Early-Exit Decoding.☆49Sep 15, 2025Updated 6 months ago
- ☆25Oct 9, 2025Updated 6 months ago
- A lightweight script for processing HTML page to markdown format with support for code blocks☆82Apr 14, 2024Updated last year
- Next-Generation AI-Assisted Kernel Engineering for Multi-Chip Systems☆38Updated this week