[R]einforcement [L]earning from [M]odel-rewarded [T]hinking - code for the paper "Language Models That Think, Chat Better"
☆128Oct 27, 2025Updated 6 months ago
Alternatives and similar repositories for RLMT
Users that are interested in RLMT are comparing it to the libraries listed below.
- A book about Ph.D. student and research career planning☆29Oct 21, 2025Updated 6 months ago
- ☆35Oct 22, 2025Updated 6 months ago
- ☆22Oct 22, 2024Updated last year
- code for paper "Accessing higher dimensions for unsupervised word translation"☆22Jun 26, 2023Updated 2 years ago
- Second-place solution for the dancetrack competition☆13Jan 29, 2023Updated 3 years ago
- [NeurIPS'24] Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models☆68Dec 10, 2024Updated last year
- ☆29Oct 2, 2025Updated 6 months ago
- Code for "Preference Tuning For Toxicity Mitigation Generalizes Across Languages." Paper accepted at Findings of EMNLP 2024☆18Mar 25, 2025Updated last year
- Automatic annotation with yolov8, combined with a metric learning ReID algorithm, for cross-camera face tracking☆10May 15, 2024Updated last year
- JudgeLRM: Large Reasoning Models as a Judge☆41Apr 7, 2026Updated 3 weeks ago
- ☆31Feb 10, 2025Updated last year
- ☆19Nov 12, 2024Updated last year
- [ICLR 2025] Benchmarking Agentic Workflow Generation☆148Feb 19, 2025Updated last year
- A simple implementation of ReasonGenRM.☆19Apr 21, 2025Updated last year
- ☆10Dec 16, 2023Updated 2 years ago
- This repo provides multi-task keypoint detection code based on yolov8. The landmarks or keypoints with different classes and numbers can be …☆12Feb 28, 2023Updated 3 years ago
- The repository contains code for Adaptive Data Optimization☆36Dec 9, 2024Updated last year
- New-word discovery for the novel 斗破苍穹 (Battle Through the Heavens)☆13May 12, 2022Updated 3 years ago
- ☆359Jul 29, 2025Updated 9 months ago
- Simple and scalable tools for data-driven pretraining data selection.☆29Jun 9, 2025Updated 10 months ago
- Resources for the Enigmata Project.☆81Aug 13, 2025Updated 8 months ago
- 🏆 The 1st Place Solution for AICity2022 Challenge Track2: Natural Language-Based Vehicle Retrieval.☆12Jul 25, 2022Updated 3 years ago