waterhorse1/Natural-language-RL

Natural Language Reinforcement Learning

☆101

Alternatives and similar repositories for Natural-language-RL

Users that are interested in Natural-language-RL are comparing it to the libraries listed below.

Aloriosa / srmt
The original Shared Recurrent Memory Transformer implementation
☆36 · Jul 11, 2025 · Updated last year
yale-nlp / refdpo
☆16 · Jul 23, 2024 · Updated 2 years ago
uq-project / UQ
UQ: Assessing Language Models on Unsolved Questions
☆30 · Aug 26, 2025 · Updated 11 months ago
sail-sg / feedback-conditional-policy
Code for "Language Models Can Learn from Verbal Feedback Without Scalar Rewards"
☆65 · Jan 5, 2026 · Updated 6 months ago
facebookresearch / multimodal_rewardbench
Multimodal RewardBench
☆68 · Feb 21, 2025 · Updated last year
ArmelRandy / tree-of-problems
[EMNLP 2024] Tree of Problems: Improving structured problem solving with compositionality
☆20 · Mar 4, 2025 · Updated last year
SalesforceAIResearch / LaTRO
☆127 · Jun 2, 2026 · Updated last month
RUCAIBox / RLMEC
The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint"
☆39 · Jan 12, 2024 · Updated 2 years ago
aaronserianni / attention-iou
[CVPR'25] Attention IoU: Examining Biases in CelebA using Attention Maps
☆13 · Mar 26, 2025 · Updated last year
LAMDA-NeSy / Self-Backtracking
☆52 · Feb 12, 2025 · Updated last year
aszala / EnvGen
Official Code Repository for EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents (COLM 2024)
☆40 · Jul 13, 2024 · Updated 2 years ago
Timothyxxx / NeuralSymbolicPapers
☆14 · Aug 18, 2022 · Updated 3 years ago
psunlpgroup / ReaLMistake
This repository includes a benchmark and code for the paper "Evaluating LLMs at Detecting Errors in LLM Responses".
☆32 · Aug 18, 2024 · Updated last year
sai-prasanna / dreaming_of_many_worlds
☆25 · Sep 23, 2024 · Updated last year
kokolerk / TON
[NeurIPS 2025] Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models
☆58 · Sep 29, 2025 · Updated 10 months ago
YifeiZhou02 / ArCHer
Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL"
☆208 · Apr 17, 2025 · Updated last year
kavosh8 / Lip
☆13 · Jul 9, 2018 · Updated 8 years ago
tajwarfahim / srt
Official implementation for the paper "Can Large Reasoning Models Self-Train?"
☆76 · Jul 9, 2026 · Updated 2 weeks ago
Simplified-Reasoning / LUFFY
Official Repository of "Learning to Reason under Off-Policy Guidance"
☆461 · Mar 20, 2026 · Updated 4 months ago
sunblaze-ucb / reasoning_ladder
☆35 · May 16, 2025 · Updated last year
LLaMafia / SFT_function_learning
Explore what LLMs are really learning over SFT
☆28 · Mar 30, 2024 · Updated 2 years ago
google-deepmind / lm_act
LMAct: A Benchmark for In-Context Imitation Learning with Long Multimodal Demonstrations
☆30 · May 21, 2025 · Updated last year
huiwy / reflection-on-trees
☆14 · May 9, 2024 · Updated 2 years ago
HypherX / Evolution-Analysis
☆25 · Dec 13, 2024 · Updated last year
marinero4972 / CyberV
☆20 · Jun 10, 2025 · Updated last year
NineAbyss / S2R
This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning"
☆77 · Apr 22, 2025 · Updated last year
Video-as-Agent / VideoAgent
Official implementation of "Self-Improving Video Generation"
☆79 · Apr 25, 2025 · Updated last year
dangxingyu / rnn-icrag
Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval"
☆27 · Apr 17, 2024 · Updated 2 years ago
MLGroupJLU / RWKV-Survey
The official GitHub page for the survey paper "A Survey of RWKV".
☆33 · Jan 7, 2025 · Updated last year
Improbable-AI / orso
☆18 · Feb 22, 2025 · Updated last year
multimodal-art-projection / CriticLean
☆50 · Aug 5, 2025 · Updated 11 months ago
satori-reasoning / Satori
[ICML 2025] Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search
☆115 · Jun 3, 2025 · Updated last year
portal-cornell / muCode
☆33 · Oct 2, 2025 · Updated 9 months ago
hkust-nlp / RL-Verifier-Robustness
From Accuracy to Robustness: A Study of Rule- and Model-based Verifiers in Mathematical Reasoning.
☆24 · Oct 7, 2025 · Updated 9 months ago
GAIR-NLP / MAYE
Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme
☆149 · Apr 9, 2025 · Updated last year
Cornell-RL / drpo
Dataset Reset Policy Optimization
☆30 · Apr 12, 2024 · Updated 2 years ago
ruixin31 / Spurious_Rewards
☆361 · Jul 29, 2025 · Updated 11 months ago
nirgreshler / bayesian-online-planning
The code for the paper "A Bayesian Approach to Online Planning" published in ICML 2024.
☆13 · Jun 17, 2024 · Updated 2 years ago
GXimingLu / IPA
Codebase for Inference-Time Policy Adapters
☆25 · Nov 3, 2023 · Updated 2 years ago