microsoft/experiential_rl

The official codebase for "Experiential Reinforcement Learning" - https://arxiv.org/pdf/2602.13949v1

☆74

Alternatives and similar repositories for experiential_rl

Users that are interested in experiential_rl are comparing it to the libraries listed below.

bethgelab / delta-belief-rl
Official implementation of the ΔBelief-RL method.
☆31 · Feb 28, 2026 · Updated 4 months ago
SalesforceAIResearch / CoAct-1
CoAct-1: Computer-using Agents with Coding as Actions
☆27 · Jun 2, 2026 · Updated last month
limenlp / ExeVRM
Official implementation for the paper "Video-Based Reward Modeling for Computer-Use Agents"
☆16 · Mar 14, 2026 · Updated 4 months ago
zenghy96 / Reliable-Source-Approximation
Reliable Source Approximation: Source-Free Domain Adaptation for Vestibular Schwannoma MRI Segmentation
☆11 · Dec 28, 2024 · Updated last year
limenlp / SEA
Official implementation for the paper "Discovering Knowledge Deficiencies of Language Models on Massive Knowledge Base"
☆27 · Sep 2, 2025 · Updated 10 months ago
zhangxy-2019 / RetroAgent
RETROAGENT: From Solving to Evolving via Retrospective Dual Intrinsic Feedback
☆26 · Mar 30, 2026 · Updated 3 months ago
ritikamangla / QSalience
https://arxiv.org/abs/2404.10917
☆14 · Mar 18, 2025 · Updated last year
limenlp / verl
AdaRFT: Efficient Reinforcement Finetuning via Adaptive Curriculum Learning
☆56 · Jun 13, 2025 · Updated last year
facebookresearch / threadweaver
The implementation for ThreadWeaver: Adaptive Threading for Efficient Parallel Reasoning in Language Models
☆67 · Apr 8, 2026 · Updated 3 months ago
limenlp / safer-instruct
This is the official repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data"
☆17 · Feb 22, 2024 · Updated 2 years ago
HiLab-git / SicTTA
SicTTA: Single Image Continual Test-Time Adaptation for Medical Image Segmentation
☆18 · Dec 21, 2025 · Updated 7 months ago
lasgroup / SDPO
Reinforcement Learning via Self-Distillation (SDPO)
☆1,017 · Jul 1, 2026 · Updated 2 weeks ago
hao-ai-lab / research-agent
☆17 · Feb 25, 2026 · Updated 4 months ago
liyucheng09 / LatestEval
Latest Evaluation Toolkit (LatestEval). Assessing the language models with latest, uncontaminated materials.
☆29 · Feb 17, 2025 · Updated last year
idanshen / Self-Distillation
☆657 · Apr 7, 2026 · Updated 3 months ago
psunlpgroup / ReaLMistake
This repository includes a benchmark and code for the paper "Evaluating LLMs at Detecting Errors in LLM Responses".
☆32 · Aug 18, 2024 · Updated last year
microsoft / SimulatorArena
☆23 · May 12, 2026 · Updated 2 months ago
sunblaze-ucb / reasoning_ladder
☆35 · May 16, 2025 · Updated last year
DLR-SC / style-vectors-for-steering-llms
Code release for the paper "Style Vectors for Steering Generative Large Language Models", accepted to the Findings of the EACL 2024.
☆37 · Sep 26, 2024 · Updated last year
liziniu / cold_start_rl
Code for Blog Post: Can Better Cold-Start Strategies Improve RL Training for LLMs?
☆20 · Mar 9, 2025 · Updated last year
yf-he / EvoTest
EvoTest: Evolutionary Test-Time Learning for Self-Improving Agentic Systems (ICLR'26)
☆24 · Nov 3, 2025 · Updated 8 months ago
skydiscover-ai / skydiscover
AI-Driven Scientific and Algorithmic Discovery
☆582 · Jun 14, 2026 · Updated last month
DualityRL / multi-attempt
☆19 · Mar 10, 2025 · Updated last year
mickelliu / selfplay-redteaming
☆36 · Oct 21, 2025 · Updated 9 months ago
arcee-ai / pybubble
☆81 · Feb 18, 2026 · Updated 5 months ago
CogComp / MultiOpEd
MULTIOPED: A Corpus of Multi-Perspective News Editorials.
☆12 · Aug 25, 2021 · Updated 4 years ago
howard-yen / SLIM
☆27 · Jun 22, 2026 · Updated 3 weeks ago
synvo-ai / HippoCamp
A benchmark for evaluating contextual agents on realistic multimodal personal-computer environments with profiling and factual-retention …
☆29 · Apr 2, 2026 · Updated 3 months ago
kerner-lab / Sparse-GPT-Pretraining
A codebase for pretraining multi-billion-scale sparse GPTs.
☆24 · Feb 9, 2026 · Updated 5 months ago
Gen-Verse / OpenClaw-RL
OpenClaw-RL: Train any agent simply by talking
☆5,588 · May 23, 2026 · Updated last month
Yikai-Liao / efficient_bpe
An Efficient BPE Algorithm Faster than Hugging Face Tokenizer's Implementation
☆13 · Sep 9, 2024 · Updated last year
TIGER-AI-Lab / VL-Rethinker
The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" [NeurIPS25]
☆189 · Jun 5, 2025 · Updated last year
rllm-org / hive
☆214 · Apr 28, 2026 · Updated 2 months ago
Walter0807 / RepBelief
[ICML 2024] Language Models Represent Beliefs of Self and Others
☆37 · Sep 26, 2024 · Updated last year
violetxi / ExpRL
☆19 · Jun 16, 2026 · Updated last month
qizhangli / Gradient-based-Jailbreak-Attacks
Code for our NeurIPS 2024 paper "Improved Generation of Adversarial Examples Against Safety-aligned LLMs"
☆12 · Nov 7, 2024 · Updated last year
hongzhouyu / FineMed
The codebase and some introductions of FineMed.
☆31 · Sep 11, 2025 · Updated 10 months ago
sail-sg / Stable-RL
Rethinking the Trust Region in LLM Reinforcement Learning
☆62 · Mar 2, 2026 · Updated 4 months ago
declare-lab / safety-arithmetic
☆13 · Jan 14, 2025 · Updated last year