Curated, opinionated index of post-R1 LLM × Reinforcement Learning. Many deep-dive blog posts cross-linked to many papers — GRPO, DAPO, DPO, PPO, RLHF, GSPO, CISPO, VAPO, Reward Modeling, MoE RL stability, Verifier-Free RL, Training-Free RL, Agentic RL, DeepSeek-R1 reproduction.
☆68Apr 25, 2026Updated last month
Alternatives and similar repositories for rl-llm-nlp
Users who are interested in rl-llm-nlp are comparing it to the libraries listed below.
- LLM-MapBook: AI-Powered Maps for Storytelling. Extracts geo-coordinates from books, visualizes on interactive maps, offering immersive st…☆10Aug 27, 2024Updated last year
- (Accepted by EMNLP 2022, main, long) Knowledge Prompting in Pre-trained Language Model for Natural Language Understanding☆15Oct 29, 2022Updated 3 years ago
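- Account health-check tool for creators on Xiaohongshu, Douyin, Kuaishou, WeChat Channels, and Bilibili: scans the same niche for benchmark accounts, breaks down why viral posts went viral, diagnoses why content gets no views, and provides paste-ready imitation drafts. Claude Code skill.☆81Apr 24, 2026Updated last month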
- Official Implementation of Avoiding spurious correlations via logit correction☆17May 6, 2023Updated 3 years ago
- (ACL 2025 Main) Distilling RAG for SLMs from LLMs to Transfer Knowledge and Mitigate Hallucination via Evidence and Graph-based Distillat…☆35Aug 23, 2025Updated 9 months ago
- Documentation at☆14Mar 27, 2025Updated last year
- Classify image and text with ResNet and BERT models using Pytorch☆13Jul 7, 2020Updated 5 years ago
- ☆17Jun 10, 2025Updated 11 months ago
- Text generation using language models with multiple exit heads☆16Sep 18, 2025Updated 8 months ago
- Trains Sparse Autoencoders based on outputs from language models☆11Oct 7, 2024Updated last year
- ☆16Jul 12, 2024Updated last year
- 2024 Guangxi Digital Open Innovation Application Competition: multimodal news rumor classification☆20Jan 18, 2025Updated last year
- ☆19Jun 21, 2024Updated last year
- [AAAI 2025] Neural-Symbolic Collaborative Distillation: Advancing Small Language Models for Complex Reasoning Tasks☆12Jun 19, 2025Updated 11 months ago
- Source codes for the paper "Personalized Dynamic Music Emotion Recognition with Dual-Scale Attention-Based Meta-Learning" (PDMER) which p…☆13Mar 24, 2025Updated last year
- SG-Bench: Evaluating LLM Safety Generalization Across Diverse Tasks and Prompt Types☆25Nov 29, 2024Updated last year
- Text Adventure Learning Environment Suite - Benchmark to evaluate language models on interactive text environments.☆30May 9, 2026Updated 3 weeks ago
- ☆18Nov 22, 2025Updated 6 months ago
- ☆11Jul 13, 2022Updated 3 years ago
- [EMNLP 2025] Reasoning-to-Defend: Safety-Aware Reasoning Can Defend Large Language Models from Jailbreaking☆12Aug 22, 2025Updated 9 months ago
- [EMNLP 2025] HydraRAG: Structured Cross-Source Enhanced Large Language Model Reasoning☆56Nov 12, 2025Updated 6 months ago
- Notes for CS294/194-196: Large Language Model Agents (Fall 2024, UC Berkeley), summarizing 12 lectures on LLM fundamentals, reasoning, pl…☆17Jan 7, 2025Updated last year
- Koishi's Day 2025 Paper (NeurIPS 2025): "Codifying Character Logic in Role-Playing"☆24Jan 15, 2026Updated 4 months ago
- Learning how others make beautiful notebooks. "Java Learning + Interview Guide": a guide covering the core knowledge that most Java programmers need to master.☆12Sep 24, 2021Updated 4 years ago
- ☆17Nov 3, 2024Updated last year
- OpenLLMDE: An open source data engineering framework for LLMs☆18Sep 9, 2023Updated 2 years ago
- Targeted Data Generation with Large Language Models☆19Jun 25, 2024Updated last year
- ☆20May 14, 2025Updated last year
- Official repository of the ACL 2024 paper "Rethinking Task-Oriented Dialogue Systems: From Complex Modularity to Zero-Shot Autonomous Age…☆20May 28, 2024Updated 2 years ago
- Website for release of TellMeWhy dataset for why question answering☆14Nov 11, 2022Updated 3 years ago
- ☆21Aug 18, 2024Updated last year
- An AI-powered content conversion tool that transforms text, web content, or HTML code into beautifully designed card images.☆33Jul 29, 2025Updated 10 months ago
- ☆10Oct 20, 2020Updated 5 years ago
- ☆11Apr 4, 2018Updated 8 years ago
- Introduction to Algorithms☆10Dec 20, 2021Updated 4 years ago
- When Reasoning Meets Its Laws☆37Jan 2, 2026Updated 5 months ago
- Official Implementation of the paper "Jointly Reinforcing Diversity and Quality in Language Model Generations"☆60May 8, 2026Updated last month
- Generates random utf-8 strings for fuzz testing character encoding problems☆11Aug 21, 2015Updated 10 years ago
- ☆63Jan 13, 2025Updated last year