Curated, opinionated index of post-R1 LLM × Reinforcement Learning. Many deep-dive blog posts cross-linked to many papers — GRPO, DAPO, DPO, PPO, RLHF, GSPO, CISPO, VAPO, Reward Modeling, MoE RL stability, Verifier-Free RL, Training-Free RL, Agentic RL, DeepSeek-R1 reproduction.
☆66 Apr 25, 2026 Updated 2 weeks ago
Alternatives and similar repositories for rl-llm-nlp
Users that are interested in rl-llm-nlp are comparing it to the libraries listed below.
- ☆13 May 12, 2025 Updated 11 months ago
- LLM-MapBook: AI-Powered Maps for Storytelling. Extracts geo-coordinates from books, visualizes on interactive maps, offering immersive st… ☆12 Aug 27, 2024 Updated last year
- Repo for the paper titled “CodeGen4Libs: A Two-Stage Approach for Library-Oriented Code Generation” ☆14 Oct 4, 2023 Updated 2 years ago
- (ACL 2025 Main) Distilling RAG for SLMs from LLMs to Transfer Knowledge and Mitigate Hallucination via Evidence and Graph-based Distillat… ☆34 Aug 23, 2025 Updated 8 months ago
- Documentation at ☆14 Mar 27, 2025 Updated last year
- ☆17 Jun 10, 2025 Updated 10 months ago
- Text generation using language models with multiple exit heads ☆16 Sep 18, 2025 Updated 7 months ago
- Trains Sparse Autoencoders based on outputs from language models ☆11 Oct 7, 2024 Updated last year
- 2024 Guangxi Digital Open Innovation Application Competition, multimodal news rumor classification ☆19 Jan 18, 2025 Updated last year
- A Code Efficiency Benchmark for Code Generation ☆14 May 26, 2025 Updated 11 months ago
- ☆20 Jun 21, 2024 Updated last year
- [AAAI 2025] Neural-Symbolic Collaborative Distillation: Advancing Small Language Models for Complex Reasoning Tasks ☆12 Jun 19, 2025 Updated 10 months ago
- Source codes for the paper "Personalized Dynamic Music Emotion Recognition with Dual-Scale Attention-Based Meta-Learning" (PDMER) which p… ☆13 Mar 24, 2025 Updated last year
- Code and Data for ACL 2025 Paper "Aristotle: Mastering Logical Reasoning with A Logic-Complete Decompose-Search-Resolve Framework". ☆25 Oct 3, 2025 Updated 7 months ago
- SG-Bench: Evaluating LLM Safety Generalization Across Diverse Tasks and Prompt Types ☆25 Nov 29, 2024 Updated last year
- Text Adventure Learning Environment Suite - Benchmark to evaluate language models on interactive text environments. ☆27 May 1, 2026 Updated last week
- ☆18 Nov 22, 2025 Updated 5 months ago
- GPTCloneBench is a clone detection benchmark based on SemanticCloneBench and GPT. ☆16 Feb 5, 2025 Updated last year
- ☆11 Jul 13, 2022 Updated 3 years ago
- [EMNLP 2025] Reasoning-to-Defend: Safety-Aware Reasoning Can Defend Large Language Models from Jailbreaking ☆12 Aug 22, 2025 Updated 8 months ago
- Learning how others make beautiful notebooks. "Java Learning + Interview Guide": a guide covering the core knowledge that most Java programmers need to master. ☆11 Sep 24, 2021 Updated 4 years ago
- A curated collection of research and techniques for protecting intellectual property of large language models, including watermarking, fi… ☆47 Feb 15, 2026 Updated 2 months ago
- OpenLLMDE: An open source data engineering framework for LLMs ☆18 Sep 9, 2023 Updated 2 years ago
- Targeted Data Generation with Large Language Models ☆19 Jun 25, 2024 Updated last year
- ☆20 May 14, 2025 Updated 11 months ago
- Code for Multi-Aspect Cross-modal Quantization for Generative Recommendation. (AAAI 2026 Oral) ☆38 Dec 9, 2025 Updated 5 months ago
- Official repository of the ACL 2024 paper "Rethinking Task-Oriented Dialogue Systems: From Complex Modularity to Zero-Shot Autonomous Age… ☆20 May 28, 2024 Updated last year
- Website for release of TellMeWhy dataset for why question answering ☆14 Nov 11, 2022 Updated 3 years ago
- ☆21 Aug 18, 2024 Updated last year
- QiDiHui: RAG, appbuilder, ErnieBot, multi-model, One Hundred Thousand Whys ☆21 Aug 7, 2024 Updated last year
- An AI-powered content conversion tool that transforms text, web content, or HTML code into beautifully designed card images. An AI-based content conversion … ☆34 Jul 29, 2025 Updated 9 months ago
- JLU drcom client written in golang. ☆12 Sep 4, 2019 Updated 6 years ago
- Official Implementation of the paper "Jointly Reinforcing Diversity and Quality in Language Model Generations" ☆58 Apr 13, 2026 Updated 3 weeks ago
- Used for onset picking ☆11 Oct 14, 2019 Updated 6 years ago
- ☆11 Apr 4, 2018 Updated 8 years ago
- ☆62 Jan 13, 2025 Updated last year
- When Reasoning Meets Its Laws ☆37 Jan 2, 2026 Updated 4 months ago
- ☆135 Mar 4, 2025 Updated last year
- Chinese-English bilingual subtitles for Andrew Ng's LangChain course ☆16 Jun 3, 2023 Updated 2 years ago