ShuheWang1998/Reinforcement-Learning-Enhanced-LLMs-A-Survey

☆217

Alternatives and similar repositories for Reinforcement-Learning-Enhanced-LLMs-A-Survey

Users who are interested in Reinforcement-Learning-Enhanced-LLMs-A-Survey are comparing it to the repositories listed below.

SkyworkAI / Skywork-R1V
Skywork-R1V is an advanced multimodal AI model series developed by Skywork AI, specializing in vision-language reasoning.
☆3,159 · Dec 15, 2025 · Updated 7 months ago
ShuheSH / A-Survey-of-the-Reasoning-Abilities-of-LLMs
☆28 · Mar 4, 2025 · Updated last year
deepreinforce-ai / IterX-tutorials
Tutorials for IterX
☆39 · Feb 26, 2026 · Updated 5 months ago
deepreinforce-ai / CUDA-L1
CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning
☆315 · Nov 3, 2025 · Updated 8 months ago
Atrewin / PGen
Implementation of our paper "Scaling Back-Translation with Domain Text Generation for Sign Language Gloss Translation". Accepted in EACL …
☆11 · May 22, 2023 · Updated 3 years ago
ShuheSH / FaceID-6M
☆54 · Apr 11, 2025 · Updated last year
yongchanghao / multi-task-nat
☆11 · Jul 17, 2021 · Updated 5 years ago
Atrewin / SignXmDA
This is the official code repository for the paper 'Cross-modality Data Augmentation for End-to-End Sign Language Translation'. Accepted…
☆16 · Oct 18, 2023 · Updated 2 years ago
jiah-li / magic
The repo for the paper "Exploiting the Index Gradients for Optimization-Based Jailbreaking on Large Language Models".
☆15 · Dec 16, 2024 · Updated last year
mega002 / qdmr-based-question-generation
The official code of TACL 2022, "Break, Perturb, Build: Automatic Perturbation of Reasoning Paths Through Question Decomposition".
☆12 · Oct 18, 2021 · Updated 4 years ago
xiaoya-li / Instruction-Tuning-Survey
Project for the paper entitled `Instruction Tuning for Large Language Models: A Survey`
☆232 · Aug 10, 2025 · Updated 11 months ago
kenchan0226 / FineGrainedFact
Official implementation of the ACL Findings 2023 paper: Interpretable Automatic Fine-grained Inconsistency Detection in Text Summarizatio…
☆15 · Jan 25, 2024 · Updated 2 years ago
VMnK-Run / MARVEL
[ASE2024] Mutual Learning-Based Framework for Enhancing Robustness of Code Models via Adversarial Training
☆11 · Sep 13, 2024 · Updated last year
ruili33 / SEC
Source code for the paper "Are Human-generated Demonstrations Necessary for In-context Learning"
☆12 · Jan 21, 2024 · Updated 2 years ago
nii-nlp / med-eval
Evaluation pipeline for medical tasks.
☆12 · Apr 8, 2026 · Updated 3 months ago
Feng-Jay / GiantRepair
Artifact for TOSEM Submission: GiantRepair
☆12 · Jun 26, 2024 · Updated 2 years ago
W924 / hit_Database
Harbin Institute of Technology (HIT) database course, spring 2019
☆12 · Nov 21, 2019 · Updated 6 years ago
intervention-training / int
☆16 · Feb 4, 2026 · Updated 5 months ago
DeepExperience / agent2world
🪐 Agent2World: Learning to Generate Symbolic World Models via Adaptive Multi-Agent Feedback
☆23 · Jan 29, 2026 · Updated 6 months ago
pkuzqh / ICSE23Repair
An implementation of Tare.
☆12 · Feb 23, 2024 · Updated 2 years ago
CUHK-ARISE / LLMPersonality
Code and data for the paper: On the Reliability of Psychological Scales on Large Language Models
☆31 · Dec 15, 2025 · Updated 7 months ago
zwhe99 / FeedbackMT
Code of "Improving Machine Translation with Human Feedback: An Exploration of Quality Estimation as a Reward Model"
☆22 · Jun 28, 2024 · Updated 2 years ago
wxjiao / InstructMT
A collection of instruction data and scripts for machine translation.
☆20 · Sep 23, 2023 · Updated 2 years ago
wxjiao / Data-Rejuvenation
Implementation of our paper "Data Rejuvenation: Exploiting Inactive Training Examples for Neural Machine Translation" in EMNLP-2020.
☆23 · Aug 20, 2021 · Updated 4 years ago
yuanshi2025 / 2025YuanShi
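Negative online public opinion about the 2025 candidates for academician of the Chinese Academy of Sciences and the Chinese Academy of Engineering
☆41 · Sep 6, 2025 · Updated 10 months ago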
dxhou / CoAct
☆32 · Jul 8, 2024 · Updated 2 years ago
Mihir3009 / LogicBench
LogicBench is a natural language question-answering dataset consisting of 25 different reasoning patterns spanning over propositional, fi…
☆40 · May 2, 2024 · Updated 2 years ago
yutuer21 / quantumzero
☆15 · Feb 24, 2022 · Updated 4 years ago
yueb17 / PEMN
☆20 · Nov 27, 2022 · Updated 3 years ago
XinyuanLu00 / SciTab
The project page for "SCITAB: A Challenging Benchmark for Compositional Reasoning and Claim Verification on Scientific Tables"
☆23 · Dec 21, 2023 · Updated 2 years ago
DeepExperience / REAL
Rewards as Labels: Revisiting RLVR from a Classification Perspective
☆24 · Jun 26, 2026 · Updated last month
1191000814 / SA-Materials
Course materials for the Harbin Institute of Technology (HIT) Software Architecture and Middleware course, spring 2022
☆19 · Dec 18, 2022 · Updated 3 years ago
zzpustc / CC-SAM
This is the implementation of our CVPR'23 paper "Class-Conditional Sharpness-Aware Minimization for Deep Long-Tailed Recognition".
☆20 · Dec 16, 2023 · Updated 2 years ago
XMUDeepLIT / TTCS
The code implementation for TTCS: Test-Time Curriculum Synthesis for Self-Evolving.
☆51 · Apr 22, 2026 · Updated 3 months ago
zhaoxlpku / PromptCoT
☆17 · Apr 10, 2025 · Updated last year
kenchan0226 / dual_view_review_sum
Code for the SIGIR 2020 paper "A Unified Dual-view Model for Review Summarization and Sentiment Classification with Inconsistency Loss"
☆21 · Feb 3, 2021 · Updated 5 years ago
sitaocheng / DERL
The code repo for the paper "Differentiable Evolutionary Reinforcement Learning"
☆18 · Jan 6, 2026 · Updated 6 months ago
skai-research / ScholarEval
Official code and data for the paper "ScholarEval: Research Idea Evaluation Grounded in Literature."
☆20 · Oct 28, 2025 · Updated 9 months ago
NUST-Machine-Intelligence-Laboratory / NPN
☆12 · Dec 13, 2023 · Updated 2 years ago