ASTRAL-Group/data-efficient-llm-rl

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/ASTRAL-Group/data-efficient-llm-rl)

ASTRAL-Group / data-efficient-llm-rl

☆45

Alternatives and similar repositories for data-efficient-llm-rl

Users that are interested in data-efficient-llm-rl are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

ASTRAL-Group / SVIP
View on GitHub
SVIP: Towards Verifiable Inference of Open-Source Large Language Models
☆15Jun 3, 2025Updated last year
ASTRAL-Group / MonitorBench
View on GitHub
[COLM 2026] Official implementation for "MonitorBench: A Comprehensive Benchmark for Chain-of-Thought Monitorability in Large Language Mo…
☆20Apr 23, 2026Updated 3 months ago
Infini-AI-Lab / GRESO
View on GitHub
☆82Jun 8, 2026Updated last month
shiweijiezero / R3L
View on GitHub
☆23Apr 5, 2026Updated 3 months ago
YennNing / MC-Search
View on GitHub
[ICLR 2026 Oral] Evaluating and Enhancing Multimodal Agentic Search with Structured Long Reasoning Chains
☆38Apr 11, 2026Updated 3 months ago
End-to-end encrypted cloud storage - Proton Drive • Ad
Special offer: 40% Off Yearly / 80% Off First Month. Protect your most important files, photos, and documents from prying eyes.
DynaMath / DynaMath
View on GitHub
A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models
☆30Nov 25, 2024Updated last year
feiyang-k / AutoScale
View on GitHub
Official Code Repository for [AutoScale📈: Scale-Aware Data Mixing for Pre-Training LLMs] Published as a conference paper at **COLM 2025*…
☆14Aug 8, 2025Updated 11 months ago
limenlp / verl
View on GitHub
AdaRFT: Efficient Reinforcement Finetuning via Adaptive Curriculum Learning
☆56Jun 13, 2025Updated last year
ThakiCloud / SKILLRET
View on GitHub
Skill retrieval benchmark dataset and evaluation code.
☆19May 8, 2026Updated 2 months ago
ypwang61 / One-Shot-RLVR
View on GitHub
[NeurIPS 2025] Reinforcement Learning for Reasoning in Large Language Models with One Training Example
☆444Mar 11, 2026Updated 4 months ago
divelab / E2H-Reasoning
View on GitHub
[ICLR' 26] Implementation of "Curriculum Reinforcement Learning from Easy to Hard Tasks Improves LLM Reasoning"
☆24May 28, 2026Updated last month
liushulinle / MarsRL
View on GitHub
MarsRL: Advancing Multi-Agent Reasoning System via Reinforcement Learning with Agentic Pipeline Parallelism
☆18Nov 18, 2025Updated 8 months ago
MasterVito / DAC-RL
View on GitHub
Official Repo for DAC-RL: Training LLMs for Divide-and-Conquer Reasoning Elevates Test-Time Scalability
☆16Feb 26, 2026Updated 5 months ago
zhangxy-2019 / RetroAgent
View on GitHub
RETROAGENT: From Solving to Evolving via Retrospective Dual Intrinsic Feedback
☆26Mar 30, 2026Updated 3 months ago
Serverless GPU API endpoints on Runpod - Get Bonus Credits • Ad
Skip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
shizhouxing / Fast-Certified-Robust-Training
View on GitHub
[NeurIPS 2021] Fast Certified Robust Training with Short Warmup
☆25Jun 7, 2025Updated last year
zijian678 / TDD
View on GitHub
☆14Apr 22, 2024Updated 2 years ago
VITA-Group / Linearity-Grafting
View on GitHub
[ICML 2022] "Linearity Grafting: Relaxed Neuron Pruning Helps Certifiable Robustness" by Tianlong Chen*, Huan Zhang*, Zhenyu Zhang, Shiyu…
☆16Jun 22, 2022Updated 4 years ago
jaehunjung1 / prismatic-synthesis
View on GitHub
☆28May 27, 2025Updated last year
svenpeter42 / LightGBM-CEGB
View on GitHub
Fork of Microsoft/LightGBM to include support for the CEGB (Cost Efficient Gradient Boosting) algorithm. Original repository at https://g…
☆13Jun 30, 2017Updated 9 years ago
ASTRAL-Group / ASTRA
View on GitHub
[CVPR 2025] Official implementation for "Steering Away from Harm: An Adaptive Approach to Defending Vision Language Model Against Jailbre…
☆62Jul 5, 2025Updated last year
liziniu / GEM
View on GitHub
Code for Paper (Preserving Diversity in Supervised Fine-tuning of Large Language Models)
☆58May 12, 2025Updated last year
ASTRAL-Group / AlphaOne
View on GitHub
[EMNLP 2025 Main] AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
☆89Jun 10, 2025Updated last year
locuslab / intermediate_robustness
View on GitHub
☆15Dec 7, 2021Updated 4 years ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
2003pro / TAGCOS
View on GitHub
This is the official implementation of TAGCOS: Task-agnostic Gradient Clustered Coreset Selection for Instruction Tuning Data
☆13Jul 21, 2024Updated 2 years ago
AI-secure / CoPur
View on GitHub
CoPur: Certifiably Robust Collaborative Inference via Feature Purification (NeurIPS 2022)
☆11Dec 7, 2022Updated 3 years ago
w-yibo / VTC-R1
View on GitHub
VTC-R1: Vision-Text Compression for Efficient Long-Context Reasoning.
☆26Updated this week
zhao-ht / ConvexCertify
View on GitHub
This is the code of our work CISS Certified Robustness Against Natural Language Attacks by Causal Intervention published on ICML 2022
☆11Dec 6, 2022Updated 3 years ago
chenyiqun / Agentic-RAG
View on GitHub
This is the code of a agentic rag method with dynamic workflow.
☆14Jan 22, 2026Updated 6 months ago
UNITES-Lab / flash-molecular-dynamics
View on GitHub
Fast and accurate coarse-grained neural network molecular dynamics
☆15Jul 13, 2026Updated last week
Centrattic / global-cot-analysis
View on GitHub
Global CoT Analysis: Initial attempts to uncover patterns across many chains of thought
☆20Feb 10, 2026Updated 5 months ago
VNN-COMP / vnncomp2023_benchmarks
View on GitHub
Benchmarks for the VNN Comp 2023
☆16Jun 7, 2024Updated 2 years ago
alchemistyzz / PeRL
View on GitHub
[NeurIPS'25] The official code of "PeRL: Permutation-Enhanced Reinforcement Learning for Interleaved Vision-Language Reasoning"
☆30Mar 30, 2026Updated 3 months ago
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
pUmpKin-Co / ComplementaryRL
View on GitHub
Co-evolving policy actors and experience extractors for efficient experience-driven agent RL
☆51May 12, 2026Updated 2 months ago
Zinoex / bound_propagation
View on GitHub
Linear and interval bound propagation in Pytorch with easy-to-use API and GPU support.
☆11May 14, 2026Updated 2 months ago
AKuzina / defend_vae_mcmc
View on GitHub
Code repository of the paper "Alleviating Adversarial Attacks on Variational Autoencoders with MCMC" published at NeurIPS 2022. https://a…
☆10Dec 14, 2022Updated 3 years ago
TingdiRen / LRU_pytorch
View on GitHub
This repository is pytorch version implement of LRU from the paper "Resurrecting Recurrent Neural Networks for Long Sequences" (https://a…
☆12May 22, 2023Updated 3 years ago
THU-KEG / VerIF
View on GitHub
[EMNLP 2025] Verification Engineering for RL in Instruction Following
☆57Mar 30, 2026Updated 3 months ago
listentm / CROWDSELECT
View on GitHub
We systematically studied the influencing factors when LLM generates benchmarks,By using our code, you can generate high-quality QA datas…
☆20May 20, 2025Updated last year
TianHongZXY / RLVR-Decomposed
View on GitHub
[NeurIPS 2025] Implementation for the paper "The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning"
☆166Mar 2, 2026Updated 4 months ago