Tencent-Hunyuan/CL-bench

CL-bench: A Benchmark for Context Learning

☆573

Alternatives and similar repositories for CL-bench

Users who are interested in CL-bench are comparing it to the repositories listed below.

thunlp / SE-Bench
Official repo for "SE-Bench: Benchmarking Self-Evolution with Knowledge Internalization"
☆28 · Mar 24, 2026 · Updated 4 months ago
THUDM / slime
slime is an LLM post-training framework for RL Scaling.
☆7,629 · Updated this week
verl-project / verl
verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework
☆22,654 · Updated this week
PeterGriffinJin / Search-R1
Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL
☆5,153 · Nov 13, 2025 · Updated 8 months ago
Shichun-Liu / Agent-Memory-Paper-List
The paper list of "Memory in the Age of AI Agents: A Survey"
☆2,265 · Mar 4, 2026 · Updated 4 months ago
ypwang61 / ThetaEvolve
ThetaEvolve: Test-time Learning on Open Problems, enabling RL training on AlphaEvolve/OpenEvolve and emphasizing scaling test-time comput…
☆170 · Feb 27, 2026 · Updated 4 months ago
Tencent-Hunyuan / Hy3-preview
Hy3 preview (295B A21B), a leading reasoning and agent model in its size, with great cost efficiency
☆456 · Apr 23, 2026 · Updated 3 months ago
areal-project / AReaL
The RL Bridge for LLM-based Agent Applications. Made Simple & Flexible.
☆5,599 · Updated this week
thunlp / OPD
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe
☆844 · Jun 29, 2026 · Updated 3 weeks ago
ruixin31 / Spurious_Rewards
☆361 · Jul 29, 2025 · Updated 11 months ago
benchflow-ai / skillsbench
SkillsBench evaluates how well skills work and how effective agents are at using them.
☆1,577 · Updated this week
Gen-Verse / Open-AgentRL
RLAnything (ICML 2026) & AutoTool (ICML 2026), DemyAgent: Open-Source RL for LLMs and Agentic Scenarios
☆592 · Jun 12, 2026 · Updated last month
Gen-Verse / OpenClaw-RL
OpenClaw-RL: Train any agent simply by talking
☆5,606 · May 23, 2026 · Updated 2 months ago
alibaba / ROLL
An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models
☆3,325 · Updated this week
langfengQ / verl-agent
verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for paper "Group-in…
☆2,153 · Jun 9, 2026 · Updated last month
BytedTsinghua-SIA / MemAgent
A MemAgent framework that can be extrapolated to 3.5M, along with a training framework for RL training of any agent workflow.
☆1,085 · May 12, 2026 · Updated 2 months ago
CharlesQ9 / Self-Evolving-Agents
☆1,259 · Oct 15, 2025 · Updated 9 months ago
hkust-nlp / Toolathlon
[ICLR 2026] The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution
☆440 · Updated this week
bingreeky / MemEvolve
[ICML'26] MemEvolve & EvolveLab
☆255 · May 5, 2026 · Updated 2 months ago
BytedTsinghua-SIA / DAPO
An Open-source RL System from ByteDance Seed and Tsinghua AIR
☆1,846 · May 11, 2025 · Updated last year
real-absolute-AI / LongRLVR
[ICLR 2026] LongRLVR: Long-Context Reinforcement Learning Requires Verifiable Context Rewards.
☆19 · Mar 16, 2026 · Updated 4 months ago
claw-eval / claw-eval
Claw-Eval is an evaluation harness for evaluating LLM as agents. All tasks verified by humans.
☆735 · May 17, 2026 · Updated 2 months ago
RUC-NLPIR / ARPO
[ICLR 2026] Agentic Reinforced Policy Optimization (ARPO)
☆1,092 · Jul 13, 2026 · Updated last week
OpenRLHF / OpenRLHF
An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Asy…
☆9,848 · Jul 14, 2026 · Updated last week
lasgroup / SDPO
Reinforcement Learning via Self-Distillation (SDPO)
☆1,021 · Jul 1, 2026 · Updated 3 weeks ago
Visual-Agent / DeepEyes
☆1,251 · Nov 20, 2025 · Updated 8 months ago
verl-project / verl-recipe
A set of examples based on verl for end-to-end RL training recipes.
☆311 · Updated this week
deepseek-ai / Engram
Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models
☆4,561 · Jan 14, 2026 · Updated 6 months ago
qhjqhj00 / MemoBrain
Executive Memory for Coherent Long-Horizon Reasoning!
☆85 · Jan 14, 2026 · Updated 6 months ago
harbor-framework / terminal-bench
A benchmark for LLMs on complicated tasks in the terminal
☆2,483 · Jul 11, 2026 · Updated 2 weeks ago
GAIR-NLP / DeepResearcher
Scaling Deep Research via Reinforcement Learning in Real-world Environments.
☆783 · May 10, 2026 · Updated 2 months ago
metaevo-ai / meta-context-engineering
[ICML 2026] Meta Context Engineering via Agentic Skill Evolution
☆142 · May 4, 2026 · Updated 2 months ago
idanshen / Self-Distillation
☆663 · Apr 7, 2026 · Updated 3 months ago
zhaochen0110 / Awesome_Think_With_Images
Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual in…
☆1,493 · Mar 9, 2026 · Updated 4 months ago
aisa-group / PostTrainBench
Measuring how well CLI agents like Claude Code or Codex CLI can post-train base LLMs on a single H100 GPU in 10 hours
☆467 · Updated this week
Alibaba-NLP / DeepResearch
Tongyi Deep Research, the Leading Open-source Deep Research Agent
☆19,719 · Feb 27, 2026 · Updated 4 months ago
hiyouga / EasyR1
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
☆5,081 · Updated this week
UCSB-NLP-Chang / Skill-Usage
☆46 · Apr 8, 2026 · Updated 3 months ago
TsinghuaC3I / Awesome-RL-for-LRMs
A Survey of Reinforcement Learning for Large Reasoning Models
☆2,469 · Nov 9, 2025 · Updated 8 months ago