CLR-Lab/SimKO

SimKO: Simple Pass@K Policy Optimization

☆31

Alternatives and similar repositories for SimKO

Users who are interested in SimKO are comparing it to the repositories listed below.

Sphere-AI-Lab / PEFT-Arena
Official repository of PEFT-Arena: Understanding Parameter-Efficient Finetuning from a Stability-Plasticity Perspective
☆29 · Jun 13, 2026 · Updated last month
Sphere-AI-Lab / FormalMATH-Bench
Repository of <FormalMATH: Benchmarking Formal Mathematical Reasoning of Large Language Models>
☆75 · Jan 8, 2026 · Updated 6 months ago
lzzcd001 / vggflow
Official Implementation of VGG-Flow (NeurIPS 2025; https://arxiv.org/abs/2512.05116)
☆22 · Mar 11, 2026 · Updated 4 months ago
Sphere-AI-Lab / orbit
Stable and Efficient Reinforcement Learning for Trillion-Parameter LLMs
☆148 · Jun 28, 2026 · Updated 3 weeks ago
ybwang119 / label_recovery
[ICLR 2024] Towards Eliminating Hard Label Constraints in Gradient Inversion Attacks
☆14 · Feb 6, 2024 · Updated 2 years ago
Evanwu1125 / LiteCoT
☆17 · Jun 10, 2025 · Updated last year
TomSheng21 / R-TPT
CVPR 2025 - R-TPT: Improving Adversarial Robustness of Vision-Language Models through Test-Time Prompt Tuning
☆22 · Aug 28, 2025 · Updated 10 months ago
wy1iu / OPT
Implementation for <Orthogonal Over-Parameterized Training> in CVPR'21.
☆22 · Jul 16, 2021 · Updated 5 years ago
sgp-bench / sgp-bench
☆30 · Jul 14, 2025 · Updated last year
Trae1ounG / Pretrain_Space_RLVR
[arxiv: 2604.14142] From P(y|x) to P(y): Investigating Reinforcement Learning in Pre-train Space
☆17 · Apr 16, 2026 · Updated 3 months ago
TianHongZXY / RLVR-Decomposed
[NeurIPS 2025] Implementation for the paper "The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning"
☆166 · Mar 2, 2026 · Updated 4 months ago
Trae1ounG / BuPO
[arxiv: 2512.19673] Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies
☆60 · Feb 6, 2026 · Updated 5 months ago
Sphere-AI-Lab / OrthoMerge
Implementation of <Orthogonal Model Merging>
☆33 · May 27, 2026 · Updated last month
MozerWang / promISe
[COLING 2024 (Oral)] PromISe: Releasing the Capabilities of LLMs with Prompt Introspective Search
☆23 · Aug 26, 2024 · Updated last year
cjj826 / GoalAct
The repo for our paper: Enhancing LLM-Based Agents via Global Planning and Hierarchical Execution (NCIIP 2025 Best Paper)
☆17 · Aug 18, 2025 · Updated 11 months ago
Sphere-AI-Lab / fda
Implementation of <Model Merging with Functional Dual Anchors>
☆46 · Nov 23, 2025 · Updated 8 months ago
Hannibal046 / PlugLM
[ACL 2023] Source code for "Decouple knowledge from parameters for plug-and-play language modeling"
☆20 · Sep 18, 2023 · Updated 2 years ago
scaleapi / SWE-Interact
A new testbed of interactive SWE tasks for coding agents, set in a realistic, multi-turn, developer-driven environment
☆24 · Jun 30, 2026 · Updated 3 weeks ago
Sphere-AI-Lab / poet
Implementation for POET and POET-X for LLM pretraining
☆38 · Jun 9, 2026 · Updated last month
GAIR-NLP / OctoThinker
Revisiting Mid-training in the Era of Reinforcement Learning Scaling
☆189 · Jul 23, 2025 · Updated last year
alif-munim / minOFT
A minimal re-implementation of orthogonal fine-tuning (OFT), a diffusion method, for LLMs. Based on nanoGPT and minLoRA.
☆14 · Nov 17, 2023 · Updated 2 years ago
GioGioBond / NBCEonChatGLM6b
(NBCE) Naive Bayes-based Context Extension on ChatGLM-6b
☆15 · Jun 7, 2023 · Updated 3 years ago
MozerWang / DEMO
[ACL 2025 (Findings)] DEMO: Reframing Dialogue Interaction with Fine-grained Element Modeling
☆22 · Dec 16, 2024 · Updated last year
Adaxry / Unified_Layer_Skipping
☆15 · Apr 11, 2024 · Updated 2 years ago
which47 / LLMCL
Analyzing and Reducing Catastrophic Forgetting in Parameter Efficient Tuning
☆38 · Nov 17, 2024 · Updated last year
liziniu / cold_start_rl
Code for Blog Post: Can Better Cold-Start Strategies Improve RL Training for LLMs?
☆20 · Mar 9, 2025 · Updated last year
hkgc-1 / GHPO
☆62 · Jul 21, 2025 · Updated last year
huawei-lin / RapidIn
RapidIn: Scalable Influence Estimation for Large Language Models (LLMs). The implementation for paper "Token-wise Influential Training Da…
☆22 · Mar 10, 2026 · Updated 4 months ago
yulonghui / MOCA
Official implementation of "Continual Learning by Modeling Intra-Class Variation" (MOCA). [TMLR 2023]
☆16 · Mar 3, 2023 · Updated 3 years ago
rhyang2021 / CogRouter
Source code for our paper: "Think Fast and Slow: Step-Level Cognitive Depth Adaptation for LLM Agents".
☆24 · Feb 20, 2026 · Updated 5 months ago
YBIO / IDEC
☆16 · Feb 21, 2025 · Updated last year
TomSheng21 / tta-vlm
[NeurIPS 2025 Datasets & Benchmarks Track] The Illusion of Progress? A Critical Look at Test-Time Adaptation for Vision-Language Models
☆39 · Oct 26, 2025 · Updated 9 months ago
jamessealesmith / ConStruct-VL
PyTorch code for the CVPR'23 paper: "ConStruct-VL: Data-Free Continual Structured VL Concepts Learning"
☆13 · Feb 5, 2024 · Updated 2 years ago
bloomberg / MixCE-acl2023
Implementation of the MixCE method described in the ACL 2023 paper by Zhang et al.
☆20 · May 29, 2023 · Updated 3 years ago
MozerWang / AMPO
[ICLR 2026] Adaptive Social Learning via Mode Policy Optimization for Language Agents
☆51 · Feb 2, 2026 · Updated 5 months ago
Little-girl-1992 / RAE
A recursive autoencoder neural network built with TensorFlow, used for sentence clustering
☆12 · Jul 7, 2017 · Updated 9 years ago
GeWu-Lab / Patch-Matters
[CVPR 2025] Code release of Patch Matters: Training-free Fine-grained Image Caption Enhancement via Local Perception
☆25 · Jun 17, 2025 · Updated last year
merlresearch / SOCKET
Code for MERL's ECCV 2022 paper on Cross-Modal Knowledge Transfer Without Task-Relevant Source Data
☆11 · Jul 19, 2022 · Updated 4 years ago
kxfan2002 / Reagent
Agent-RRM: Exploring Reasoning Reward Model for Agents
☆70 · Mar 17, 2026 · Updated 4 months ago