MBZUAI-IFM / K2-Think-SFT

☆126

Alternatives and similar repositories for K2-Think-SFT

Users who are interested in K2-Think-SFT are comparing it to the repositories listed below.

zhengkid / Parallel-R1
The official repo for "Parallel-R1: Towards Parallel Thinking via Reinforcement Learning"
☆233 Updated this week
tiiuae / Falcon-H1
All information and news with respect to Falcon-H1 series
☆93 Updated last month
SakanaAI / natural_niches
The code repository of the paper: Competition and Attraction Improve Model Fusion
☆165 Updated 2 months ago
SakanaAI / RLT
Training teachers with reinforcement learning able to make LLMs learn how to reason for test time scaling.
☆349 Updated 4 months ago
microsoft / ArchScale
Simple & Scalable Pretraining for Neural Architecture Research
☆299 Updated 2 weeks ago
WeiboAI / VibeThinker
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B
☆155 Updated last week
MetaStone-AI / XBai-o4
☆300 Updated 3 months ago
facebookresearch / cwm
Research code artifacts for Code World Model (CWM) including inference tools, reproducibility, and documentation.
☆712 Updated last month
eqimp / hogwild_llm
Official PyTorch implementation for Hogwild! Inference: Parallel LLM Generation with a Concurrent Attention Cache
☆129 Updated 3 months ago
StigLidu / DualDistill
[EMNLP 2025] The official implementation for paper "Agentic-R1: Distilled Dual-Strategy Reasoning"
☆101 Updated 2 months ago
NimbleEdge / sparse_transformers
Sparse Inferencing for transformer based LLMs
☆208 Updated 3 months ago
letta-ai / sleep-time-compute
accompanying material for sleep-time compute paper
☆117 Updated 6 months ago
microsoft / GRIN-MoE
GRadient-INformed MoE
☆264 Updated last year
shaochenze / calm
Official implementation of "Continuous Autoregressive Language Models"
☆584 Updated last week
zjunlp / DynamicKnowledgeCircuits
[ACL 2025] How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training
☆44 Updated 4 months ago
GAIR-NLP / LIMI
LIMI: Less is More for Agency
☆148 Updated last month
facebookresearch / matrix
Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera…
☆101 Updated this week
foundation-model-stack / bamba
Train, tune, and infer Bamba model
☆136 Updated 5 months ago
thu-nics / C2C
The official code implementation for "Cache-to-Cache: Direct Semantic Communication Between Large Language Models"
☆259 Updated 2 weeks ago
OpenEvaByte / evabyte
EvaByte: Efficient Byte-level Language Models at Scale
☆110 Updated 6 months ago
nexusflowai / NexusBench
Nexusflow function call, tool use, and agent benchmarks.
☆29 Updated 11 months ago
Zyphra / Zamba2
PyTorch implementation of models from the Zamba2 series.
☆185 Updated 9 months ago
tiiuae / onebitllms
Lightweight toolkit package to train and fine-tune 1.58bit Language models
☆98 Updated 6 months ago
NVlabs / Jet-Nemotron
☆702 Updated last month
google-deepmind / latent-multi-hop-reasoning
[ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning?
☆82 Updated 8 months ago
reka-ai / rekaquant
☆62 Updated 4 months ago
IST-DASLab / gptq-gguf-toolkit
Efficient non-uniform quantization with GPTQ for GGUF
☆53 Updated 2 months ago
NVlabs / RLP
RLP: Reinforcement as a Pretraining Objective
☆200 Updated last month
janhq / visual-thinker
☆180 Updated 3 months ago
NVlabs / UniversalDeepResearch
Code to accompany the Universal Deep Research paper (https://arxiv.org/abs/2509.00244)
☆449 Updated 2 months ago