MBZUAI-IFM / K2-Think-SFT
☆131 Updated 4 months ago
Alternatives and similar repositories for K2-Think-SFT
Users who are interested in K2-Think-SFT are comparing it to the libraries listed below.
- The code repository of the paper: Competition and Attraction Improve Model Fusion ☆169 Updated 5 months ago
- The official repo for "Parallel-R1: Towards Parallel Thinking via Reinforcement Learning" ☆255 Updated 2 months ago
- Training teachers with reinforcement learning able to make LLMs learn how to reason for test time scaling. ☆358 Updated 7 months ago
- [ICLR'26] The official code implementation for "Cache-to-Cache: Direct Semantic Communication Between Large Language Models" ☆324 Updated last week
- All information and news with respect to Falcon-H1 series ☆106 Updated 3 months ago
- WeDLM: The fastest diffusion language model with standard causal attention and native KV cache compatibility, delivering real speedups ov… ☆597 Updated 3 weeks ago
- [ICLR2026] Test-Time Scaling with Reflective Generative Model ☆302 Updated last week
- Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B ☆565 Updated 2 months ago
- Code for Bolmo: Byteifying the Next Generation of Language Models ☆116 Updated last month
- Official Project Page for Deep Delta Learning (https://huggingface.co/papers/2601.00417) ☆320 Updated this week
- [EMNLP 2025] The official implementation for paper "Agentic-R1: Distilled Dual-Strategy Reasoning" ☆102 Updated 5 months ago
- Data recipes and robust infrastructure for training AI agents ☆90 Updated this week
- Simple & Scalable Pretraining for Neural Architecture Research ☆307 Updated last month
- RapidFire AI: Rapid AI Customization from RAG to Fine-Tuning ☆135 Updated this week
- Developer Asset Hub for NVIDIA Nemotron — A one-stop resource for training recipes, usage cookbooks, datasets, and full end-to-end refere… ☆374 Updated last week
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera… ☆261 Updated this week
- A truly open version of gpt-oss which shows the entire pre-training from scratch ☆85 Updated 5 months ago
- LIMI: Less is More for Agency ☆159 Updated 3 months ago
- Extending the Context of Pretrained LLMs by Dropping Their Positional Embedding ☆193 Updated 3 weeks ago
- Source code of "How to Correctly do Semantic Backpropagation on Language-based Agentic Systems" 🤖 ☆76 Updated last year
- Inference, Fine Tuning and many more recipes with Gemma family of models ☆279 Updated 6 months ago
- DeepDive: Advancing Deep Search Agents with Knowledge Graphs and Multi-Turn RL ☆241 Updated 4 months ago
- Chain of Experts (CoE) enables communication between experts within Mixture-of-Experts (MoE) models ☆227 Updated 2 months ago
- [ICLR 2026] Official PyTorch Implementation of RLP: Reinforcement as a Pretraining Objective ☆226 Updated last week
- Universal Reasoning Model ☆121 Updated 2 weeks ago
- Research code artifacts for Code World Model (CWM) including inference tools, reproducibility, and documentation. ☆806 Updated last month
- Digital Red Queen: Adversarial Program Evolution in Core War with LLMs ☆173 Updated 3 weeks ago
- Maya: An Instruction Finetuned Multilingual Multimodal Model using Aya ☆125 Updated 5 months ago
- ☆165 Updated 2 months ago
- ToolOrchestra is an end-to-end RL training framework for orchestrating tools and agentic workflows. ☆623 Updated last week