MBZUAI-IFM / K2-Think-SFT
☆127 Updated 3 months ago
Alternatives and similar repositories for K2-Think-SFT
Users who are interested in K2-Think-SFT are comparing it to the repositories listed below.
- The official repo for "Parallel-R1: Towards Parallel Thinking via Reinforcement Learning" ☆241 Updated 3 weeks ago
- The code repository of the paper: Competition and Attraction Improve Model Fusion ☆167 Updated 3 months ago
- Simple & Scalable Pretraining for Neural Architecture Research ☆304 Updated last month
- Training teachers with reinforcement learning able to make LLMs learn how to reason for test time scaling. ☆353 Updated 5 months ago
- Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B ☆527 Updated 3 weeks ago
- ☆301 Updated 4 months ago
- RLP: Reinforcement as a Pretraining Objective ☆205 Updated 2 months ago
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera… ☆239 Updated this week
- Mixture of Cognitive Reasoners: Modular Reasoning with Brain-Like Specialization ☆36 Updated 3 weeks ago
- All information and news with respect to Falcon-H1 series ☆93 Updated 2 months ago
- ☆105 Updated 5 months ago
- [ACL 2025] How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training ☆45 Updated 4 months ago
- Research code artifacts for Code World Model (CWM) including inference tools, reproducibility, and documentation. ☆751 Updated 2 months ago
- LIMI: Less is More for Agency ☆151 Updated last month
- ToolOrchestra is an end-to-end RL training framework for orchestrating tools and agentic workflows. ☆289 Updated last week
- ☆62 Updated 5 months ago
- Nexusflow function call, tool use, and agent benchmarks. ☆30 Updated 11 months ago
- Esoteric Language Models ☆108 Updated 2 weeks ago
- This repo contains the source code for the paper "Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning" ☆272 Updated 2 weeks ago
- Official PyTorch implementation for Hogwild! Inference: Parallel LLM Generation with a Concurrent Attention Cache ☆136 Updated 3 months ago
- Maya: An Instruction Finetuned Multilingual Multimodal Model using Aya ☆123 Updated 4 months ago
- Code for paper "The Markovian Thinker: Architecture-Agnostic Linear Scaling of Reasoning" ☆318 Updated 3 weeks ago
- The official code implementation for "Cache-to-Cache: Direct Semantic Communication Between Large Language Models" ☆285 Updated last week
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning? ☆84 Updated 8 months ago
- Sparse Inferencing for transformer based LLMs ☆215 Updated 4 months ago
- ☆158 Updated 7 months ago
- Source code for the collaborative reasoner research project at Meta FAIR. ☆111 Updated 7 months ago
- EvaByte: Efficient Byte-level Language Models at Scale ☆111 Updated 7 months ago
- [EMNLP 2025] The official implementation for paper "Agentic-R1: Distilled Dual-Strategy Reasoning" ☆100 Updated 3 months ago
- ☆36 Updated 4 months ago