simplescaling / s1
s1: Simple test-time scaling
☆6,635 Updated 7 months ago
Alternatives and similar repositories for s1
Users who are interested in s1 are comparing it to the libraries listed below.
- Democratizing Reinforcement Learning for LLMs ☆5,060 Updated this week
- Sky-T1: Train your own O1 preview model within $450 ☆3,370 Updated 6 months ago
- Minimal reproduction of DeepSeek R1-Zero ☆12,646 Updated 9 months ago
- ☆3,466 Updated 10 months ago
- Simple RL training for reasoning ☆3,829 Updated last month
- Fully open reproduction of DeepSeek-R1 ☆25,848 Updated 2 months ago
- Fully open data curation for reasoning models ☆2,200 Updated 2 months ago
- Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation ☆7,961 Updated 8 months ago
- Witness the aha moment of VLM with less than $3. ☆4,027 Updated 8 months ago
- verl: Volcano Engine Reinforcement Learning for LLMs ☆18,963 Updated this week
- AllenAI's post-training codebase ☆3,551 Updated last week
- Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL ☆3,889 Updated 2 months ago
- ☆4,340 Updated 6 months ago
- Official PyTorch implementation for "Large Language Diffusion Models" ☆3,538 Updated 2 months ago
- Awesome Reasoning LLM Tutorial/Survey/Guide ☆2,280 Updated 3 months ago
- Large Concept Models: Language modeling in a sentence representation space ☆2,332 Updated last year
- Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud. ☆26,440 Updated 3 weeks ago
- RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments. ☆2,503 Updated last week
- The official repo of MiniMax-Text-01 and MiniMax-VL-01, large-language-model & vision-language-model based on Linear Attention ☆3,314 Updated 6 months ago
- Everything about the SmolLM and SmolVLM family of models ☆3,594 Updated 3 weeks ago
- Modeling, training, eval, and inference code for OLMo ☆6,299 Updated 2 months ago
- Scalable RL solution for advanced reasoning of language models ☆1,803 Updated 10 months ago
- A live stream development of RL tuning for LLM agents ☆3,877 Updated 3 months ago
- SGLang is a high-performance serving framework for large language models and multimodal models. ☆23,091 Updated this week
- Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and pe… ☆3,903 Updated 7 months ago
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi… ☆3,074 Updated last week
- [COLM 2025] LIMO: Less is More for Reasoning ☆1,061 Updated 6 months ago
- Training Large Language Model to Reason in a Continuous Latent Space ☆1,491 Updated 5 months ago
- Renderer for the harmony response format to be used with gpt-oss ☆4,159 Updated last month
- MoBA: Mixture of Block Attention for Long-Context LLMs ☆2,038 Updated 10 months ago