simplescaling / s1
s1: Simple test-time scaling
☆6,051 Updated 3 weeks ago
Alternatives and similar repositories for s1:
Users who are interested in s1 are comparing it to the repositories listed below:
- Clean, minimal, accessible reproduction of DeepSeek R1-Zero ☆11,339 Updated 2 weeks ago
- This is a replication of DeepSeek-R1-Zero and DeepSeek-R1 training on small models with limited data ☆3,223 Updated this week
- verl: Volcano Engine Reinforcement Learning for LLMs ☆5,693 Updated this week
- Sky-T1: Train your own O1 preview model within $450 ☆3,149 Updated this week
- Democratizing Reinforcement Learning for LLMs ☆2,113 Updated last month
- ☆3,242 Updated 3 weeks ago
- Fully open data curation for reasoning models ☆1,576 Updated last week
- SGLang is a fast serving framework for large language models and vision language models. ☆12,427 Updated this week
- Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud. ☆9,282 Updated this week
- An open source deep research clone. AI agent that reasons over large amounts of web data extracted with Firecrawl ☆5,145 Updated last month
- A live stream development of RL tuning for LLM agents ☆1,883 Updated this week
- Fully open reproduction of DeepSeek-R1 ☆23,242 Updated this week
- The official repo of MiniMax-Text-01 and MiniMax-VL-01, large-language-model & vision-language-model based on Linear Attention ☆2,415 Updated last week
- A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques. ☆6,607 Updated this week
- Keep searching, reading webpages, and reasoning until it finds the answer (or exceeds the token budget) ☆3,653 Updated this week
- Witness the aha moment of VLM with less than $3. ☆3,376 Updated 3 weeks ago
- 🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation ☆14,067 Updated this week
- Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python. ☆4,800 Updated this week
- Agent framework and applications built upon Qwen>=2.0, featuring Function Calling, Code Interpreter, RAG, and Chrome extension. ☆6,316 Updated last week
- An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT) ☆5,919 Updated this week
- An AI-powered research assistant that performs iterative, deep research on any topic by combining search engines, web scraping, and large… ☆14,924 Updated this week
- Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation ☆6,926 Updated 3 weeks ago
- Official Repo for Open-Reasoner-Zero ☆1,667 Updated 3 weeks ago
- FlashMLA: Efficient MLA decoding kernels ☆11,369 Updated 3 weeks ago
- ☆3,340 Updated last month
- g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains ☆4,201 Updated 2 months ago
- Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL ☆1,389 Updated this week
- 🤗 smolagents: a barebones library for agents that think in python code. ☆15,909 Updated this week
- Scalable RL solution for advanced reasoning of language models ☆1,419 Updated last week
- DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding ☆4,628 Updated last month