Jiayi-Pan / TinyZero
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
☆11,339 Updated 2 weeks ago
Alternatives and similar repositories for TinyZero:
Users who are interested in TinyZero are comparing it to the repositories listed below.
- Fully open reproduction of DeepSeek-R1 ☆23,242 Updated this week
- verl: Volcano Engine Reinforcement Learning for LLMs ☆5,693 Updated this week
- This is a replicate of DeepSeek-R1-Zero and DeepSeek-R1 training on small models with limited data ☆3,223 Updated this week
- s1: Simple test-time scaling ☆6,051 Updated 3 weeks ago
- SGLang is a fast serving framework for large language models and vision language models. ☆12,427 Updated this week
- Finetune Llama 3.3, DeepSeek-R1, Gemma 3 & Reasoning LLMs 2x faster with 70% less memory! 🦥 ☆35,893 Updated this week
- A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques. ☆6,607 Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆42,924 Updated this week
- ☆3,242 Updated 3 weeks ago
- Sky-T1: Train your own O1 preview model within $450 ☆3,149 Updated this week
- Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud. ☆9,282 Updated this week
- Democratizing Reinforcement Learning for LLMs ☆2,113 Updated last month
- Witness the aha moment of VLM with less than $3. ☆3,376 Updated 3 weeks ago
- Welcome to the Llama Cookbook! This is your go to guide for Building with Llama: Getting started with Inference, Fine-Tuning, RAG. We als… ☆16,515 Updated last week
- Train transformer language models with reinforcement learning. ☆12,890 Updated this week
- ☆4,070 Updated 9 months ago
- Fast and memory-efficient exact attention ☆16,587 Updated this week
- Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024) ☆45,117 Updated this week
- An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT) ☆5,919 Updated this week
- An open source deep research clone. AI Agent that reasons large amounts of web data extracted with Firecrawl ☆5,145 Updated last month
- Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud. ☆16,356 Updated 2 weeks ago
- A modular graph-based Retrieval-Augmented Generation (RAG) system ☆23,891 Updated this week
- 🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation ☆14,067 Updated this week
- PyTorch native post-training library ☆5,026 Updated this week
- 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale. ☆11,878 Updated this week
- llama3 implementation one matrix multiplication at a time ☆14,675 Updated 10 months ago
- g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains ☆4,201 Updated 2 months ago
- DeepSeek Coder: Let the Code Write Itself ☆21,179 Updated 10 months ago
- Fully open data curation for reasoning models ☆1,576 Updated last week
- The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬 ☆10,370 Updated this week