Jiayi-Pan / TinyZero
Minimal reproduction of DeepSeek R1-Zero
☆11,811Updated last month
Alternatives and similar repositories for TinyZero
Users who are interested in TinyZero are comparing it to the libraries listed below
- verl: Volcano Engine Reinforcement Learning for LLMs☆8,593Updated this week
- Fully open reproduction of DeepSeek-R1☆24,541Updated this week
- s1: Simple test-time scaling☆6,394Updated last week
- Simple RL training for reasoning☆3,584Updated last month
- Democratizing Reinforcement Learning for LLMs☆3,291Updated 2 weeks ago
- Sky-T1: Train your own O1 preview model within $450☆3,254Updated last week
- SGLang is a fast serving framework for large language models and vision language models.☆14,667Updated this week
- ☆3,342Updated 2 months ago
- Witness the aha moment of VLM with less than $3.☆3,688Updated last week
- Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.☆10,709Updated 2 weeks ago
- Finetune Qwen3, Llama 4, TTS, DeepSeek-R1 & Gemma 3 LLMs 2x faster with 70% less memory! 🦥☆39,558Updated this week
- Janus-Series: Unified Multimodal Understanding and Generation Models☆17,306Updated 3 months ago
- Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.☆19,824Updated 2 months ago
- An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & RFT & Dynamic Sampling & Asy…☆6,880Updated this week
- Welcome to the Llama Cookbook! This is your go to guide for Building with Llama: Getting started with Inference, Fine-Tuning, RAG. We als…☆17,355Updated this week
- Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.☆21,673Updated last week
- Go ahead and axolotl questions☆9,470Updated this week
- The official repo of MiniMax-Text-01 and MiniMax-VL-01, large-language-model & vision-language-model based on Linear Attention☆2,715Updated 2 weeks ago
- A high-throughput and memory-efficient inference and serving engine for LLMs☆48,531Updated this week
- Easily fine-tune, evaluate and deploy Qwen3, DeepSeek-R1, Llama 4 or any open source LLM / VLM!☆8,119Updated last week
- RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.☆1,876Updated this week
- DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding☆4,854Updated 3 months ago
- Implementing DeepSeek R1's GRPO algorithm from scratch☆1,372Updated last month
- A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.☆6,740Updated this week
- Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation☆7,779Updated 2 weeks ago
- AllenAI's post-training codebase☆2,986Updated this week
- A live stream development of RL tuning for LLM agents☆2,838Updated last week
- Fully open data curation for reasoning models☆1,793Updated last week
- Train transformer language models with reinforcement learning.☆13,971Updated this week
- The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.☆18,358Updated last month