Our own tiny reproduction of DeepSeek R1-Zero on two A100s.
☆84Feb 1, 2025Updated last year
Alternatives and similar repositories for TinyZero
Users that are interested in TinyZero are comparing it to the libraries listed below
- Official implementation of SIGIR 2022 Paper "Task-Oriented Dialogue System as Natural Language Generation".☆14Apr 6, 2022Updated 3 years ago
- ☆37Feb 4, 2026Updated last month
- [ICLR 2026] ParallelBench: Understanding the Tradeoffs of Parallel Decoding in Diffusion LLMs☆42Mar 11, 2026Updated last week
- Biomedical Event Extraction exhibiting first industry-level performances in quality and speed☆16Dec 14, 2022Updated 3 years ago
- ☆10Feb 2, 2023Updated 3 years ago
- Your efficient and accurate answer verification system for RL training.☆41Jun 23, 2025Updated 8 months ago
- [ICLR 2026] Learning to Parallel: Accelerating Diffusion Large Language Models via Learnable Parallel Decoding☆31Jan 27, 2026Updated last month
- [arxiv: 2512.19673] Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies☆60Feb 6, 2026Updated last month
- ☆25May 30, 2023Updated 2 years ago
- Functional matrix factorization via Bayesian tensor filtering☆13Oct 1, 2025Updated 5 months ago
- Official Code For EMNLP2025 Findings: {DLPO : Towards a Robust, Efficient, and Generalizable Prompt Optimization Framework from a Deep-Le…☆10Dec 25, 2025Updated 2 months ago
- Reproduce R1 Zero on Logic Puzzle☆2,441Mar 20, 2025Updated last year
- Code and data for the Nature Machine Intelligence paper "Knowledge graph-enhanced molecular contrastive learning with functional prompt".☆10May 16, 2023Updated 2 years ago
- ☆29Jul 4, 2025Updated 8 months ago
- ☆11Oct 29, 2024Updated last year
- ☆12Feb 15, 2023Updated 3 years ago
- ☆13Dec 7, 2022Updated 3 years ago
- [AAAI 2026] Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning".☆94Nov 8, 2025Updated 4 months ago
- gflsegpy: A Python 3 implementation of the group fused Lasso for multiple change-point detection (Bleakley and Vert, 2011)☆14Jul 14, 2018Updated 7 years ago
- Official repository for "Investigating Pre-Training Objectives for Generalization in Visual Reinforcement Learning" (ICML 2024)☆11Sep 16, 2025Updated 6 months ago
- ☆11Nov 22, 2019Updated 6 years ago
- DiscoverPath, a KG-based retrieval system designed for biomedical research. This system aims to assist biomedical researchers in dynami…☆28Oct 25, 2023Updated 2 years ago
- ☆22Feb 3, 2024Updated 2 years ago
- Decoding Tree Sketching (DTS): a training-free & model-agnostic & plug-in framework for LLM parallel reasoning.☆67Mar 8, 2026Updated last week
- Official implementation of CEED-VLA: Consistency Vision-Language-Action Model with Early-Exit Decoding.☆49Sep 15, 2025Updated 6 months ago
- SigFormer: Signature Transformer for Deep Hedging (ICAIF 2023)☆19Oct 23, 2023Updated 2 years ago
- A spoken version of the textual story cloze benchmark☆20Aug 6, 2023Updated 2 years ago
- ☆10Oct 11, 2022Updated 3 years ago
- ☆20Mar 26, 2025Updated 11 months ago
- Code and data for Cell-o1.☆26Sep 19, 2025Updated 6 months ago
- Official Repo for SvS: A Self-play with Variational Problem Synthesis strategy for RLVR training☆54Dec 13, 2025Updated 3 months ago
- A simple implementation of ReasonGenRM.☆19Apr 21, 2025Updated 10 months ago
- ☆11Nov 8, 2023Updated 2 years ago
- [ICLR 2026]🚀ReVisual-R1 is a 7B open-source multimodal language model that follows a three-stage curriculum—cold-start pre-training, mul…☆201Dec 10, 2025Updated 3 months ago
- ☆332May 31, 2025Updated 9 months ago
- PyTorch implementation of Count-Based Exploration with Neural Density Models☆10Mar 22, 2018Updated 7 years ago
- Pretraining summarization models using a corpus of nonsense☆13Sep 28, 2021Updated 4 years ago
- Code and data for "Medical Dialogue Generation via Dual Flow Modeling" (ACL 2023 Findings)☆13Nov 22, 2023Updated 2 years ago
- [ACL'25] We propose a novel fine-tuning method, Separate Memory and Reasoning, which combines prompt tuning with LoRA.☆86Nov 2, 2025Updated 4 months ago