Reproduction of the complete process of DeepSeek-R1 on small-scale models, including Pre-training, SFT, and RL.
☆29Mar 11, 2025Updated last year
Alternatives and similar repositories for TinyDeepSeek
Users who are interested in TinyDeepSeek are comparing it to the libraries listed below.
- [ACL 2025] Exploring Compositional Generalization of Multimodal LLMs for Medical Imaging☆40Jun 4, 2025Updated 11 months ago
- MedGen: Unlocking Medical Video Generation by Scaling Granularly-annotated Medical Videos.☆33Apr 18, 2026Updated 2 weeks ago
- Encourage Medical LLM to engage in deep thinking similar to DeepSeek-R1.☆26Apr 24, 2025Updated last year
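- 📖 A collection of deep learning large-model APIs, papers, case studies, and learning materials from China and abroad; stars welcome 🌟☆31May 12, 2022Updated 3 years ago
- PaddleClas ShiTu Image Manager: PP-ShiTu library management tool☆18Jan 30, 2023Updated 3 years ago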
- ☆19Oct 30, 2025Updated 6 months ago
- Irene is a python package that aims to be a toolkit for global optimization problems that can be realized algebraically. It generalizes L…☆15Updated this week
- This is a MicroPython library for the Raspberry Pi Pico and the BH1750 digital Ambient Light Sensor.☆20Oct 8, 2022Updated 3 years ago
- Code for paper titled "Towards the Law of Capacity Gap in Distilling Language Models"☆102Jul 9, 2024Updated last year
- Towards Fine-grained Audio Captioning with Multimodal Contextual Cues☆87Jan 4, 2026Updated 4 months ago
- ☆11May 27, 2021Updated 4 years ago
- Official Code for All-in-One Medical Image Re-Identification (CVPR2025)☆20Jan 11, 2026Updated 3 months ago
- SimKO: Simple Pass@K Policy Optimization☆31Oct 24, 2025Updated 6 months ago
- ThetaEvolve: Test-time Learning on Open Problems, enabling RL training on AlphaEvolve/OpenEvolve and emphasizing scaling test-time comput…☆147Feb 27, 2026Updated 2 months ago
- Confidence Regulation Neurons in Language Models (NeurIPS 2024)☆15Feb 1, 2025Updated last year
- [ISBI 2024] Semi-supervised Medical Image Segmentation Method Based on Cross-pseudo Labeling Leveraging Strong and Weak Data Augmentation…☆16Feb 23, 2025Updated last year
- Odysseus: Playground of LLM Sequence Parallelism☆78Jun 17, 2024Updated last year
- A Comprehensive Rare Disease Diagnostic Dataset with nearly 50,000 patients covering more than 4000 diseases☆42Mar 13, 2026Updated last month
- [TMLR 25] SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models☆148Oct 10, 2025Updated 6 months ago
- ☆30Aug 21, 2025Updated 8 months ago
- ☆12Jan 21, 2025Updated last year
- [TPAMI 2023] Object Affinity Learning: Towards Annotation-free Instance Segmentation☆14Sep 14, 2023Updated 2 years ago
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling☆42Dec 29, 2025Updated 4 months ago
- ☆13Oct 13, 2025Updated 6 months ago
- [NeurIPS2023] "Selectivity Drives Productivity: Efficient Dataset Pruning for Enhanced Transfer Learning" by Yihua Zhang*, Yimeng Zhang*,…☆14Oct 12, 2023Updated 2 years ago
- Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination.☆22Jul 18, 2025Updated 9 months ago
- A reinforcement learning course, mainly about how to solve problems with reinforcement learning☆15Dec 10, 2024Updated last year
- [EuroSys'25] Mist: Efficient Distributed Training of Large Language Models via Memory-Parallelism Co-Optimization☆23Apr 13, 2026Updated 3 weeks ago
- Using PyTorch autograd to compute Hessian of Perplexity for Large Language Models☆29Apr 17, 2025Updated last year
- Official resources of "The First Few Tokens Are All You Need: An Efficient and Effective Unsupervised Prefix Fine-Tuning Method for Reaso…☆21Jun 13, 2025Updated 10 months ago
- Multilingual Medicine: Model, Dataset, Benchmark, Code☆200Oct 15, 2024Updated last year
- SuperCLUE automatic machine grading system for Gaokao (college entrance exam) essays☆19Jun 8, 2023Updated 2 years ago
- ☆16Apr 7, 2024Updated 2 years ago
- A simple implementation of ReasonGenRM.☆19Apr 21, 2025Updated last year
- This repo contains the code for the paper "Understanding and Mitigating Hallucinations in Large Vision-Language Models via Modular Attrib…☆37Jul 14, 2025Updated 9 months ago
- Yet another LLM☆10Apr 6, 2023Updated 3 years ago
- Code for PromptNet☆16Jan 29, 2025Updated last year
- [ACL 2024] Predicting the Unpredictable: Uncertainty-Aware Reasoning over Temporal Knowledge Graphs via Diffusion Process☆21Oct 7, 2024Updated last year
- ☆20Jan 6, 2023Updated 3 years ago