Mayankpratapsingh022/DeepSeek-from-Scratch


Mayankpratapsingh022 / DeepSeek-from-Scratch

☆118

Alternatives and similar repositories for DeepSeek-from-Scratch

Users who are interested in DeepSeek-from-Scratch are comparing it to the repositories listed below.


VizuaraAILabs / DeepSeek-From-Scratch
Learn the building blocks of how to build DeepSeek from scratch.
☆147 · May 9, 2026 · Updated 2 months ago
VizuaraAILabs / nano-gpt-oss
Learn the building blocks of how to build gpt-oss from scratch
☆120 · Sep 23, 2025 · Updated 10 months ago
BioTender-max / ai-bio-conference-papers
3,722 AI × Biology papers from ICLR / ICML / NeurIPS (2010–2026), browsable by venue & year
☆17 · Jul 7, 2026 · Updated 2 weeks ago
VizuaraAILabs / truly-open-gpt-oss
A truly open version of gpt-oss which shows the entire pre-training from scratch
☆90 · Sep 4, 2025 · Updated 10 months ago
wajihullahbaig / deepseekv3-minimal
Creating the DeepSeek V3 model from scratch
☆28 · Mar 28, 2025 · Updated last year
amazon-science / aws-research-science
☆25 · Jul 17, 2026 · Updated last week
SakanaAI / TransEvalnia
Reasoning-based Evaluation and Ranking of Translations.
☆21 · Jun 2, 2026 · Updated last month
gita / Datasets
☆23 · Feb 24, 2023 · Updated 3 years ago
HarleyCoops / smolThinker-.5B
A Qwen .5B reasoning model trained on OpenR1-Math-220k
☆14 · Updated this week
DebeshJha / PVTFormer
Liver segmentation using Deep Learning on LiTS 2017 Dataset
☆23 · Apr 20, 2024 · Updated 2 years ago
srihari-humbarwadi / deep_metric_learning_tf2.0
A TensorFlow 2.0 implementation of triplet loss with online hard mining strategy
☆14 · Jul 6, 2019 · Updated 7 years ago
databricks-industry-solutions / graphrag-demo
☆28 · Apr 25, 2025 · Updated last year
omkaark / spotty
Simple orchestration for EC2 spot containers
☆19 · Sep 27, 2024 · Updated last year
tilde-research / momoe-release
Memory optimized Mixture of Experts
☆80 · Jul 25, 2025 · Updated last year
sam9111 / pixtale
☆11 · Jan 24, 2025 · Updated last year
rasbt / reasoning-from-scratch
Implement a reasoning LLM in PyTorch from scratch, step by step
☆4,810 · Jul 6, 2026 · Updated 2 weeks ago
FlorinAndrei / misc
a catch-all repo
☆11 · Dec 28, 2023 · Updated 2 years ago
emersON106 / SGPPI
SGPPI: structure-aware prediction of protein-protein interactions in rigorous conditions with graph convolutional network
☆15 · Nov 17, 2022 · Updated 3 years ago
facebookresearch / nccl
Optimized primitives for collective multi-GPU communication
☆25 · Apr 17, 2024 · Updated 2 years ago
gordicaleksa / OpenGemini
Effort to open-source 10.5 trillion parameter Gemini model.
☆17 · Dec 6, 2023 · Updated 2 years ago
junfanz1 / MoE-Mixture-of-Experts-in-PyTorch
Implementations of a Mixture-of-Experts (MoE) architecture designed for research on large language models (LLMs) and scalable neural netw…
☆77 · Apr 8, 2025 · Updated last year
tilde-research / nitrobrew-release
Fused KL divergence from hidden states for knowledge distillation
☆19 · Apr 28, 2026 · Updated 2 months ago
DebeshJha / MDNet
Abdominal Organ Segmentation using Multi Decoder Network (MDNet) [Accepted at ICASSP 2025]
☆13 · Apr 15, 2025 · Updated last year
zjlww / dsp
Digital Speech Processing in PyTorch.
☆15 · Aug 12, 2022 · Updated 3 years ago
yaof20 / verl
verl: Volcano Engine Reinforcement Learning for LLMs
☆22 · Nov 6, 2025 · Updated 8 months ago
ideaweaver-ai / qwen3-from-scratch
☆16 · Jul 4, 2026 · Updated 3 weeks ago
Snektron / gpumode-amd-fp8-mm
My submission for the GPUMODE/AMD fp8 mm challenge
☆29 · Jun 4, 2025 · Updated last year
Maharshi-Pandya / gpu-stuff
Repository for GPU related kernels for learning/testing purposes
☆19 · May 27, 2026 · Updated last month
kyleliang919 / Super_Muon
☆68 · Mar 21, 2025 · Updated last year
ideaweaver-ai / Tiny-Children-Stories-30M-model
☆131 · Jun 17, 2025 · Updated last year
lalalune / gptcoder
RAG Agent for the ARC AGI Challenge
☆20 · Jul 1, 2024 · Updated 2 years ago
ThomasRochefortB / torch-gato
PyTorch implementation of the Gato paper from DeepMind
☆12 · Feb 8, 2023 · Updated 3 years ago
yamato0811 / streamlit-langgraph-HITL-copy-generator
A human-in-the-loop ad copy generation application built with Streamlit and LangGraph
☆11 · Feb 15, 2025 · Updated last year
s-smits / grpo-optuna
Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna
☆60 · Oct 18, 2025 · Updated 9 months ago
Sea-Snell / JAXSeq
Train very large language models in JAX.
☆208 · Oct 21, 2023 · Updated 2 years ago
divyamakkar0 / JAXformer
A zero-to-one guide on scaling modern transformers with n-dimensional parallelism.
☆127 · Dec 29, 2025 · Updated 6 months ago
Emericen / tiny-qwen
A minimal PyTorch re-implementation of Qwen 3.5
☆430 · Jun 15, 2026 · Updated last month
DebeshJha / DoubleUNet
PyTorch implementation of DoubleUNet for medical image segmentation
☆15 · Apr 6, 2026 · Updated 3 months ago
fL0n9 / SKFAC-MindSpore
SKFAC Preconditioner for MindSpore
☆12 · Jul 2, 2021 · Updated 5 years ago