VatsaDev/NanoPoor

NanoGPT-speedrunning for the poor T4 enjoyers

☆72

Alternatives and similar repositories for NanoPoor

Users who are interested in NanoPoor are comparing it to the repositories listed below.

Birch-san / booru-embed
[WIP] Transformer to embed Danbooru labelsets
☆13 · Mar 31, 2024 · Updated 2 years ago
joey00072 / microjax
JAX-like function transformation engine, but micro: microjax
☆34 · Oct 25, 2024 · Updated last year
gau-nernst / kokoro
https://hf.co/hexgrad/Kokoro-82M
☆14 · Jan 14, 2026 · Updated 6 months ago
kmohan321 / Research_Papers
☆45 · Mar 31, 2025 · Updated last year
joey00072 / ohara
Collection of autoregressive model implementations
☆84 · Jun 10, 2026 · Updated last month
ethansmith2000 / TransformerExperiments
☆19 · Dec 4, 2025 · Updated 7 months ago
ChinmayK0607 / heiretsu
Educational WIP
☆73 · Feb 16, 2026 · Updated 5 months ago
fal-ai-community / nano-mdm
Tiny re-implementation of MDM in the style of LLaDA and the nano-gpt speedrun
☆57 · Mar 10, 2025 · Updated last year
JoeLi12345 / nGPT
An open-source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere)
☆112 · Mar 7, 2025 · Updated last year
Chillee / lit-llama
Simple (fast) transformer inference in PyTorch with torch.compile + lit-llama code
☆10 · Aug 29, 2023 · Updated 2 years ago
ezyang / cute-interactive
Interactive version of the CuTe layout paper
☆57 · Apr 14, 2026 · Updated 3 months ago
epfml / DenseFormer
☆83 · Apr 16, 2024 · Updated 2 years ago
kyleliang919 / Super_Muon
☆68 · Mar 21, 2025 · Updated last year
MaximeRivest / funnydspy
Vanilla-Python ergonomics on top of DSPy
☆40 · Jun 3, 2025 · Updated last year
edwardmilsom / function-space-learning-rates-paper
Code for the paper "Function-Space Learning Rates"
☆23 · Jun 3, 2025 · Updated last year
joey00072 / Multi-Head-Latent-Attention-MLA-
Working implementation of DeepSeek MLA
☆44 · Jan 8, 2025 · Updated last year
minosvasilias / simple_grpo
Simple GRPO scripts and configurations.
☆59 · Feb 6, 2025 · Updated last year
Noumena-Network / nmoe
MoE training for Me and You and maybe other people
☆394 · Mar 15, 2026 · Updated 4 months ago
tensor-fusion / microhaskell
Small autodiff lib and a simple working feedforward neural net in Haskell on top of it, from scratch, zero-deps.
☆16 · Jun 21, 2024 · Updated 2 years ago
Laz4rz / RL
☆15 · Jan 26, 2025 · Updated last year
cloneofsimo / zeroshampoo
☆33 · Sep 10, 2024 · Updated last year
druidowm / OccamLLM
☆14 · Oct 21, 2024 · Updated last year
okarthikb / state-space-models
☆27 · Jul 9, 2024 · Updated 2 years ago
valine / training-hot-swap
PyTorch script hot swap: change code without unloading your LLM from VRAM
☆126 · Apr 21, 2025 · Updated last year
proger / nanokitchen
Parallel Associative Scan for Language Models
☆18 · Jan 8, 2024 · Updated 2 years ago
sdan / nanoEBM
Minimal energy-based transformer
☆44 · Dec 11, 2025 · Updated 7 months ago
saurabhaloneai / gpu-programming
☆15 · Jan 22, 2026 · Updated 5 months ago
JoshuaPurtell / LRCBench
Evals meant to evaluate language models' ability to reason over long contexts.
☆10 · Sep 12, 2024 · Updated last year
stockeh / mlx-grokking
Grokking on modular arithmetic in less than 150 epochs in MLX
☆15 · Oct 24, 2024 · Updated last year
noah-hein / mazeGPT
AI model for making mazes that extends OpenAI's GPT-2 model
☆15 · Dec 21, 2023 · Updated 2 years ago
xjdr-alt / simple_transformer
Simple Transformer in JAX
☆143 · Jun 22, 2024 · Updated 2 years ago
BlinkDL / SmallInitEmb
LayerNorm(SmallInit(Embedding)) in a Transformer to improve convergence
☆61 · Feb 21, 2022 · Updated 4 years ago
google-deepmind / nanodo
☆304 · Jul 15, 2024 · Updated 2 years ago
google / drjax
☆19 · Jul 8, 2026 · Updated last week
rejunity / tiny-asic-4bit-matrix-mul
Tiny matrix multiplication ASIC with 4-bit math
☆12 · Apr 19, 2024 · Updated 2 years ago
fal-ai-community / NativeSparseAttention
Research implementation of Native Sparse Attention (arXiv 2502.11089)
☆62 · Feb 19, 2025 · Updated last year
shawntan / stickbreaking-attention
Stick-breaking attention
☆63 · Jul 1, 2025 · Updated last year
zaydzuhri / softpick-attention
Landing repository for the paper "Softpick: No Attention Sink, No Massive Activations with Rectified Softmax"
☆91 · Sep 12, 2025 · Updated 10 months ago
MrYxJ / InfiniRetri
☆52 · Feb 17, 2025 · Updated last year