Following Karpathy with GPT-2 implementation and training, writing lots of comments cause I have memory of a goldfish
☆171 · Jul 31, 2024 · Updated last year
Alternatives and similar repositories for GPT-2
Users that are interested in GPT-2 are comparing it to the libraries listed below.
- Rust Implementation of micrograd ☆52 · Jul 3, 2024 · Updated last year
- From the Tensor to Stable Diffusion, a rough outline for a 10 week course. ☆1,082 · Apr 5, 2026 · Updated 2 months ago
- High Quality Resources on GPU Programming/Architecture ☆592 · Jul 26, 2024 · Updated last year
- This repository contains code and brief explanations of my attempts at the ARC-AGI (2024) challenges :) ☆26 · Nov 11, 2024 · Updated last year
- Neural Networks from scratch in Go. ☆21 · Jul 7, 2024 · Updated last year
- Tensor library with autograd using only Rust's standard library ☆73 · Jul 1, 2024 · Updated last year
- in this repository, i'm going to implement increasingly complex llm inference optimizations ☆85 · May 22, 2025 · Updated last year
- Simple Transformer in Jax ☆144 · Jun 22, 2024 · Updated last year
- Building LLM apps with Text Tensors using PyTorch concepts and text gradients ☆38 · Sep 4, 2024 · Updated last year
- Retrieve the source code for any model made available on replicate.com! ☆36 · Jan 22, 2024 · Updated 2 years ago
- Ultra low overhead NVIDIA GPU telemetry plugin for telegraf with memory temperature readings. ☆63 · Jul 8, 2024 · Updated last year
- learningggggggg 🐳 ☆618 · Apr 2, 2025 · Updated last year
- MLX port for xjdr's entropix sampler (mimics jax implementation) ☆62 · Nov 4, 2024 · Updated last year
- A browser extension that demos Gemini Nano via window.ai and Cartesia TTS ⚡️ ☆38 · Jul 10, 2024 · Updated last year
- My code/notebooks following Karpathy's legendary deep learning course: https://www.youtube.com/@AndrejKarpathy ☆23 · Jul 6, 2024 · Updated last year
- Best-of-N LLM editing with auto version control (+ other unix tools) ☆39 · Apr 22, 2025 · Updated last year
- This repo has all the basic things you'll need in order to understand complete vision transformer architecture and its various implementa… ☆228 · Jan 2, 2025 · Updated last year
- ☆11 · May 18, 2025 · Updated last year
- Implementation of "Matryoshka-Adaptor: Unsupervised and Supervised Tuning for Smaller Embedding Dimensions" ☆24 · Aug 27, 2024 · Updated last year
- ☆24 · Dec 26, 2023 · Updated 2 years ago
- Assignments of courses taught at IISC as part of MTech AI curriculum ☆143 · Feb 15, 2025 · Updated last year
- Recreating gpt-2 from scratch ☆26 · Jul 6, 2024 · Updated last year
- ☆27 · Jul 9, 2024 · Updated last year
- Supporting code for "LLMs for your iPhone: Whole-Tensor 4 Bit Quantization" ☆11 · Mar 31, 2024 · Updated 2 years ago
- NSA Triton Kernels written with GPT5 and Opus 4.1 ☆70 · Aug 12, 2025 · Updated 9 months ago
- An HTTP server written from scratch in C. ☆58 · Jun 22, 2024 · Updated last year
- gpt-2 from scratch in mlx ☆429 · Jun 12, 2024 · Updated last year
- Training framework with a goal to explore the frontier of sample efficiency of small language models ☆100 · Jan 25, 2026 · Updated 4 months ago
- LLM training in simple, raw C/CUDA ☆15 · Dec 5, 2024 · Updated last year
- Tokenization across languages. Useful as preprocessing for subword tokenization. ☆20 · Feb 7, 2023 · Updated 3 years ago
- Entropy Based Sampling and Parallel CoT Decoding ☆3,435 · Nov 13, 2024 · Updated last year
- Just a bunch of benchmark logs for different LLMs ☆127 · Jul 28, 2024 · Updated last year
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆111 · Mar 7, 2025 · Updated last year
- pipeline to auto (scrape => clean => analyze => chat with) tons of data ☆41 · Jun 5, 2024 · Updated 2 years ago
- A deep-dive on the entire history of deep-learning ☆1,554 · Jul 16, 2024 · Updated last year
- llama3 implementation one matrix multiplication at a time ☆15,231 · May 23, 2024 · Updated 2 years ago
- Learnings and programs related to CUDA ☆437 · Jun 29, 2025 · Updated 11 months ago
- smolLM with Entropix sampler on pytorch ☆149 · Oct 31, 2024 · Updated last year
- From the Transistor to the Web Browser, a rough outline for a 12 week course ☆6,515 · Oct 12, 2021 · Updated 4 years ago