Following Karpathy's GPT-2 implementation and training, writing lots of comments because I have the memory of a goldfish
☆172Jul 31, 2024Updated last year
Alternatives and similar repositories for GPT-2
Users who are interested in GPT-2 are comparing it to the libraries listed below.
- ☆15Jan 26, 2025Updated last year
- From the Tensor to Stable Diffusion, a rough outline for a 10 week course.☆1,082Apr 5, 2026Updated 2 months ago
- High Quality Resources on GPU Programming/Architecture☆593Jul 26, 2024Updated last year
- could we make an ml stack in 100,000 lines of code?☆46Jul 17, 2024Updated last year
- Neural Networks from scratch in Go.☆21Jul 7, 2024Updated last year
- Simple Transformer in Jax☆144Jun 22, 2024Updated 2 years ago
- Genome analysis toolkit☆12Apr 23, 2025Updated last year
- Retrieve the source code for any model made available on replicate.com!☆36Jan 22, 2024Updated 2 years ago
- Ultra low overhead NVIDIA GPU telemetry plugin for telegraf with memory temperature readings.☆63Jul 8, 2024Updated last year
- MLX port for xjdr's entropix sampler (mimics jax implementation)☆62Nov 4, 2024Updated last year
- ☆13Aug 10, 2024Updated last year
- A browser extension that demos Gemini Nano via window.ai and Cartesia TTS ⚡️☆38Jul 10, 2024Updated last year
- My code/notebooks following Karpathy's legendary deep learning course: https://www.youtube.com/@AndrejKarpathy☆23Jul 6, 2024Updated last year
- ☆16Feb 18, 2024Updated 2 years ago
- Best-of-N LLM editing with auto version control (+ other unix tools)☆39Apr 22, 2025Updated last year
- This repo has all the basic things you'll need in order to understand complete vision transformer architecture and its various implementa…☆228Jan 2, 2025Updated last year
- ☆11May 18, 2025Updated last year
- Assignments of courses taught at IISC as part of MTech AI curriculum☆144Feb 15, 2025Updated last year
- ☆16Feb 24, 2026Updated 4 months ago
- Recreating gpt-2 from scratch☆26Jul 6, 2024Updated last year
- Jax like function transformation engine but micro, microjax☆34Oct 25, 2024Updated last year
- ☆27Jul 9, 2024Updated last year
- Supporting code for "LLMs for your iPhone: Whole-Tensor 4 Bit Quantization"☆11Mar 31, 2024Updated 2 years ago
- NSA Triton Kernels written with GPT5 and Opus 4.1☆70Aug 12, 2025Updated 10 months ago
- gpt-2 from scratch in mlx☆434Jun 12, 2024Updated 2 years ago
- Tokenization across languages. Useful as preprocessing for subword tokenization.☆19Feb 7, 2023Updated 3 years ago
- Entropy Based Sampling and Parallel CoT Decoding☆3,433Nov 13, 2024Updated last year
- Just a bunch of benchmark logs for different LLMs☆130Jul 28, 2024Updated last year
- pipeline to auto (scrape => clean => analyze => chat with) tons of data☆41Jun 5, 2024Updated 2 years ago
- A deep-dive on the entire history of deep-learning☆1,558Jul 16, 2024Updated last year
- Source code for the Joint Shapley values: a measure of joint feature importance☆12Sep 14, 2021Updated 4 years ago
- Educational WIP☆72Feb 16, 2026Updated 4 months ago
- llama3 implementation one matrix multiplication at a time☆15,225May 23, 2024Updated 2 years ago
- Learnings and programs related to CUDA☆438Jun 29, 2025Updated last year
- A categorised list of Multi-Agent Reinforcement Learning (MARL) papers☆59Jan 20, 2023Updated 3 years ago
- smolLM with Entropix sampler on pytorch☆149Oct 31, 2024Updated last year
- ☆39Feb 27, 2025Updated last year
- NanoGPT (124M) in 90 seconds☆5,438Jun 21, 2026Updated last week
- LLM101n: Let's build a Storyteller☆37,383Aug 1, 2024Updated last year