Following Karpathy with GPT-2 implementation and training, writing lots of comments because I have the memory of a goldfish
☆171Jul 31, 2024Updated last year
Alternatives and similar repositories for GPT-2
Users who are interested in GPT-2 are comparing it to the libraries listed below.
- ☆16Jan 26, 2025Updated last year
- Rust Implementation of micrograd☆52Jul 3, 2024Updated last year
- High Quality Resources on GPU Programming/Architecture☆593Jul 26, 2024Updated last year
- In this repository I have a code and brief explanations of the attempts that I made at the ARC-AGI (2024) challenges :)☆26Nov 11, 2024Updated last year
- could we make an ml stack in 100,000 lines of code?☆46Jul 17, 2024Updated last year
- i will automate factorio☆113Jul 31, 2024Updated last year
- in this repository, i'm going to implement increasingly complex llm inference optimizations☆84May 22, 2025Updated 10 months ago
- Simple Transformer in Jax☆143Jun 22, 2024Updated last year
- Genome analysis toolkit☆12Apr 23, 2025Updated 11 months ago
- Fast semantic search for biorXiv manuscripts☆12Feb 16, 2025Updated last year
- Ultra low overhead NVIDIA GPU telemetry plugin for telegraf with memory temperature readings.☆63Jul 8, 2024Updated last year
- learningggggggg 🐳☆615Apr 2, 2025Updated 11 months ago
- MLX port for xjdr's entropix sampler (mimics jax implementation)☆62Nov 4, 2024Updated last year
- A browser extension that demos Gemini Nano via window.ai and Cartesia TTS ⚡️☆38Jul 10, 2024Updated last year
- ☆16Feb 18, 2024Updated 2 years ago
- My code/notebooks following Karpathy's legendary deep learning course: https://www.youtube.com/@AndrejKarpathy☆22Jul 6, 2024Updated last year
- Best-of-N LLM editing with auto version control (+ other unix tools)☆39Apr 22, 2025Updated 11 months ago
- This repo has all the basic things you'll need in order to understand the complete vision transformer architecture and its various implementa…☆229Jan 2, 2025Updated last year
- ☆11May 18, 2025Updated 10 months ago
- ☆15Feb 24, 2026Updated last month
- Implementation of "Matryoshka-Adaptor: Unsupervised and Supervised Tuning for Smaller Embedding Dimensions"☆24Aug 27, 2024Updated last year
- Assignments of courses taught at IISC as part of MTech AI curriculum☆141Feb 15, 2025Updated last year
- Jax like function transformation engine but micro, microjax☆34Oct 25, 2024Updated last year
- ☆27Jul 9, 2024Updated last year
- Open Source HTTP Requests Inspection Platform☆14Aug 3, 2024Updated last year
- Supporting code for "LLMs for your iPhone: Whole-Tensor 4 Bit Quantization"☆11Mar 31, 2024Updated last year
- NSA Triton Kernels written with GPT5 and Opus 4.1☆71Aug 12, 2025Updated 7 months ago
- An HTTP server written from scratch in C.☆56Jun 22, 2024Updated last year
- gpt-2 from scratch in mlx☆418Jun 12, 2024Updated last year
- Training framework with a goal to explore the frontier of sample efficiency of small language models☆99Jan 25, 2026Updated 2 months ago
- Entropy Based Sampling and Parallel CoT Decoding☆3,434Nov 13, 2024Updated last year
- Tokenization across languages. Useful as preprocessing for subword tokenization.☆21Feb 7, 2023Updated 3 years ago
- Just a bunch of benchmark logs for different LLMs☆120Jul 28, 2024Updated last year
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere)☆111Mar 7, 2025Updated last year
- Educational WIP☆71Feb 16, 2026Updated last month
- A deep-dive on the entire history of deep-learning☆1,545Jul 16, 2024Updated last year
- pipeline to auto (scrape => clean => analyze => chat with) tons of data☆41Jun 5, 2024Updated last year
- llama3 implementation one matrix multiplication at a time☆15,255May 23, 2024Updated last year
- Learnings and programs related to CUDA☆435Jun 29, 2025Updated 9 months ago