Following Karpathy's GPT-2 implementation and training, writing lots of comments because I have the memory of a goldfish
☆171 · Jul 31, 2024 · Updated last year
Alternatives and similar repositories for GPT-2
Users that are interested in GPT-2 are comparing it to the libraries listed below.
- ☆15 · Jan 26, 2025 · Updated last year
- Rust Implementation of micrograd ☆52 · Jul 3, 2024 · Updated last year
- From the Tensor to Stable Diffusion, a rough outline for a 10 week course. ☆1,075 · Apr 5, 2026 · Updated 2 weeks ago
- High Quality Resources on GPU Programming/Architecture ☆592 · Jul 26, 2024 · Updated last year
- In this repository I have a code and brief explanations of the attempts that I made at the ARC-AGI (2024) challenges :) ☆26 · Nov 11, 2024 · Updated last year
- i will automate factorio ☆113 · Jul 31, 2024 · Updated last year
- Simple Transformer in Jax ☆143 · Jun 22, 2024 · Updated last year
- Retrieve the source code for any model made available on replicate.com! ☆36 · Jan 22, 2024 · Updated 2 years ago
- Ultra low overhead NVIDIA GPU telemetry plugin for telegraf with memory temperature readings. ☆63 · Jul 8, 2024 · Updated last year
- learningggggggg 🐳 ☆616 · Apr 2, 2025 · Updated last year
- MLX port for xjdr's entropix sampler (mimics jax implementation) ☆62 · Nov 4, 2024 · Updated last year
- A browser extension that demos Gemini Nano via window.ai and Cartesia TTS ⚡️ ☆38 · Jul 10, 2024 · Updated last year
- My code/notebook's following Karpathy's legendary deep learning course: https://www.youtube.com/@AndrejKarpathy ☆22 · Jul 6, 2024 · Updated last year
- ☆16 · Feb 18, 2024 · Updated 2 years ago
- Best-of-N LLM editing with auto version control (+ other unix tools) ☆39 · Apr 22, 2025 · Updated 11 months ago
- This repo has all the basic things you'll need in-order to understand complete vision transformer architecture and its various implementa… ☆228 · Jan 2, 2025 · Updated last year
- Implementation of "Matryoshka-Adaptor: Unsupervised and Supervised Tuning for Smaller Embedding Dimensions" ☆24 · Aug 27, 2024 · Updated last year
- Assignments of courses taught at IISC as part of MTech AI curriculum ☆141 · Feb 15, 2025 · Updated last year
- ☆24 · Dec 26, 2023 · Updated 2 years ago
- Recreating gpt-2 from scratch ☆26 · Jul 6, 2024 · Updated last year
- Jax like function transformation engine but micro, microjax ☆34 · Oct 25, 2024 · Updated last year
- ☆27 · Jul 9, 2024 · Updated last year
- Supporting code for "LLMs for your iPhone: Whole-Tensor 4 Bit Quantization" ☆11 · Mar 31, 2024 · Updated 2 years ago
- gpt-2 from scratch in mlx ☆423 · Jun 12, 2024 · Updated last year
- Training framework with a goal to explore the frontier of sample efficiency of small language models ☆99 · Jan 25, 2026 · Updated 2 months ago
- LLM training in simple, raw C/CUDA ☆15 · Dec 5, 2024 · Updated last year
- Julia workshop for undergrad physicists ☆22 · Mar 18, 2021 · Updated 5 years ago
- Entropy Based Sampling and Parallel CoT Decoding ☆3,431 · Nov 13, 2024 · Updated last year
- Tokenization across languages. Useful as preprocessing for subword tokenization. ☆21 · Feb 7, 2023 · Updated 3 years ago
- Just a bunch of benchmark logs for different LLMs ☆121 · Jul 28, 2024 · Updated last year
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆111 · Mar 7, 2025 · Updated last year
- pipeline to auto (scrape => clean => analyze => chat with) tons of data ☆41 · Jun 5, 2024 · Updated last year
- A deep-dive on the entire history of deep-learning ☆1,547 · Jul 16, 2024 · Updated last year
- ☆20 · Oct 25, 2025 · Updated 5 months ago
- Source code for the Joint Shapley values: a measure of joint feature importance ☆12 · Sep 14, 2021 · Updated 4 years ago
- llama3 implementation one matrix multiplication at a time ☆15,241 · May 23, 2024 · Updated last year
- Learnings and programs related to CUDA ☆437 · Jun 29, 2025 · Updated 9 months ago
- Minimal (truly) muP implementation, consistent with TP4 and TP5 papers notation ☆14 · Jan 2, 2026 · Updated 3 months ago
- smolLM with Entropix sampler on pytorch ☆149 · Oct 31, 2024 · Updated last year