Antimatter543 / karpathy-NN-lectures
My run-through of Karpathy's lectures (with notes): building NNs from scratch, simple autoregressive language models, GPT models, and the ML techniques learnt along the way.
☆10 Updated last year
Related projects
Alternatives and complementary repositories for karpathy-NN-lectures
- Just large language models. Hackable, with as little abstraction as possible. Done for my own purposes, feel free to rip. ☆44 Updated last year
- ☆27 Updated 4 months ago
- ☆20 Updated 3 months ago
- LLM training in simple, raw C/CUDA ☆17 Updated 6 months ago
- Video+code lecture on building nanoGPT from scratch ☆64 Updated 5 months ago
- Tensor library with autograd using only Rust's standard library ☆62 Updated 4 months ago
- MLX Transformers is a library that provides model implementation in MLX. It uses a similar model interface as HuggingFace Transformers an… ☆52 Updated this week
- A repository of prompts and Python scripts for intelligent transformation of raw text into diverse formats. ☆29 Updated last year
- Build Agentic workflows with function calling ☆20 Updated this week
- This repo is my attempt at a rough implementation of nanoGPT trained on a dataset of 30,000 unique Twitter usernames ☆26 Updated 7 months ago
- A really tiny autograd engine ☆87 Updated 7 months ago
- Simple Transformer in Jax ☆119 Updated 5 months ago
- ☆99 Updated 7 months ago
- ☆57 Updated 11 months ago
- minimal diffusion transformer in pytorch. ☆15 Updated last month
- An example implementation of RLHF (or, more accurately, RLAIF) built on MLX and HuggingFace. ☆21 Updated 5 months ago
- Sparse autoencoders for Contra text embedding models ☆24 Updated 6 months ago
- Comprehensive analysis of the difference in performance of QLoRA, LoRA, and full fine-tunes. ☆81 Updated last year
- KMD is a collection of conversational exchanges between patients and doctors on various medical topics. It aims to capture the intricaci… ☆23 Updated last year
- Collection of autoregressive model implementations ☆67 Updated this week
- An introduction to LLM Sampling ☆64 Updated 2 weeks ago
- ☆29 Updated 5 months ago
- Cerule - A Tiny Mighty Vision Model ☆67 Updated 2 months ago
- inference code for mixtral-8x7b-32kseqlen ☆98 Updated 11 months ago
- An AI character interaction system with emotional modeling and advanced memory management ☆15 Updated 3 weeks ago
- A single notebook for fine-tuning GPT-3.5 turbo ☆31 Updated 3 months ago
- Full finetuning of large language models without large memory requirements ☆93 Updated 10 months ago
- A lightweight evaluation suite tailored specifically for assessing Indic LLMs across a diverse range of tasks ☆33 Updated 5 months ago
- Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)* ☆80 Updated 11 months ago
- Benchmarks comparing PyTorch and MLX on Apple Silicon GPUs ☆57 Updated 4 months ago