nivibilla / build-nanogpt
Video+code lecture on building nanoGPT from scratch
☆66 Updated 10 months ago
Alternatives and similar repositories for build-nanogpt:
Users who are interested in build-nanogpt are comparing it to the repositories listed below.
- ☆112 Updated 4 months ago
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆98 Updated last month
- The simplest, fastest repository for training/finetuning medium-sized xLSTMs. ☆42 Updated 11 months ago
- Cerule - A Tiny Mighty Vision Model ☆67 Updated 8 months ago
- ☆129 Updated 8 months ago
- Collection of autoregressive model implementation ☆85 Updated last week
- Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free ☆231 Updated 6 months ago
- Scripts to create your own moe models using mlx ☆89 Updated last year
- Fine-tunes a student LLM using teacher feedback for improved reasoning and answer quality. Implements GRPO with teacher-provided evaluati… ☆41 Updated 2 months ago
- an implementation of Self-Extend, to expand the context window via grouped attention ☆119 Updated last year
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens ☆139 Updated 2 months ago
- Easy to use, High Performant Knowledge Distillation for LLMs ☆65 Updated last week
- Simple GRPO scripts and configurations. ☆58 Updated 3 months ago
- All the world is a play, we are but actors in it. ☆49 Updated this week
- ☆66 Updated 11 months ago
- Full finetuning of large language models without large memory requirements ☆94 Updated last year
- Lego for GRPO ☆27 Updated last month
- Set of scripts to finetune LLMs ☆37 Updated last year
- ☆28 Updated last year
- A simple, hackable text-to-speech system in PyTorch and MLX ☆154 Updated 2 months ago
- A simple MLX implementation for pretraining LLMs on Apple Silicon. ☆73 Updated this week
- AnyModal is a Flexible Multimodal Language Model Framework for PyTorch ☆93 Updated 4 months ago
- Fast approximate inference on a single GPU with sparsity aware offloading ☆38 Updated last year
- look how they massacred my boy ☆63 Updated 6 months ago
- MLX port for xjdr's entropix sampler (mimics jax implementation) ☆64 Updated 6 months ago
- Tokun to can tokens ☆17 Updated this week
- A pipeline parallel training script for LLMs. ☆139 Updated this week
- Maybe the new state of the art vision model? we'll see 🤷‍♂️ ☆163 Updated last year
- a simplified version of Google's Gemma model to be used for learning ☆24 Updated last year
- entropix style sampling + GUI ☆26 Updated 6 months ago