nivibilla / build-nanogpt

Video+code lecture on building nanoGPT from scratch

☆68

Alternatives and similar repositories for build-nanogpt

Users who are interested in build-nanogpt are comparing it to the repositories listed below.

teknium1 / ShareGPT-Builder
☆117 Updated 11 months ago
mzbac / mlx-moe
Scripts to create your own moe models using mlx
☆90 Updated last year
QuixiAI / grokadamw
☆136 Updated last year
jadechip / nanoXLSTM
The simplest, fastest repository for training/finetuning medium-sized xLSTMs.
☆41 Updated last year
tensoic / Cerule
Cerule - A Tiny Mighty Vision Model
☆68 Updated 3 weeks ago
ritabratamaiti / AnyModal
AnyModal is a Flexible Multimodal Language Model Framework for PyTorch
☆103 Updated 11 months ago
Mihaiii / backtrack_sampler
An easy-to-understand framework for LLM samplers that rewind and revise generated tokens
☆146 Updated 9 months ago
AlexBodner / How_Much_VRAM
☆101 Updated last year
EduardTalianu / EntropixLab
entropix style sampling + GUI
☆27 Updated last year
bdambrosio / AllTheWorldAPlay
All the world is a play, we are but actors in it.
☆50 Updated 4 months ago
QuixiAI / kraken
☆67 Updated last year
JoeLi12345 / nGPT
an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere)
☆108 Updated 8 months ago
sdan / selfextend
an implementation of Self-Extend, to expand the context window via grouped attention
☆119 Updated last year
Vaibhavs10 / notebooks
☆127 Updated 8 months ago
NousResearch / Obsidian
Maybe the new state of the art vision model? we'll see 🤷‍♂️
☆167 Updated last year
adithya-s-k / YoloGemma
Testing and evaluating the capabilities of Vision-Language models (PaliGemma) in performing computer vision tasks such as object detectio…
☆84 Updated last year
tdrussell / qlora-pipe
A pipeline parallel training script for LLMs.
☆163 Updated 7 months ago
impel-intelligence / dippy-bittensor-subnet
☆55 Updated 2 months ago
joey00072 / ohara
Collection of autoregressive model implementation
☆86 Updated 7 months ago
tiiuae / onebitllms
Lightweight toolkit package to train and fine-tune 1.58bit Language models
☆99 Updated 6 months ago
agokrani / distillKitPlus
Easy to use, High Performant Knowledge Distillation for LLMs
☆96 Updated 6 months ago
vikhyat / mixtral-inference
inference code for mixtral-8x7b-32kseqlen
☆103 Updated last year
fairydreaming / farel-bench
Testing LLM reasoning abilities with family relationship quizzes.
☆63 Updated 10 months ago
thomasgauthier / LoRD
Low-Rank adapter extraction for fine-tuned transformers models
☆179 Updated last year
Locutusque / TPU-Alignment
Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free
☆232 Updated last year
julien-blanchon / arxflix
Arxflix turns your boring Arxiv research paper into a captivating video.
☆55 Updated 2 months ago
N8python / mlx-pretrain
A simple MLX implementation for pretraining LLMs on Apple Silicon.
☆84 Updated 3 months ago
sshh12 / multi_token
Embed arbitrary modalities (images, audio, documents, etc) into large language models.
☆187 Updated last year
austinsilveria / tricksy
Fast approximate inference on a single GPU with sparsity aware offloading
☆39 Updated last year
thooton / muse
Let's create synthetic textbooks together :)
☆75 Updated last year