karpathy / rustbpe
The missing tiktoken training code
☆317Updated 3 weeks ago
Alternatives and similar repositories for rustbpe
Users that are interested in rustbpe are comparing it to the libraries listed below
- FlexAttention based, minimal vllm-style inference engine for fast Gemma 2 inference.☆333Updated 2 months ago
- SIMD quantization kernels☆94Updated 4 months ago
- Where GPUs get cooked 👩‍🍳🔥☆357Updated last week
- 👷 Build compute kernels☆214Updated last week
- Load compute kernels from the Hub☆381Updated this week
- Quantized LLM training in pure CUDA/C++.☆233Updated last week
- Simple byte pair encoding mechanism used for the tokenization process, written purely in C☆144Updated last year
- Dion optimizer algorithm☆420Updated 2 weeks ago
- Async RL Training at Scale☆1,020Updated last week
- Simple & Scalable Pretraining for Neural Architecture Research☆307Updated last month
- Simple MPI implementation for prototyping or learning☆299Updated 5 months ago
- MoE training for Me and You and maybe other people☆331Updated 3 weeks ago
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand☆195Updated 7 months ago
- Fast bare-bones BPE for modern tokenizer training☆174Updated 7 months ago
- ☆214Updated last week
- An extension of the nanoGPT repository for training small MOE models.☆231Updated 10 months ago
- A lightweight, local-first, and 🆓 experiment tracking library from Hugging Face 🤗☆1,234Updated last week
- ☆540Updated 5 months ago
- In this repository, I'm going to implement increasingly complex LLM inference optimizations☆81Updated 8 months ago
- ☆952Updated 2 months ago
- A character-level language diffusion model trained on Tiny Shakespeare☆842Updated 2 weeks ago
- Implementation of Diffusion Transformer (DiT) in JAX☆305Updated last year
- Open-source release accompanying Gao et al. 2025☆498Updated last month
- Learn GPU Programming in Mojo🔥 by Solving Puzzles☆279Updated last week
- ☆229Updated 2 months ago
- (WIP) A small but powerful, homemade PyTorch from scratch.☆672Updated this week
- Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse …☆843Updated last week
- PyTorch building blocks for the OLMo ecosystem☆741Updated this week
- CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning☆285Updated 2 months ago
- noise_step: Training in 1.58b With No Gradient Memory☆220Updated last year