ash-01xor/bpe.c


ash-01xor / bpe.c

A simple byte pair encoding (BPE) mechanism for tokenization, written purely in C.

☆147

Alternatives and similar repositories for bpe.c

Users who are interested in bpe.c are comparing it to the libraries listed below.


clu0 / unet.cu
UNet diffusion model in pure CUDA
☆656 • Jun 28, 2024 • Updated last year
gautierdag / bpeasy
Fast bare-bones BPE for modern tokenizer training
☆176 • Jun 23, 2025 • Updated 9 months ago
kvfrans / jax-diffusion-transformer
Implementation of Diffusion Transformer (DiT) in JAX
☆308 • Jun 11, 2024 • Updated last year
storborg / glass-teardown
Teardown of Google Glass
☆39 • Jan 11, 2014 • Updated 12 years ago
normster / llm_rules
RuLES: a benchmark for evaluating rule-following in language models
☆249 • Feb 24, 2025 • Updated last year
bbabenko / simple_convnet
A basic implementation of convolutional neural nets
☆59 • Apr 20, 2014 • Updated 11 years ago
BobMcDear / attorch
A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.
☆598 • Aug 12, 2025 • Updated 7 months ago
Quentin-Anthony / nanoMPI
Simple MPI implementation for prototyping or learning
☆305 • Aug 6, 2025 • Updated 7 months ago
Ramblurr / scriptbots
ScriptBots is an Open Source Evolutionary Artificial Life Simulation of Predator-Prey dynamics, written by Andrej Karpathy.
☆63 • Feb 18, 2011 • Updated 15 years ago
AnswerDotAI / gpu.cpp
A lightweight library for portable low-level GPU computation using WebGPU.
☆3,954 • Oct 8, 2025 • Updated 5 months ago
mcinglis / c-style
My favorite C programming practices.
☆2,151 • Jan 19, 2026 • Updated 2 months ago
IoannisAntonoglou / optimBench
Benchmark testbed for assessing the performance of optimisation algorithms
☆86 • Jan 7, 2015 • Updated 11 years ago
SonyResearch / micro_diffusion
Official repository for our work on micro-budget training of large-scale diffusion models.
☆1,555 • Jan 12, 2025 • Updated last year
huggingface / picotron
Minimalistic 4D-parallelism distributed training framework for education purpose
☆2,119 • Aug 26, 2025 • Updated 7 months ago
policy-gradient / GRPO-Zero
Implementing DeepSeek R1's GRPO algorithm from scratch
☆1,795 • Apr 18, 2025 • Updated 11 months ago
changjonathanc / flex-nano-vllm
FlexAttention based, minimal vllm-style inference engine for fast Gemma 2 inference.
☆337 • Nov 2, 2025 • Updated 4 months ago
isafulf / inbox_cleaner
A python script to help manage a Gmail inbox by filtering out promotional emails using GPT-3 or GPT-4.
☆458 • Dec 2, 2023 • Updated 2 years ago
HazyResearch / ThunderKittens
Tile primitives for speedy kernels
☆3,244 • Mar 17, 2026 • Updated last week
karpathy / scriptsbots
ScriptBots is an Open Source Evolutionary Artificial Life Simulation of Predator-Prey dynamics, written by Andrej Karpathy.
☆164 • Jan 2, 2012 • Updated 14 years ago
wearscript / wearscript-android
JavaScript with Batteries Included for Google Glass
☆218 • Jul 10, 2016 • Updated 9 years ago
pranavjad / mlx-gpt2
gpt-2 from scratch in mlx
☆418 • Jun 12, 2024 • Updated last year
karpathy / notpygamejs
Game making library for using Canvas element
☆95 • Oct 17, 2023 • Updated 2 years ago
Sohl-Dickstein / Sum-of-Functions-Optimizer
Implements SFO minibatch optimizer in Python and MATLAB, and reproduces figures from paper.
☆134 • May 17, 2021 • Updated 4 years ago
spacewalk01 / nanosam-cpp
C++ TensorRT Implementation of NanoSAM
☆51 • Dec 28, 2023 • Updated 2 years ago
microsoft / Samba
[ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
☆954 • Nov 16, 2025 • Updated 4 months ago
KellerJordan / modded-nanogpt
NanoGPT (124M) in 2 minutes
☆5,003 • Mar 17, 2026 • Updated last week
facebookresearch / schedule_free
Schedule-Free Optimization in PyTorch
☆2,265 • May 21, 2025 • Updated 10 months ago
ubermenchh / mini-vllm
☆16 • Feb 25, 2026 • Updated last month
googleglass / mirror-quickstart-python
Google Mirror API's Quickstart for Python
☆350 • Jun 13, 2021 • Updated 4 years ago
Jaykef / Triton-nanoGPT
Custom triton kernels for training Karpathy's nanoGPT.
☆19 • Oct 21, 2024 • Updated last year
pytorch / torchchat
Run PyTorch LLMs locally on servers, desktop and mobile
☆3,625 • Sep 10, 2025 • Updated 6 months ago
xinshengwang / robpitch
A pitch detection model trained to be robust against noise and reverberation environments.
☆27 • Jan 21, 2025 • Updated last year
pytorch / torchtitan
A PyTorch native platform for training generative AI models
☆5,162 • Mar 20, 2026 • Updated last week
harthur / hog-descriptor
[UNMAINTAINED] Histogram of Oriented Gradients (HOG) descriptor extractor
☆172 • Mar 1, 2015 • Updated 11 years ago
huggingface / nanoVLM
The simplest, fastest repository for training/finetuning small-sized VLMs.
☆4,738 • Oct 27, 2025 • Updated 5 months ago
ridgerchu / matmulfreellm
Implementation for MatMul-free LM.
☆3,059 • Dec 2, 2025 • Updated 3 months ago
primepake / learnable-speech
This repo is text to speech with learnable audio encoder without alignment with transcript reference
☆54 • Sep 20, 2025 • Updated 6 months ago
chipsalliance / rocket
The working draft to split rocket core out from rocket chip
☆14 • Dec 22, 2023 • Updated 2 years ago
Jiayi-Pan / TinyZero
Minimal reproduction of DeepSeek R1-Zero
☆12,963 • Feb 27, 2026 • Updated last month