kuprel / minbpe-pytorch
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization, with PyTorch/CUDA
☆41Updated last year
Alternatives and similar repositories for minbpe-pytorch
Users who are interested in minbpe-pytorch are comparing it to the libraries listed below.
- High-performance MinHash implementation in Rust with Python bindings for efficient similarity estimation and deduplication of large datas…☆223Updated 2 weeks ago
- Fast and vectorizable algorithms for searching in a vector of sorted floating point numbers☆153Updated last year
- an implementation of Self-Extend, to expand the context window via grouped attention☆119Updated last year
- [WIP] Transformer to embed Danbooru labelsets☆13Updated last year
- Inference of Mamba models in pure C☆195Updated last year
- ☆135Updated last year
- Simple high-throughput inference library☆153Updated 7 months ago
- Inference Llama 2 in one file of zero-dependency, zero-unsafe Rust☆39Updated 2 years ago
- GPU accelerated client-side embeddings for vector search, RAG etc.☆65Updated 2 years ago
- Python bindings for ggml☆146Updated last year
- inference code for mixtral-8x7b-32kseqlen☆104Updated 2 years ago
- ☆157Updated 2 years ago
- A relatively basic implementation of RWKV in Rust written by someone with very little math and ML knowledge. Supports 32, 8 and 4 bit eva…☆94Updated 2 years ago
- tinygrad port of the RWKV large language model.☆45Updated 9 months ago
- a small code base for training large models☆315Updated 7 months ago
- MiniHF is an inference, human preference data collection, and fine-tuning tool for local language models. It is intended to help the user…☆183Updated last month
- ☆39Updated 3 years ago
- Alice in Wonderland code base for experiments and raw experiments data☆131Updated 3 months ago
- A library for incremental loading of large PyTorch checkpoints☆56Updated 2 years ago
- LLaVA server (llama.cpp).☆183Updated 2 years ago
- First token cutoff sampling inference example☆31Updated last year
- Efficiently computing & storing token n-grams from large corpora☆26Updated last year
- Visualizing the internal board state of a GPT trained on chess PGN strings, and performing interventions on its internal board state and …☆219Updated last year
- ☆19Updated last year
- A high-performance constrained decoding engine based on context free grammar in Rust☆56Updated 7 months ago
- Unofficial python bindings for the rust llm library. 🐍❤️🦀☆76Updated 2 years ago
- Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models".☆279Updated 2 years ago
- GPU-targeted vendor-agnostic AI library for Windows, and Mistral model implementation.☆57Updated last year
- Simple (fast) transformer inference in PyTorch with torch.compile + lit-llama code☆10Updated 2 years ago
- Fast and versatile tokenizer for language models, compatible with SentencePiece, Tokenizers, Tiktoken and more. Supports BPE, Unigram and…☆39Updated 2 months ago