kuprel / minbpe-pytorch
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization, with PyTorch/CUDA
☆35 Updated last year
Alternatives and similar repositories for minbpe-pytorch:
Users who are interested in minbpe-pytorch are comparing it to the libraries listed below.
- [WIP] Transformer to embed Danbooru labelsets ☆13 Updated 11 months ago
- ANE accelerated embedding models! ☆17 Updated 2 months ago
- ☆44 Updated 7 months ago
- Efficiently computing & storing token n-grams from large corpora ☆18 Updated 4 months ago
- Fast approximate inference on a single GPU with sparsity aware offloading ☆38 Updated last year
- ☆37 Updated 2 years ago
- Visualize expert firing frequencies across sentences in the Mixtral MoE model ☆17 Updated last year
- A fork of llama3.c used to do some R&D on inferencing ☆19 Updated 2 months ago
- GGML implementation of BERT model with Python bindings and quantization. ☆54 Updated last year
- Because it's there. ☆15 Updated 5 months ago
- A library for incremental loading of large PyTorch checkpoints ☆56 Updated 2 years ago
- Standalone command-line tool for compiling Triton kernels ☆17 Updated 5 months ago
- Inference Llama 2 in one file of zero-dependency, zero-unsafe Rust ☆37 Updated last year
- RWKV-7: Surpassing GPT ☆79 Updated 3 months ago
- Trying to deconstruct RWKV in understandable terms ☆14 Updated last year
- new optimizer ☆19 Updated 6 months ago
- ☆34 Updated last year
- ☆49 Updated 11 months ago
- ☆22 Updated 8 months ago
- Make triton easier ☆45 Updated 8 months ago
- ☆52 Updated 10 months ago
- Training hybrid models for dummies. ☆20 Updated last month
- ☆16 Updated 11 months ago
- ☆38 Updated 7 months ago
- utilities for loading and running text embeddings with onnx ☆44 Updated 6 months ago
- Code repository for the paper "MrT5: Dynamic Token Merging for Efficient Byte-level Language Models." ☆34 Updated 2 weeks ago
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts ☆24 Updated 11 months ago
- A library for simplifying fine-tuning with multi-GPU setups in the Hugging Face ecosystem. ☆16 Updated 4 months ago
- Lightweight tools for quick and easy LLM demos ☆26 Updated 5 months ago
- Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters ☆117 Updated 3 months ago