kuprel / minbpe-pytorch
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization, with PyTorch/CUDA
☆37Updated last year
Alternatives and similar repositories for minbpe-pytorch
Users who are interested in minbpe-pytorch are comparing it to the libraries listed below
- Inference Llama 2 in one file of zero-dependency, zero-unsafe Rust☆38Updated last year
- ANE accelerated embedding models!☆17Updated 5 months ago
- ☆35Updated 2 years ago
- GGML implementation of BERT model with Python bindings and quantization.☆55Updated last year
- NLP with Rust for Python 🦀🐍☆62Updated 3 weeks ago
- A library for incremental loading of large PyTorch checkpoints☆56Updated 2 years ago
- Lightweight tools for quick and easy LLM demos☆27Updated 8 months ago
- A fork of llama3.c used to do some R&D on inferencing☆22Updated 5 months ago
- Latent Large Language Models☆18Updated 9 months ago
- an implementation of Self-Extend, to expand the context window via grouped attention☆119Updated last year
- utilities for loading and running text embeddings with onnx☆44Updated 10 months ago
- Fast and versatile tokenizer for language models, compatible with SentencePiece, Tokenizers, Tiktoken and more. Supports BPE, Unigram and…☆25Updated 2 months ago
- Training hybrid models for dummies.☆21Updated 4 months ago
- This repository has code for fine-tuning LLMs with GRPO specifically for Rust Programming using cargo as feedback☆94Updated 2 months ago
- 👷 Build compute kernels☆44Updated this week
- ☆38Updated 10 months ago
- [WIP] Transformer to embed Danbooru labelsets☆13Updated last year
- Preprint: Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning☆28Updated last year
- GPU accelerated client-side embeddings for vector search, RAG etc.☆66Updated last year
- High-performance MinHash implementation in Rust with Python bindings for efficient similarity estimation and deduplication of large datas…☆147Updated this week
- Data preparation code for CrystalCoder 7B LLM☆44Updated last year
- inference code for mixtral-8x7b-32kseqlen☆100Updated last year
- Trying to deconstruct RWKV in understandable terms☆14Updated 2 years ago
- iterate quickly with llama.cpp hot reloading. use the llama.cpp bindings with bun.sh☆47Updated last year
- Experiments for efforts to train a new and improved t5☆77Updated last year
- ☆54Updated last year
- ☆34Updated 11 months ago
- Make triton easier☆47Updated 11 months ago
- Standalone commandline CLI tool for compiling Triton kernels☆18Updated 8 months ago
- look how they massacred my boy☆63Updated 7 months ago