gautierdag / bpeasy

Fast bare-bones BPE for modern tokenizer training

☆170

Alternatives and similar repositories for bpeasy

Users who are interested in bpeasy are comparing it to the libraries listed below.

tysam-code / hlb-gpt
Minimalistic, extremely fast, and hackable researcher's toolbench for GPT models in 307 lines of code. Reaches <3.8 validation loss on wi…
☆352 Updated last year
HazyResearch / zoology
Understand and test language model architectures on synthetic tasks.
☆240 Updated 2 months ago
erfanzar / EasyDeL
Accelerate, Optimize performance with streamlined training and serving options with JAX.
☆323 Updated this week
srush / GPTWorld
A puzzle to learn about prompting
☆135 Updated 2 years ago
llm-efficiency-challenge / neurips_llm_efficiency_challenge
NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day
☆257 Updated 2 years ago
EleutherAI / nanoGPT-mup
The simplest, fastest repository for training/finetuning medium-sized GPTs.
☆173 Updated 5 months ago
apple / ml-sigma-reparam
☆311 Updated last year
allenai / fm-cheatsheet
Website for hosting the Open Foundation Models Cheat Sheet.
☆269 Updated 6 months ago
Locutusque / TPU-Alignment
Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free
☆232 Updated last year
jxmorris12 / cde
code for training & evaluating Contextual Document Embedding models
☆200 Updated 6 months ago
mcleish7 / arithmetic
Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al. (NeurIPS 2024)
☆194 Updated last year
huggingface / llm-swarm
Manage scalable open LLM inference endpoints in Slurm clusters
☆277 Updated last year
srush / annotated-mamba
Annotated version of the Mamba paper
☆491 Updated last year
mlfoundations / open_lm
A repository for research on medium sized language models.
☆520 Updated 5 months ago
NVIDIA / ngpt
Normalized Transformer (nGPT)
☆194 Updated last year
JoeLi12345 / nGPT
an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere)
☆108 Updated 8 months ago
ayaka14732 / llama-2-jax
JAX implementation of the Llama 2 model
☆216 Updated last year
normster / llm_rules
RuLES: a benchmark for evaluating rule-following in language models
☆240 Updated 9 months ago
xjdr-alt / simple_transformer
Simple Transformer in Jax
☆139 Updated last year
changjonathanc / flex-nano-vllm
FlexAttention based, minimal vllm-style inference engine for fast Gemma 2 inference.
☆305 Updated 3 weeks ago
cloneofsimo / min-max-gpt
Minimal (400 LOC) implementation Maximum (multi-node, FSDP) GPT training
☆132 Updated last year
epfml / llm-baselines
nanoGPT-like codebase for LLM training
☆110 Updated 2 weeks ago
athms / mad-lab
A MAD laboratory to improve AI architecture designs 🧪
☆133 Updated 11 months ago
justinchiu / openlogprobs
Extract full next-token probabilities via language model APIs
☆248 Updated last year
EleutherAI / cookbook
Deep learning for dummies. All the practical details and useful utilities that go into working with real models.
☆826 Updated 3 months ago
HazyResearch / based
Code for exploring Based models from "Simple linear attention language models balance the recall-throughput tradeoff"
☆243 Updated 5 months ago
xrsrke / pipegoose
Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)*
☆87 Updated last year
google-deepmind / nanodo
☆285 Updated last year
cloneofsimo / min-fsdp
☆91 Updated last year
lucidrains / llama-qrlhf
Implementation of the Llama architecture with RLHF + Q-learning
☆168 Updated 9 months ago