alasdairforsythe / tokenmonsterLinks

Ungreedy subword tokenizer and vocabulary trainer for Python, Go & Javascript

☆606

Alternatives and similar repositories for tokenmonster

Users that are interested in tokenmonster are comparing it to the libraries listed below

Sorting:

pbelcak / UltraFastBERT
The repository for the code of the UltraFastBERT paper
☆520Updated last year
rmihaylov / falcontune
Tune any FALCON in 4-bit
☆465Updated 2 years ago
SkunkworksAI / hydra-moe
☆415Updated 2 years ago
sabetAI / BLoRA
batched loras
☆347Updated 2 years ago
PiotrNawrot / nanoT5
Fast & Simple repository for pre-training and fine-tuning T5-style models
☆1,014Updated last year
jondurbin / bagel
A bagel, with everything.
☆325Updated last year
abacaj / fine-tune-mistral
Fine-tune mistral-7B on 3090s, a100s, h100s
☆717Updated 2 years ago
tysam-code / hlb-gpt
Minimalistic, extremely fast, and hackable researcher's toolbench for GPT models in 307 lines of code. Reaches <3.8 validation loss on wi…
☆352Updated last year
tomaarsen / attention_sinks
Extend existing LLMs way beyond the original training length with constant memory usage, without retraining
☆732Updated last year
epfml / landmark-attention
Landmark Attention: Random-Access Infinite Context Length for Transformers
☆427Updated last year
HazyResearch / H3
Language Modeling with the H3 State Space Model
☆519Updated 2 years ago
jondurbin / airoboros
Customizable implementation of the self-instruct paper.
☆1,050Updated last year
persimmon-ai-labs / adept-inference
Inference code for Persimmon-8B
☆412Updated 2 years ago
johnsmith0031 / alpaca_lora_4bit
☆535Updated 2 years ago
kuleshov-group / llmtools
Finetuning Large Language Models on One Consumer GPU in 2 Bits
☆733Updated last year
HazyResearch / m2
Repo for "Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture"
☆561Updated 11 months ago
FastEval / FastEval
Fast & more realistic evaluation of chat language models. Includes leaderboard.
☆189Updated last year
zphang / minimal-llama
☆457Updated 2 years ago
abertsch72 / unlimiformer
Public repo for the NeurIPS 2023 paper "Unlimiformer: Long-Range Transformers with Unlimited Length Input"
☆1,063Updated last year
NolanoOrg / cformers
SoTA Transformers with C-backend for fast inference on your CPU.
☆309Updated last year
sanjeevanahilan / nanoChatGPT
A crude RLHF layer on top of nanoGPT with Gumbel-Softmax trick
☆293Updated 2 years ago
VikParuchuri / textbook_quality
Generate textbook-quality synthetic LLM pretraining data
☆507Updated 2 years ago
explosion / curated-transformers
🤖 A PyTorch library of curated Transformer models and their composable components
☆893Updated last year
NouamaneTazi / bloomz.cpp
C++ implementation for BLOOM
☆807Updated 2 years ago
QuixiAI / laserRMT
This is our own implementation of 'Layer Selective Rank Reduction'
☆240Updated last year
apoorvumang / prompt-lookup-decoding
☆580Updated last year
imoneoi / multipack
Multipack distributed sampler for fast padding-free training of LLMs
☆202Updated last year
mlfoundations / open_lm
A repository for research on medium sized language models.
☆520Updated 5 months ago
salesforce / xgen
Salesforce open-source LLMs with 8k sequence length.
☆722Updated 10 months ago
abacaj / train-with-fsdp
☆94Updated 2 years ago