alasdairforsythe / tokenmonster
Ungreedy subword tokenizer and vocabulary trainer for Python, Go & Javascript
☆574 · Updated 8 months ago
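TokenMonster ships Python bindings alongside the Go and JavaScript implementations. Below is a minimal usage sketch based on the project's documented Python API (`load`, `tokenize`, `decode`); the vocabulary name is one of the prebuilt vocabularies referenced in the README, and the exact names should be checked against the current release.

```python
# Minimal sketch: load a prebuilt TokenMonster vocabulary and round-trip a string.
# "english-32000-balanced-v1" is a prebuilt vocabulary named in the project docs;
# treat the vocabulary name and method signatures as assumptions to verify.
import tokenmonster

vocab = tokenmonster.load("english-32000-balanced-v1")   # fetches/caches the vocabulary
tokens = vocab.tokenize("If wishes were fishes, we'd all cast nets.")
print(len(tokens), "tokens")
print(vocab.decode(tokens))                              # decodes back to the original text
```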
Alternatives and similar repositories for tokenmonster:
Users interested in tokenmonster are comparing it to the libraries listed below.
- ☆412 · Updated last year
- The repository for the code of the UltraFastBERT paper ☆517 · Updated 11 months ago
- Tune any FALCON in 4-bit ☆466 · Updated last year
- Landmark Attention: Random-Access Infinite Context Length for Transformers ☆422 · Updated last year
- Extend existing LLMs way beyond the original training length with constant memory usage, without retraining ☆690 · Updated 11 months ago
- Fast & Simple repository for pre-training and fine-tuning T5-style models ☆999 · Updated 7 months ago
- A bagel, with everything. ☆317 · Updated 11 months ago
- batched loras ☆340 · Updated last year
- Inference code for Persimmon-8B ☆415 · Updated last year
- YaRN: Efficient Context Window Extension of Large Language Models ☆1,450 · Updated 11 months ago
- Repo for "Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture" ☆548 · Updated 2 months ago
- ☆457 · Updated last year
- Official code for ReLoRA from the paper "Stack More Layers Differently: High-Rank Training Through Low-Rank Updates" ☆448 · Updated 11 months ago
- ☆541 · Updated 3 months ago
- ☆536 · Updated last year
- Code for fine-tuning Platypus-family LLMs using LoRA ☆628 · Updated last year
- Multipack distributed sampler for fast padding-free training of LLMs ☆186 · Updated 7 months ago
- Customizable implementation of the self-instruct paper. ☆1,039 · Updated last year
- Effortless plug-and-play optimizer that cuts model training costs by 50%; a new optimizer that is 2x faster than Adam on LLMs. ☆381 · Updated 9 months ago
- [ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization ☆680 · Updated 7 months ago
- Prompt programming with FMs. ☆440 · Updated 8 months ago
- 🤖 A PyTorch library of curated Transformer models and their composable components ☆884 · Updated 11 months ago
- This repository contains code for extending the Stanford Alpaca synthetic instruction tuning to existing instruction-tuned models such as… ☆351 · Updated last year
- Fine-tune Mistral-7B on 3090s, A100s, H100s ☆709 · Updated last year
- Generate textbook-quality synthetic LLM pretraining data ☆498 · Updated last year
- SoTA Transformers with C-backend for fast inference on your CPU. ☆311 · Updated last year
- [NeurIPS 2023] MeZO: Fine-Tuning Language Models with Just Forward Passes. https://arxiv.org/abs/2305.17333 ☆1,095 · Updated last year
- Language Modeling with the H3 State Space Model ☆516 · Updated last year
- Public repo for the NeurIPS 2023 paper "Unlimiformer: Long-Range Transformers with Unlimited Length Input" ☆1,059 · Updated last year
- Finetuning Large Language Models on One Consumer GPU in 2 Bits ☆719 · Updated 9 months ago