bminixhofer / tokenkit
A toolkit implementing advanced methods to transfer models and model knowledge across tokenizers.
☆27Updated last month
Alternatives and similar repositories for tokenkit
Users that are interested in tokenkit are comparing it to the libraries listed below
- Code for Zero-Shot Tokenizer Transfer☆133Updated 5 months ago
- ☆57Updated 9 months ago
- Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812)☆33Updated 3 months ago
- ☆26Updated 5 months ago
- Lightweight toolkit package to train and fine-tune 1.58bit Language models☆80Updated last month
- This is the official repository for Inheritune.☆111Updated 4 months ago
- EvaByte: Efficient Byte-level Language Models at Scale☆102Updated 2 months ago
- ☆47Updated 9 months ago
- Prune transformer layers☆69Updated last year
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment☆57Updated 9 months ago
- ☆35Updated last year
- ☆61Updated 3 weeks ago
- Simple GRPO scripts and configurations.☆58Updated 4 months ago
- Truly flash implementation of the DeBERTa disentangled attention mechanism.☆58Updated last month
- The official code repo and data hub of top_nsigma sampling strategy for LLMs.☆26Updated 4 months ago
- ☆51Updated 7 months ago
- Supercharge huggingface transformers with model parallelism.☆77Updated 8 months ago
- ☆56Updated 3 weeks ago
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens☆140Updated 4 months ago
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts☆24Updated last year
- minimal pytorch implementation of bm25 (with sparse tensors)☆101Updated last year
- ☆49Updated last year
- ☆38Updated last year
- 🚢 Data Toolkit for Sailor Language Models☆93Updated 4 months ago
- Code for PHATGOOSE introduced in "Learning to Route Among Specialized Experts for Zero-Shot Generalization"☆85Updated last year
- Common tools for data processing☆14Updated 2 months ago
- ☆80Updated 5 months ago
- A collection of datasets for language model pretraining, including scripts for downloading, preprocessing, and sampling.☆59Updated 10 months ago
- ☆47Updated 4 months ago
- Official repo for Learning to Reason for Long-Form Story Generation☆63Updated 2 months ago