Official code release for "SuperBPE: Space Travel for Language Models"
☆93 · May 28, 2026 · Updated last week
Alternatives and similar repositories for superbpe
Users that are interested in superbpe are comparing it to the libraries listed below.
- [ACL 2025] Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment ☆11 · Apr 6, 2025 · Updated last year
- ☆45 · Feb 11, 2026 · Updated 3 months ago
- Code for SaGe subword tokenizer (EACL 2023) ☆28 · Nov 30, 2024 · Updated last year
- Some utility functions to help myself (and perhaps others) go faster with ML/AI work ☆50 · Jun 2, 2026 · Updated last week
- Welcome to our repository! This repository hosts the data on "IndoCollex: A Testbed for Morphological Transformation of Indonesian Word … ☆24 · Aug 10, 2021 · Updated 4 years ago
- Simple-to-use scoring function for arbitrarily tokenized texts. ☆48 · Feb 19, 2025 · Updated last year
- [ICLR 2025] Official Pytorch Implementation of "Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN" by Pengxia… ☆30 · Jul 24, 2025 · Updated 10 months ago
- Code and data for the paper "Turning English-centric LLMs Into Polyglots: How Much Multilinguality Is Needed?" ☆26 · Jun 3, 2025 · Updated last year
- ☆15 · Jul 9, 2025 · Updated 11 months ago
- Anh - LAION's multilingual assistant datasets and models ☆28 · Apr 5, 2023 · Updated 3 years ago
- A method for evaluating the high-level coherence of machine-generated texts. Identifies high-level coherence issues in transformer-based … ☆12 · Mar 18, 2023 · Updated 3 years ago
- Expert Specialization MoE Solution based on CUTLASS ☆27 · Apr 14, 2026 · Updated last month
- FlexiTokens ☆23 · Dec 27, 2025 · Updated 5 months ago
- Can LLMs generate code-mixed sentences through zero-shot prompting? ☆11 · Apr 18, 2023 · Updated 3 years ago
- ☆12 · Dec 13, 2022 · Updated 3 years ago
- ☆109 · Jun 2, 2025 · Updated last year
- Use the tokenizer in parallel to achieve superior acceleration ☆20 · Mar 21, 2024 · Updated 2 years ago
- Organize the Web: Constructing Domains Enhances Pre-Training Data Curation ☆81 · May 2, 2025 · Updated last year
- PathPiece tokenizer ☆14 · Nov 10, 2024 · Updated last year
- ☆16 · May 8, 2024 · Updated 2 years ago
- ☆42 · Sep 20, 2022 · Updated 3 years ago
- Quora Paraphrasing Dataset Bahasa Indonesia Version ☆11 · Apr 18, 2021 · Updated 5 years ago
- Curriculum training ☆22 · Jun 25, 2025 · Updated 11 months ago
- ☆13 · Feb 7, 2023 · Updated 3 years ago
- The simplest implementation of recent Sparse Attention patterns for efficient LLM inference. ☆92 · Jul 17, 2025 · Updated 10 months ago
- ☆16 · May 14, 2024 · Updated 2 years ago
- Code for the paper "Query-Key Normalization for Transformers" ☆53 · Mar 6, 2021 · Updated 5 years ago
- Follow the Wisdom of the Crowd: Effective Text Generation via Minimum Bayes Risk Decoding ☆20 · Nov 16, 2022 · Updated 3 years ago
- check your data, before you wreck your model ☆16 · Aug 11, 2022 · Updated 3 years ago
- RWKV-X is a Linear Complexity Hybrid Language Model based on the RWKV architecture, integrating Sparse Attention to improve the model's l… ☆57 · Mar 31, 2026 · Updated 2 months ago
- Implementation of Cascaded Head-colliding Attention (ACL'2021) ☆11 · Sep 16, 2021 · Updated 4 years ago
- Efficient Transformers with Dynamic Token Pooling ☆68 · May 20, 2023 · Updated 3 years ago
- The training codes of Jasper-Token-Compression-600M ☆19 · Nov 19, 2025 · Updated 6 months ago
- All-in-one repository for Fine-tuning & Pretraining (Large) Language Models ☆15 · Mar 8, 2023 · Updated 3 years ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆196 · Jan 19, 2026 · Updated 4 months ago
- A collection of instruction data and scripts for machine translation. ☆20 · Sep 23, 2023 · Updated 2 years ago
- State-of-the-art paired encoder and decoder models (17M-1B params) ☆73 · Aug 6, 2025 · Updated 10 months ago
- ☆14 · Sep 10, 2021 · Updated 4 years ago
- Discretized Integrated Gradients for Explaining Language Models (EMNLP 2021) ☆27 · Mar 26, 2022 · Updated 4 years ago