owos/flexitokens

FlexiTokens

☆23

Alternatives and similar repositories for flexitokens

Users that are interested in flexitokens are comparing it to the libraries listed below.

orevaahia / magnet-tokenization
☆11 · Mar 17, 2026 · Updated 4 months ago
feyninc / tokie
🍡 30x faster tokenization for every HuggingFace model
☆47 · May 28, 2026 · Updated last month
CLAIRE-Labo / RAT
Official code for the NeurIPS25 paper "RAT: Bridging RNN Efficiency and Attention Accuracy in Language Modeling" (https://arxiv.org/abs/25…
☆26 · Dec 10, 2025 · Updated 7 months ago
MeLeLBGU / SaGe
Code for SaGe subword tokenizer (EACL 2023)
☆28 · Nov 30, 2024 · Updated last year
nateraw / spaces-docker-templates
🚀🤗 A collection of templates for Hugging Face Spaces
☆35 · Oct 9, 2023 · Updated 2 years ago
kensho-technologies / pathpiece
PathPiece tokenizer
☆14 · Nov 10, 2024 · Updated last year
huggingface / ember
ANE accelerated embedding models!
☆20 · Dec 11, 2024 · Updated last year
allenai / noncompliance
This repository contains data, code and models for contextual noncompliance.
☆26 · Jul 18, 2024 · Updated 2 years ago
stellalisy / PrefPalette
☆21 · Apr 3, 2026 · Updated 3 months ago
MinishLab / tokenlearn
Pre-train Static Word Embeddings
☆108 · Jun 9, 2026 · Updated last month
commoncrawl / web-languages
Crowd-sourced lists of urls to help Common Crawl crawl under-resourced languages. See https://github.com/commoncrawl/web-languages-code/ …
☆71 · Jul 1, 2026 · Updated 2 weeks ago
zaydzuhri / flame
Fork of Flame repo for training of some new stuff in development
☆20 · Updated this week
bmschmidt / pySRP
Python Module implementing SRP
☆12 · Jul 29, 2022 · Updated 3 years ago
NayukiMafuyu / mockou
2D chess pieces inspired by handmade wooden chess sets, featuring cuteness and simplicity
☆15 · Feb 14, 2026 · Updated 5 months ago
Tobs40 / chess218
A code snippet that proves that there is no legal, potentially non-reachable chess position with more than 218 moves.
☆23 · May 23, 2024 · Updated 2 years ago
stephantul / skeletoken
Datamodels for hugging face tokenizers
☆108 · Jun 18, 2026 · Updated last month
kylebgorman / EditTransducer
Python implementation of Levenshtein distance and Levenshtein automata matching
☆27 · May 8, 2019 · Updated 7 years ago
VITA-Group / TAPE
[ICML'25] "Rethinking Addressing in Language Models via Contextualized Equivariant Positional Encoding" by Jiajun Zhu, Peihao Wang, Ruisi…
☆15 · Jun 6, 2025 · Updated last year
stephantul / pynife
Nearly Inference Free Embeddings: make your RAG queries 500x faster
☆80 · Apr 27, 2026 · Updated 2 months ago
AI21Labs / pmi-masking
This repository includes the masking vocabulary used in the ICLR 2021 spotlight PMI-Masking paper
☆14 · Aug 9, 2021 · Updated 4 years ago
kaniblu / hanja-tagger
Automatic Korean Hanja tagging tool powered by Hanjaro (hanjaro.juntong.or.kr)
☆19 · Feb 22, 2019 · Updated 7 years ago
yuweihao / LV-BERT
LV-BERT: Exploiting Layer Variety for BERT (Findings of ACL 2021)
☆18 · May 10, 2023 · Updated 3 years ago
AnswerDotAI / fastkmeans
☆101 · Jul 4, 2025 · Updated last year
skai-research / ScholarEval
Official code and data for the paper "ScholarEval: Research Idea Evaluation Grounded in Literature."
☆20 · Oct 28, 2025 · Updated 8 months ago
qdrant / miniCOIL
Contextualized per-token embeddings
☆37 · Updated this week
DunZhang / Jasper-Token-Compression-Training
The training codes of Jasper-Token-Compression-600M
☆20 · Nov 19, 2025 · Updated 8 months ago
BaichuanSEED / BaichuanSEED.github.io
Official Repository for Paper "BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Compet…
☆18 · Aug 28, 2024 · Updated last year
PythonNut / superbpe
Official code release for "SuperBPE: Space Travel for Language Models"
☆97 · May 28, 2026 · Updated last month
insait-institute / ritranslation
[ACL'26 Findings] Recovered in Translation: Efficient Pipeline for Automated Translation of Benchmarks and Datasets
☆20 · Jun 27, 2026 · Updated 3 weeks ago
Knowledgator / FlashDeBERTa
Truly flash implementation of DeBERTa disentangled attention mechanism.
☆90 · Feb 10, 2026 · Updated 5 months ago
timoschick / form-context-model
This repository contains the code for the Form-Context Model and its Attentive Mimicking variant.
☆30 · May 11, 2020 · Updated 6 years ago
unlp-workshop / unlp-2025-shared-task
UNLP 2025 Shared Task on Detecting Social Media Manipulation
☆23 · Aug 4, 2025 · Updated 11 months ago
LAGoM-NLP / transtokenizer
☆57 · Dec 27, 2025 · Updated 6 months ago
gesceap / prime16
Nanoloop source files for the album "Prime 16"
☆11 · Mar 7, 2026 · Updated 4 months ago
wilseypa / warped
Parallel & Distributed Simulation (discrete event)
☆26 · Nov 12, 2015 · Updated 10 years ago
asahi417 / lm-vocab-trimmer
Vocabulary Trimming (VT) is a model compression technique, which reduces a multilingual LM vocabulary to a target language by deleting ir…
☆67 · Oct 25, 2024 · Updated last year
Sreyan88 / ACLM
Code for ACL 2023 Paper: ACLM: A Selective-Denoising based Generative Data Augmentation Approach for Low-Resource Complex NER
☆22 · Jul 19, 2023 · Updated 3 years ago
stefan-it / ukrainian-electra
Ukrainian ELECTRA model
☆12 · Mar 11, 2023 · Updated 3 years ago
aidos-lab / magnipy
Metric Space Magnitude Computations
☆15 · Jun 30, 2026 · Updated 3 weeks ago