Fast and versatile tokenizer for language models, compatible with SentencePiece, Tokenizers, Tiktoken and more. Supports BPE, Unigram and WordPiece tokenization in JavaScript, Python and Rust.
☆46Mar 10, 2026Updated last week
Alternatives and similar repositories for kitoken
Users who are interested in kitoken are comparing it to the libraries listed below
- Anthropic MCP Go implementation☆19Updated this week
- PHP low-level client for Vespa. https://vespa.ai/☆17Jan 22, 2026Updated 2 months ago
- Automatically exported from code.google.com/p/esaxx☆17Jun 23, 2015Updated 10 years ago
- ☆12Apr 29, 2022Updated 3 years ago
- Nadir: Cutting-edge PyTorch optimizers for simplicity & composability! 🔥🚀💻☆14Jun 15, 2024Updated last year
- zero-vocab or low-vocab embeddings☆18Jul 17, 2022Updated 3 years ago
- ☆21Apr 16, 2024Updated last year
- Model implementation for the contextual embeddings project☆42Jun 2, 2025Updated 9 months ago
- CMS 230 - Computer Organization and Architecture☆11Sep 6, 2024Updated last year
- A CLI tool for running AI agents inside microVM sandboxes☆34Mar 16, 2026Updated last week
- Kafka Connect Vespa sink connector☆17Apr 17, 2025Updated 11 months ago
- Repository containing code for the NAACL 2021 paper (Incorporating External Knowledge to Enhance Tabular Reasoning)☆17Jun 20, 2021Updated 4 years ago
- Website for TREC RAG☆14Aug 19, 2025Updated 7 months ago
- A Python utility for indexing file lines. Best demo honourable mention at ECIR 2024.☆23Nov 9, 2025Updated 4 months ago
- ☆14Sep 30, 2021Updated 4 years ago
- Website for Applied-LLMs work☆28Jan 13, 2026Updated 2 months ago
- Connect Client SDK and CLI☆18Mar 11, 2026Updated last week
- This repo contains the source code for https://pest.rs☆13Mar 12, 2026Updated last week
- Detect and redact PII locally with SOTA performance☆95Mar 25, 2025Updated 11 months ago
- 👩 PyTorch and JAX code for the Madam optimiser.☆53Feb 9, 2021Updated 5 years ago
- An R package to convert SingleCellExperiment and Seurat objects into anndata as comprehensively as possible.☆11Apr 23, 2025Updated 11 months ago
- Descriptor Vector Exchange☆76Oct 24, 2019Updated 6 years ago
- Nature's Cost Function (NCF). Finding paths of least action with gradient descent.☆18Mar 30, 2023Updated 2 years ago
- A safer, drop-in replacement for Go's syscall/js JavaScript package.☆19Mar 5, 2023Updated 3 years ago
- Authentication Callout Library☆22Updated this week
- Tulp is a command-line tool that can help you create and process piped content using the power of ChatGPT directly from the terminal.☆20Jan 20, 2026Updated 2 months ago
- Web client for Vespa.ai☆54Jul 2, 2025Updated 8 months ago
- Node and Browser env supported WebAssembly version of fastText: Library for efficient text classification and representation learning.☆13Sep 17, 2024Updated last year
- Conditional Random Fields implemented as Lasagne layer☆10Jul 22, 2016Updated 9 years ago
- An innovative framework designed to enhance the art of storytelling.☆23Dec 1, 2023Updated 2 years ago
- Narwhal is a keyword and KEY NARRATIVE manager that creates language-aware classes. Because Narwhal does not use NLP it avoids complexity…☆12Oct 16, 2018Updated 7 years ago
- The home of official Obot tools☆34Updated this week
- Quit Datasette if it has not received traffic for a specified time period☆17Feb 18, 2026Updated last month
- ☆10Mar 2, 2022Updated 4 years ago
- ☆23Jul 8, 2025Updated 8 months ago
- Modular Element SSR with Hydration☆43Sep 23, 2025Updated 5 months ago
- This is the repo for all cell sorting code and data☆41Oct 30, 2024Updated last year
- A Kubernetes probe which checks model status for a TensorFlow Serving model☆18Jul 8, 2022Updated 3 years ago
- Unsupervised Word Discovery☆10Jul 26, 2019Updated 6 years ago