Fast tokenizer for language models, compatible with SentencePiece, Tokenizers, Tiktoken and more. Supports BPE, Unigram and WordPiece tokenization in JavaScript, Python and Rust.
☆49May 10, 2026Updated 2 weeks ago
Alternatives and similar repositories for kitoken
Users who are interested in kitoken are comparing it to the libraries listed below.
- Anthropic MCP Go implementation☆19May 9, 2026Updated 2 weeks ago
- Trainable embedding transformation for confidence estimation, feature extraction, explainability and conversion from dense to sparse.☆28Jun 9, 2025Updated 11 months ago
- Latent Large Language Models☆19Aug 24, 2024Updated last year
- A PHP library for writing, reading, and validating llms.txt Markdown files.☆24May 8, 2026Updated 2 weeks ago
- Normalize text strings☆12Nov 6, 2018Updated 7 years ago
- zero-vocab or low-vocab embeddings☆18Jul 17, 2022Updated 3 years ago
- A password validation and generation tool kit☆13Jan 7, 2023Updated 3 years ago
- Universal Utility Toolkit☆21Oct 12, 2024Updated last year
- Code for the paper "Multi-Field Adaptive Retrieval," a research project on semi-structured document retrieval☆17Feb 13, 2026Updated 3 months ago
- The omegaUp sandbox☆14Feb 13, 2023Updated 3 years ago
- ☆21Apr 16, 2024Updated 2 years ago
- Private self-improvement coaching with open-source LLMs☆17Mar 7, 2024Updated 2 years ago
- CMS 230 - Computer Organization and Architecture☆10Sep 6, 2024Updated last year
- Model implementation for the contextual embeddings project☆47Jun 2, 2025Updated 11 months ago
- Fluent dreaming for language models☆13Jul 22, 2024Updated last year
- PostHog with text analytics extensions, serving as an advanced LLM analytics platform.☆15Sep 17, 2024Updated last year
- Serve HTTP on a tailnet☆22Oct 31, 2024Updated last year
- A CLI tool for running AI agents inside microVM sandboxes☆41May 8, 2026Updated 2 weeks ago
- Create a palette of N colors or convert True Color images to indexed ones. Includes png2gpl and png2act.☆17Apr 25, 2026Updated last month
- Kafka Connect Vespa sink connector☆17Apr 17, 2025Updated last year
- Website for TREC RAG☆14Apr 24, 2026Updated last month
- Website for Applied-LLMs work☆29May 5, 2026Updated 2 weeks ago
- This repo contains the source code for https://pest.rs☆13Mar 12, 2026Updated 2 months ago
- Connect Client SDK and CLI☆19Mar 11, 2026Updated 2 months ago
- Definition files for Nissan engine control computers to be used with the Rom Raider Ecu Editor.☆19Nov 27, 2023Updated 2 years ago
- 🤖 AI is with you.☆14Nov 14, 2024Updated last year
- ✂️ OpenAI's tiktoken tokenizer written in Go☆20Jan 31, 2025Updated last year
- Guichan is a C++ GUI library designed for games.☆14Oct 22, 2025Updated 7 months ago
- A NATS microservice interacting with Ollama☆18Jun 30, 2024Updated last year
- A tiny utility to help save you a lot of effort with long winded `#[cfg()]` checks in Rust.☆97Apr 16, 2025Updated last year
- Pre-training BART in Flax on The Pile dataset☆22Jul 24, 2021Updated 4 years ago
- Descriptor Vector Exchange☆76Oct 24, 2019Updated 6 years ago
- Nature's Cost Function (NCF). Finding paths of least action with gradient descent.☆18Mar 30, 2023Updated 3 years ago
- A safer, drop-in replacement for Go's syscall/js JavaScript package.☆19Mar 5, 2023Updated 3 years ago
- Erku is an IPTV and video on demand client for the Roku OS.☆12Dec 29, 2024Updated last year
- Tulp is a command-line tool that can help you create and process piped content using the power of ChatGPT directly from the terminal.☆20Jan 20, 2026Updated 4 months ago
- Handling of multiple types of media documents for Django☆28Nov 9, 2015Updated 10 years ago
- WebAssembly version of fastText for Node and browser environments: a library for efficient text classification and representation learning.☆14Sep 17, 2024Updated last year
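- A tool for extracting Cursor chat history, for Cursor versions after 0.43. The code was generated automatically with Cursor; it extracts the AI chat history from the Cursor editor and converts it to Markdown files.☆18Feb 8, 2025Updated last year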