kootenpv/tok

Fast and customizable tokenization

☆67

Alternatives and similar repositories for tok

Users that are interested in tok are comparing it to the libraries listed below.

kootenpv / rebrand
Refactor your software using programming language independent, case-preserving string replacement
☆17 · Jul 9, 2019 · Updated 7 years ago
kootenpv / textsearch
Find strings/words in text; convenience and C speed
☆126 · Sep 2, 2022 · Updated 3 years ago
kootenpv / sysdm
Scripts as a service. Builds on systemd (for Linux)
☆21 · Mar 10, 2026 · Updated 4 months ago
iMerica / pipflow
Cloud native Python package manager.
☆10 · May 23, 2023 · Updated 3 years ago
philips-software / license-scanner
Service to scan licenses from source code
☆12 · Aug 14, 2023 · Updated 2 years ago
eelcovdw / pytorch-constrained-opt
Constrained Optimization in Pytorch
☆12 · Feb 25, 2020 · Updated 6 years ago
omnilib / attribution
Generate changelogs from commit tags and shortlogs
☆28 · Nov 2, 2025 · Updated 8 months ago
xiaoleihuang / Neural_Temporality_Adaptation
Source codes for our paper "Neural Temporality Adaptation for Document Classification: Diachronic Word Embeddings and Domain Adaptation M…
☆12 · Apr 20, 2021 · Updated 5 years ago
mjvm / pyrpm
A pure python rpm reader
☆20 · Apr 11, 2024 · Updated 2 years ago
FFY00 / python-resolver
A Python dependency resolver
☆25 · Jul 13, 2026 · Updated last week
fujimotos / TinyFastSS
An index data structure for approximate string search.
☆23 · May 6, 2019 · Updated 7 years ago
singh1114 / theJekyllProject
A Django project to help users to create free, fast and secure blogs on GitHub Pages and Jekyll.
☆20 · Dec 8, 2022 · Updated 3 years ago
malinoff / amqproto
☆22 · Feb 4, 2019 · Updated 7 years ago
google / tensorflow-tools
A collection of manipulation tools for TensorFlow data.
☆17 · Jan 20, 2018 · Updated 8 years ago
guillaume-be / SentencePiece-Rust-example
Supporting example for "A Rust SentencePiece implementation"
☆20 · Jun 7, 2020 · Updated 6 years ago
duo-labs / narrow
Low-effort reachability analysis for third-party code vulnerabilities.
☆22 · Jul 11, 2023 · Updated 3 years ago
jeffkaufman / whistle-synth
zero-crossing based pitch detection for whistling
☆18 · Feb 21, 2026 · Updated 5 months ago
dutc / penv
`penv`: a better `env` & `venv`
☆12 · May 24, 2017 · Updated 9 years ago
timothycrosley / preconvert
A Library to enable preconversion of any Python type into one that is easily serializable
☆18 · Dec 8, 2022 · Updated 3 years ago
wernsey / miscsrc
My collection of miscellaneous source code
☆38 · Aug 31, 2025 · Updated 10 months ago
illikainen / ossaudit
Audit python packages for known vulnerabilities
☆34 · Mar 9, 2022 · Updated 4 years ago
mponza / SalIE
Salient Open Information Extraction
☆20 · Nov 14, 2018 · Updated 7 years ago
openredact / nerwhal
This is a prototype of a multi-lingual suite for named-entity recognition in Python. ➡️ The project has moved to: https://gitlab.opencode…
☆21 · Mar 20, 2026 · Updated 4 months ago
stephantul / piecelearn
Learning BPE embeddings by first learning a segmentation model and then training word2vec
☆19 · Dec 18, 2022 · Updated 3 years ago
cantools / textparser
A text parser.
☆34 · Jun 20, 2026 · Updated last month
evilsocket / twitter-num-followers-bot
A bot that'll monitor the number of followers of its followers and tweet when the counter gets to interesting values.
☆13 · Jun 10, 2018 · Updated 8 years ago
soaxelbrooke / python-bpe
Byte Pair Encoding for Python!
☆234 · Sep 16, 2022 · Updated 3 years ago
i-machine-think / machine-tasks
Datasets for compositional learning
☆11 · Nov 28, 2018 · Updated 7 years ago
google / weighted-dict
☆19 · May 25, 2020 · Updated 6 years ago
slanglab / freq-e
Class frequency estimation software package
☆13 · Sep 1, 2019 · Updated 6 years ago
evilsocket / octofairy
A machine learning based GitHub bot for Issues.
☆14 · Dec 6, 2018 · Updated 7 years ago
stanis-morozov / prodige
A supplementary code for Beyond Vector Spaces: Compact Data Representation as Differentiable Weighted Graphs.
☆47 · Nov 2, 2019 · Updated 6 years ago
robario / elscreen-persist
persist the elscreen across sessions
☆11 · May 6, 2016 · Updated 10 years ago
Comcast / weasel
Lightweight license checker.
☆31 · Nov 5, 2020 · Updated 5 years ago
Sreyan88 / DALE
Code for EMNLP 2023 paper: DALE: Generative Data Augmentation for Low-Resource Legal NLP
☆11 · Oct 27, 2023 · Updated 2 years ago
nec-research / st_tau
This repository contains code for the paper "Uncertainty Estimation and Calibration with Finite-State Probabilistic RNNs" (Wang, Lawrence…
☆17 · Mar 8, 2021 · Updated 5 years ago
devopshq / crosspm
Universal Cross Package Manager - allows you to use the manifest file to download packages of various formats from different storage loca…
☆36 · Nov 11, 2024 · Updated last year
ChristophAlt / tuna
Hyperparameter search for AllenNLP - powered by Ray TUNE
☆28 · Mar 6, 2025 · Updated last year
pudo / normality
A tiny library for Python text normalisation. Useful for ad-hoc text processing.
☆158 · Mar 8, 2026 · Updated 4 months ago