Fast and customizable tokenization
☆67 · Jul 9, 2019 · Updated 6 years ago
Alternatives and similar repositories for tok
Users that are interested in tok are comparing it to the libraries listed below.
- Scripts as a service. Builds on systemd (for Linux) ☆21 · Mar 10, 2026 · Updated 3 months ago
- Large-scale topic discovery with Sampled-MinHashing ☆10 · Jul 3, 2019 · Updated 7 years ago
- A library for parsing security advisories ☆14 · Apr 13, 2026 · Updated 2 months ago
- An open-source NLP library: fast text cleaning and preprocessing ☆23 · Nov 9, 2021 · Updated 4 years ago
- Constrained Optimization in Pytorch ☆12 · Feb 25, 2020 · Updated 6 years ago
- Source codes for our paper "Neural Temporality Adaptation for Document Classification: Diachronic Word Embeddings and Domain Adaptation M… ☆12 · Apr 20, 2021 · Updated 5 years ago
- A Django project to help users to create free, fast and secure blogs on GitHub Pages and Jekyll. ☆20 · Dec 8, 2022 · Updated 3 years ago
- An index data structure for approximate string search. ☆23 · May 6, 2019 · Updated 7 years ago
- Low-effort reachability analysis for third-party code vulnerabilities. ☆22 · Jul 11, 2023 · Updated 2 years ago
- Supporting example for "A Rust SentencePiece implementation" ☆20 · Jun 7, 2020 · Updated 6 years ago
- discover information about upstream projects ☆19 · Jun 17, 2026 · Updated 2 weeks ago
- PyTorch tool for training with bigger batch size on the GPU ☆11 · Feb 26, 2021 · Updated 5 years ago
- Go library for inspecting Rust binaries produced with https://github.com/rust-secure-code/cargo-auditable ☆25 · Feb 26, 2025 · Updated last year
- My collection of miscellaneous source code ☆35 · Aug 31, 2025 · Updated 10 months ago
- curl for websockets ☆30 · Mar 14, 2016 · Updated 10 years ago
- Audit python packages for known vulnerabilities ☆34 · Mar 9, 2022 · Updated 4 years ago
- A text parser. ☆34 · Jun 20, 2026 · Updated 2 weeks ago
- Salient Open Information Extraction ☆20 · Nov 14, 2018 · Updated 7 years ago
- This is a prototype of a multi-lingual suite for named-entity recognition in Python. ➡️ The project has moved to: https://gitlab.opencode… ☆21 · Mar 20, 2026 · Updated 3 months ago
- Lightweight license checker. ☆31 · Nov 5, 2020 · Updated 5 years ago
- Debian packaging tools ☆47 · Mar 9, 2021 · Updated 5 years ago
- Universal Cross Package Manager - allows you to use the manifest file to download packages of various formats from different storage loca… ☆36 · Nov 11, 2024 · Updated last year
- A simple in-memory graph database (wrapper for python-igraph) ☆11 · Jul 6, 2019 · Updated 6 years ago
- Stanford CoreNLP examples in Scala ☆11 · Jan 12, 2017 · Updated 9 years ago
- Datasets for compositional learning ☆11 · Nov 28, 2018 · Updated 7 years ago
- A tiny library for Python text normalisation. Useful for ad-hoc text processing. ☆157 · Mar 8, 2026 · Updated 3 months ago
- Class frequency estimation software package ☆13 · Sep 1, 2019 · Updated 6 years ago
- A supplementary code for Beyond Vector Spaces: Compact Data Representation as Differentiable Weighted Graphs. ☆47 · Nov 2, 2019 · Updated 6 years ago
- Train a model, and detect gibberish strings with it. ☆68 · Feb 17, 2022 · Updated 4 years ago
- A library to instantiate any Python object from configuration files. ☆25 · Oct 12, 2022 · Updated 3 years ago
- This repository contains code for the paper "Uncertainty Estimation and Calibration with Finite-State Probabilistic RNNs" (Wang, Lawrence… ☆17 · Mar 8, 2021 · Updated 5 years ago
- ☆21 · Jun 3, 2019 · Updated 7 years ago
- Allows you to disable and enable map files for PUBG ☆10 · Mar 11, 2018 · Updated 8 years ago
- Bytecode Analysis Toolkit. ☆18 · Oct 28, 2022 · Updated 3 years ago
- A Python Wrapper To Retrieve Data From The CrowdTangle API ☆11 · Mar 26, 2026 · Updated 3 months ago
- Vendy is a tool for vendoring third-party packages into your project. ☆19 · Nov 28, 2023 · Updated 2 years ago
- Digital Forensics Windows Registry (dfWinReg) ☆53 · May 27, 2026 · Updated last month
- Code for processing brain data ☆12 · Apr 5, 2019 · Updated 7 years ago
- The main feature flipper library and web admin application. ☆10 · Aug 18, 2025 · Updated 10 months ago