mammothb / editdistpy
Fast edit distance Python extension written in Cython/C++. Supports Levenshtein distance and Damerau Optimal String Alignment (OSA) distance.
☆24 Updated 2 months ago
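The two metrics named in the description differ only in how they treat swapped adjacent characters. The following is a minimal pure-Python sketch of both, for illustration only: it is not editdistpy's API or implementation, and the function names `levenshtein` and `osa` are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def osa(a: str, b: str) -> int:
    """Optimal String Alignment distance: Levenshtein plus adjacent
    transpositions, where no substring is edited more than once."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
            # Adjacent transposition: "ab" -> "ba" counts as one edit.
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]
```

With these definitions, `levenshtein("ab", "ba")` is 2 while `osa("ab", "ba")` is 1, because OSA treats the swap as a single edit.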
Alternatives and similar repositories for editdistpy
Users who are interested in editdistpy are comparing it to the libraries listed below.
- Tokenization across languages. Useful as preprocessing for subword tokenization. ☆22 Updated 2 years ago
- Rust python bindings for symspell ☆21 Updated last year
- Multi-Language Identification ☆28 Updated last year
- Implementation of Z-BERT-A: a zero-shot pipeline for unknown intent detection. ☆41 Updated 2 years ago
- A simple neural truecaser written in pytorch and allennlp. ☆33 Updated last year
- A lightweight but powerful library to build token indices for NLP tasks, compatible with major Deep Learning frameworks like PyTorch and … ☆51 Updated 8 months ago
- This is a prototype of a multi-lingual suite for named-entity recognition in Python. ☆21 Updated last year
- SMASHED is a toolkit designed to apply transformations to samples in datasets, such as fields extraction, tokenization, prompting, batchi… ☆33 Updated last year
- Execute arbitrary SQL queries on 🤗 Datasets ☆32 Updated last year
- Open source library for few shot NLP ☆79 Updated 2 years ago
- ☆43 Updated 2 years ago
- Tower Parse: Low-Resource Dependency Parsing via Hierarchical Source Selection ☆15 Updated 4 years ago
- Load What You Need: Smaller Multilingual Transformers for Pytorch and TensorFlow 2.0. ☆105 Updated 3 years ago
- ☆87 Updated 3 years ago
- Code for pre-training CharacterBERT models (as well as BERT models). ☆34 Updated 3 years ago
- classy is a simple-to-use library for building high-performance Machine Learning models in NLP. ☆87 Updated 4 months ago
- TorchServe+Streamlit for easily serving your HuggingFace NER models ☆33 Updated 3 years ago
- Pyinfer is a model agnostic tool for ML developers and researchers to benchmark the inference statistics for machine learning models or f… ☆24 Updated 4 years ago
- Documentation effort for the BookCorpus dataset ☆34 Updated 4 years ago
- zero-vocab or low-vocab embeddings ☆18 Updated 3 years ago
- ☆22 Updated 3 years ago
- Source code for the Apple reproduction ☆32 Updated 4 years ago
- Repository with illustrations for cft-contest-2018 ☆12 Updated 6 years ago
- Source code and data for Like a Good Nearest Neighbor ☆30 Updated 7 months ago
- Combining encoder-based language models ☆11 Updated 3 years ago
- A Streamlit component for annotating text by text selecting. ☆40 Updated last year
- Efficient Trie-based regex unions for blacklist/whitelist filtering and one-pass mapping-based string replacing ☆74 Updated 2 months ago
- Implementation of pQRNN in PyTorch ☆46 Updated 3 years ago
- Neural network sequence labeling model ☆11 Updated 5 years ago
- ☆17 Updated 2 years ago