mammothb / editdistpy
Fast edit distance Python extension written in Cython/C++. Supports Levenshtein distance and Damerau Optimal String Alignment (OSA) distance.
☆23 Updated 6 months ago
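For readers unfamiliar with the two metrics named in the description, the following is a minimal pure-Python sketch of how they differ. It is not editdistpy's implementation or API (the library is written in Cython/C++). The function names `levenshtein` and `damerau_osa` are chosen here for illustration only. Levenshtein counts insertions, deletions, and substitutions. OSA additionally counts a swap of two adjacent characters as one edit, with the restriction that no substring is edited more than once.

```python
def levenshtein(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions, substitutions."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def damerau_osa(a: str, b: str) -> int:
    """Optimal String Alignment distance: Levenshtein plus adjacent swaps."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
            # Transposition of two adjacent characters counts as one edit.
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


if __name__ == "__main__":
    print(levenshtein("ab", "ba"))   # swap costs two substitutions
    print(damerau_osa("ab", "ba"))   # swap costs one transposition
```

The full-matrix version of OSA is used here for readability. A production implementation would typically keep only a few rows and stop early once a maximum distance is exceeded.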
Alternatives and similar repositories for editdistpy:
Users who are interested in editdistpy are comparing it to the libraries listed below.
- Execute arbitrary SQL queries on 🤗 Datasets ☆32 Updated last year
- Source code for the Apple reproduction ☆32 Updated 3 years ago
- zero-vocab or low-vocab embeddings ☆18 Updated 2 years ago
- Combining encoder-based language models ☆11 Updated 3 years ago
- Tower Parse: Low-Resource Dependency Parsing via Hierarchical Source Selection ☆15 Updated 3 years ago
- Large-scale query-focused multi-document Summarization dataset ☆10 Updated 3 years ago
- Rust-based Python wrapper for duckling library in Haskell ☆25 Updated 4 years ago
- TorchServe+Streamlit for easily serving your HuggingFace NER models ☆32 Updated 2 years ago
- PyTorch-IE: State-of-the-art Information Extraction in PyTorch ☆77 Updated last week
- A simple neural truecaser written in pytorch and allennlp. ☆33 Updated 8 months ago
- FAMIE: A Fast Active Learning Framework for Multilingual Information Extraction ☆24 Updated 2 years ago
- ☆17 Updated last year
- This is a prototype of a multi-lingual suite for named-entity recognition in Python. ☆21 Updated 10 months ago
- Prabhupadavani: A Code-mixed Speech Translation Data for 25 languages ☆13 Updated 2 years ago
- Keras Implementation of Flair's Contextualized Embeddings ☆26 Updated 3 years ago
- Robust Cross-lingual Embeddings from Parallel Sentences ☆22 Updated 4 years ago
- Pyinfer is a model agnostic tool for ML developers and researchers to benchmark the inference statistics for machine learning models or f… ☆24 Updated 4 years ago
- Tokenization across languages. Useful as preprocessing for subword tokenization. ☆22 Updated 2 years ago
- An open-source NLP library: fast text cleaning and preprocessing ☆23 Updated 3 years ago
- Implementation of pQRNN in PyTorch ☆46 Updated 3 years ago
- A collection of scripts to preprocess ASR datasets and finetune language-specific Wav2Vec2 XLSR models ☆31 Updated 3 years ago
- Implementation of Z-BERT-A: a zero-shot pipeline for unknown intent detection. ☆39 Updated last year
- Context Encoders (ConEc) as a simple but powerful extension of the word2vec model for learning word embeddings ☆21 Updated 4 years ago
- WebRED is a large and diverse manually annotated dataset for extracting relationships from a variety of text found on the World Wide Web. ☆22 Updated 4 years ago
- Code and data for Teddy https://arxiv.org/abs/2001.05171. ☆15 Updated 2 years ago
- Cosine Similarity Search in ElasticSearch + FAISS GPU ☆12 Updated 2 years ago
- BERT models for many languages created from Wikipedia texts ☆33 Updated 4 years ago
- Learning BPE embeddings by first learning a segmentation model and then training word2vec ☆19 Updated 2 years ago
- Correction of spaces with character-based neural language models. ☆13 Updated 2 years ago
- ☆16 Updated last year