nap / jaro-winkler-distance
Computes the Jaro-Winkler distance, a distance or similarity score between two strings.
☆26 · Updated this week
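For readers unfamiliar with the metric, the following is a minimal sketch of the standard Jaro-Winkler similarity in pure Python. It is not taken from this repository and does not reflect its API: the function names `jaro_similarity` and `jaro_winkler_similarity` and the parameters `prefix_scale` and `max_prefix` are hypothetical, with the defaults (0.1 and 4) set to the values commonly used for Winkler's prefix adjustment.

```python
def jaro_similarity(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]; 1.0 means identical strings."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters match only if equal and no farther apart than this window.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0

    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break

    if matches == 0:
        return 0.0

    # Transpositions: matched characters that appear in a different order,
    # counted pairwise and halved.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2

    return (
        matches / len1
        + matches / len2
        + (matches - transpositions) / matches
    ) / 3.0


def jaro_winkler_similarity(
    s1: str, s2: str, prefix_scale: float = 0.1, max_prefix: int = 4
) -> float:
    """Jaro similarity boosted for strings sharing a common prefix."""
    jaro = jaro_similarity(s1, s2)

    # Length of the common prefix, capped at max_prefix characters.
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1

    return jaro + prefix * prefix_scale * (1.0 - jaro)


def jaro_winkler_distance(s1: str, s2: str) -> float:
    """Distance form: 0.0 for identical strings, 1.0 for no similarity."""
    return 1.0 - jaro_winkler_similarity(s1, s2)
```

With this sketch, `jaro_winkler_similarity("MARTHA", "MARHTA")` evaluates to roughly 0.961: all six characters match, one pair is transposed, and the shared prefix "MAR" raises the plain Jaro score of about 0.944.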
Alternatives and similar repositories for jaro-winkler-distance:
Users who are interested in jaro-winkler-distance compare it to the libraries listed below.
- Json Wikipedia contains code to convert the Wikipedia XML dump into a JSON dump. Questions? https://gitter.im/idio-opensource/Lobby ☆17 · Updated 2 years ago
- An author identification system based on recur ☆21 · Updated 8 years ago
- A Python module for word inflections designed for use with spaCy. ☆92 · Updated 5 years ago
- This repository contains the DFKI Product Corpus, a dataset of 174 documents annotated for product and company named entities, and the re… ☆12 · Updated 5 months ago
- A web application for tagging and retrieval of arguments in text ☆29 · Updated last year
- Use ML-Annotate to label data for machine learning purposes ☆107 · Updated 4 years ago
- allennlp + streamlit demo ☆22 · Updated 5 years ago
- DKPro C4CorpusTools is a collection of tools for processing the CommonCrawl corpus, including Creative Commons license detection, boilerplate… ☆51 · Updated 4 years ago
- Original, standard and customisable versions of the Jaro-Winkler functions. ☆32 · Updated 2 years ago
- fastlangid, the only language identification package that supports Cantonese (zh-yue), Simplified (zh-hans) and Traditional Chinese (zh-ha… ☆39 · Updated 2 years ago
- Reverse engineer patterns for use with spaCy's DependencyMatcher ☆35 · Updated 5 years ago
- ☆21 · Updated 2 years ago
- Toolkit with state-of-the-art Automatic Terms Recognition methods in Scala ☆35 · Updated 6 years ago
- ☆30 · Updated 2 years ago
- Tokenize and clean strings in Python ☆12 · Updated 7 years ago
- Load embeddings and featurize your sentences. ☆28 · Updated 3 months ago
- Be notified of recent events in the news by setting up alerts. Program uses NLP techniques such as keyword matching, k-clustering and sem… ☆11 · Updated 8 years ago
- Sentence embeddings for unsupervised event detection in the Twitter stream: study on English and French corpora ☆31 · Updated this week
- Polyglot skipgram embeddings, and their many health benefits ☆12 · Updated 5 years ago
- Stanford Tregex-inspired language for rule-based dependency tree manipulation. ☆21 · Updated 7 years ago
- Language detection using spaCy and fastText ☆55 · Updated last year
- Character Based Named Entity Recognition. ☆40 · Updated 6 years ago
- Storage and retrieval of Word Embeddings in various databases ☆51 · Updated 6 years ago
- Scripts for building and deploying ConceptNet, using Packer and Puppet ☆10 · Updated 3 years ago
- LEMON: Explainable Entity Matching ☆18 · Updated 2 years ago
- Text readability metrics in Python. ☆11 · Updated 11 years ago
- Code and data used to build a training dataset for dragnet models ☆10 · Updated 4 years ago
- ☆16 · Updated last year
- A tool for detecting sentence fragments. ☆7 · Updated 8 years ago
- GSDMM: Short text clustering (Rust implementation) ☆23 · Updated last year