hybridtheory / floc-simhash
A fast Python implementation of the SimHash algorithm.
☆27 Updated 3 years ago
Alternatives and similar repositories for floc-simhash:
Users who are interested in floc-simhash are comparing it to the libraries listed below.
- Abydos NLP/IR library for Python ☆185 Updated 2 years ago
- Python package for deduplication/entity resolution using active learning ☆78 Updated 8 months ago
- A spaCy wrapper of Entity-Fishing (component) for named entity disambiguation and linking on Wikidata ☆161 Updated 2 years ago
- A spaCy wrapper of OpenTapioca for named entity linking on Wikidata ☆94 Updated 2 years ago
- Recon NER, Debug and correct annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality … ☆106 Updated last year
- Sentence transformers models for spaCy ☆107 Updated 2 years ago
- A comprehensive and scalable set of string tokenizers and similarity measures in Python ☆137 Updated 9 months ago
- Information extraction from English and German texts based on predicate logic ☆135 Updated last year
- BERT and ELECTRA models trained on Europeana Newspapers ☆38 Updated 3 years ago
- Hidden alignment conditional random field for classifying string pairs. ☆24 Updated 7 months ago
- Blazing fast topic modelling for short texts. ☆31 Updated 2 weeks ago
- A project about benchmarking and evaluating existing PDF extraction tools on their semantic abilities to extract the body texts from PDF … ☆66 Updated 4 years ago
- ☄️ Parallel and distributed training with spaCy and Ray ☆54 Updated last year
- Train a model, and detect gibberish strings with it. ☆61 Updated 3 years ago
- A browser user interface for manual labeling of record pairs. ☆47 Updated last year
- 🌸 Train floret vectors ☆18 Updated last year
- Fuzzy matching and more functionality for spaCy. ☆256 Updated 9 months ago
- A Flexible Deep Learning Approach to Fuzzy String Matching ☆145 Updated 6 months ago
- A machine learning tool for fishing entities ☆264 Updated 3 weeks ago
- Language detection using spaCy and fastText ☆55 Updated last year
- A Named-Entity Recogniser based on Grobid. ☆52 Updated 7 months ago
- Library for unit extraction - fork of quantulum for Python 3 ☆138 Updated 10 months ago
- Polyglot skipgram embeddings, and their many health benefits ☆12 Updated 5 years ago
- 🐍 Python binding for the Hora Approximate Nearest Neighbor Search Algorithm library ☆72 Updated 3 years ago
- spaCy entry points for Curated Transformers ☆29 Updated 6 months ago
- An efficient simhash implementation for Python ☆124 Updated 5 years ago
- German lemmatization with IWNLP as extension for spaCy ☆24 Updated last year