hybridtheory / floc-simhash
A fast Python implementation of the SimHash algorithm.
☆27, updated 4 years ago
Alternatives and similar repositories for floc-simhash
Users who are interested in floc-simhash compare it to the libraries listed below.
- A spaCy wrapper of Entity-Fishing (component) for named entity disambiguation and linking on Wikidata (☆169, updated 3 years ago)
- Additional lookup tables and data resources for spaCy (☆113, updated 6 months ago)
- ☆69, updated 3 years ago
- A machine learning tool for fishing entities (☆266, updated 6 months ago)
- Fuzzy matching and more functionality for spaCy. (☆259, updated last year)
- A spaCy wrapper of OpenTapioca for named entity linking on Wikidata (☆95, updated 2 years ago)
- Sentence transformers models for spaCy (☆109, updated 2 years ago)
- Python3 bindings for the Compact Language Detector v3 (CLD3) (☆155, updated 2 years ago)
- Information extraction from English and German texts based on predicate logic (☆139, updated 2 years ago)
- Language detection using spaCy and fastText (☆57, updated 2 years ago)
- This repository contains an easy and intuitive approach to few-shot classification using sentence-transformers or spaCy models, or zero-s… (☆220, updated 11 months ago)
- A comprehensive and scalable set of string tokenizers and similarity measures in Python (☆142, updated last year)
- Dataframe Integration with spaCy. (☆103, updated 4 years ago)
- Blazing fast topic modelling for short texts. (☆34, updated 2 months ago)
- Extract city and country mentions from text, like GeoText, but using FlashText (an Aho-Corasick implementation) instead of regex. (☆62, updated this week)
- Abydos NLP/IR library for Python (☆193, updated 3 years ago)
- A toolkit for CDX indices such as Common Crawl and the Internet Archive's Wayback Machine (☆194, updated 3 weeks ago)
- Lightning Fast Language Prediction (☆167, updated 3 months ago)
- A Flexible Deep Learning Approach to Fuzzy String Matching (☆149, updated last year)
- ☆70, updated 3 years ago
- Text tokenization and sentence segmentation (segtok v2) (☆208, updated 3 years ago)
- DaCy: The State of the Art Danish NLP pipeline using spaCy (☆99, updated 11 months ago)
- Google USE (Universal Sentence Encoder) for spaCy (☆184, updated 2 years ago)
- Recon NER: debug and correct annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality … (☆106, updated last year)
- fastText + Bloom embeddings for compact, full-coverage vectors with spaCy (☆329, updated 7 months ago)
- ☆30, updated 3 years ago
- Asent is a Python library for performing efficient and transparent sentiment analysis using spaCy. (☆120, updated 2 months ago)
- Parallel and distributed training with spaCy and Ray (☆56, updated 2 years ago)
- Coreference resolution for English, French, German and Polish, optimised for limited training data and easily extensible for further lang… (☆131, updated last year)
- A spaCy wrapper for DBpedia Spotlight (☆112, updated 2 years ago)