iwpnd / flashgeotext
Extract city and country mentions from text, like GeoText, but using FlashText (an Aho-Corasick implementation) instead of regex.
★ 60 · Updated this week
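To make the "keyword lookup instead of regex" idea concrete, here is a minimal sketch in plain Python. It is not flashgeotext's or FlashText's actual code or API: the names `build_trie` and `extract` and the tiny gazetteer are hypothetical, and the scan is a simplified longest-match trie walk rather than a full Aho-Corasick automaton with failure links.

```python
from collections import defaultdict

_END = "_end_"  # marker key holding the canonical name where a keyword ends


def build_trie(gazetteer):
    """Build a character trie from {canonical_name: [synonyms]} (hypothetical layout)."""
    trie = {}
    for canonical, synonyms in gazetteer.items():
        for name in synonyms:
            node = trie
            for ch in name.lower():
                node = node.setdefault(ch, {})
            node[_END] = canonical
    return trie


def extract(text, trie):
    """Scan the text left to right and return {canonical_name: count}.

    At each word start, walk the trie as far as the text allows and keep
    the longest keyword that ends on a word boundary.
    """
    counts = defaultdict(int)
    lowered = text.lower()
    i, n = 0, len(lowered)
    while i < n:
        # Only start matching at the beginning of a word.
        if not lowered[i].isalnum() or (i > 0 and lowered[i - 1].isalnum()):
            i += 1
            continue
        node, j, best = trie, i, None
        while j < n and lowered[j] in node:
            node = node[lowered[j]]
            j += 1
            at_boundary = j == n or not lowered[j].isalnum()
            if _END in node and at_boundary:
                best = (node[_END], j)
        if best:
            counts[best[0]] += 1
            i = best[1]  # jump past the matched keyword
        else:
            i += 1
    return dict(counts)


# Illustrative data only; a real gazetteer would hold many thousands of names.
gazetteer = {
    "Berlin": ["Berlin"],
    "New York": ["New York", "NYC"],
    "York": ["York"],
    "Germany": ["Germany", "Deutschland"],
}
text = "Berlin is in Germany. NYC, also called New York, is not Berliner."
result = extract(text, build_trie(gazetteer))
```

Because the lookup cost depends on the length of the text rather than on the number of place names, this style of matching stays fast as the gazetteer grows, which is the motivation the project description gives for avoiding regex.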
Related projects
Alternatives and complementary repositories for flashgeotext
- Extract place names from a URL or text, and add context to those names -- for example distinguishing between a country, region or city. ★ 123 · Updated 7 months ago
- 🧬 A VS Code extension for annotating data with Prodigy ★ 30 · Updated 2 years ago
- Language detection using Spacy and Fasttext ★ 54 · Updated 10 months ago
- Lossless in-memory compression of pandas DataFrames and Series powered by the visions type system. Up to 10x less RAM needed for the same… ★ 28 · Updated last year
- A browser user interface for manual labeling of record pairs. ★ 41 · Updated last year
- ★ 70 · Updated last year
- Python package for deduplication/entity resolution using active learning ★ 79 · Updated 2 months ago
- An End-to-End Evaluation Framework for Entity Resolution Systems ★ 25 · Updated 11 months ago
- ★ 29 · Updated 2 years ago
- Parallel and distributed training with spaCy and Ray ★ 54 · Updated last year
- Python based Wikidata framework for easy dataframe extraction ★ 39 · Updated 11 months ago
- A spaCy wrapper of Entity-Fishing (component) for named entity disambiguation and linking on Wikidata ★ 152 · Updated 2 years ago
- Bag of, not words, but tricks! ★ 68 · Updated last year
- spaCy match and replace, maintaining conjugation ★ 34 · Updated last year
- Dataframe Integration with spaCy. ★ 101 · Updated 3 years ago
- Scalable String Similarity Joins in Python ★ 39 · Updated 3 months ago
- A spaCy wrapper of OpenTapioca for named entity linking on Wikidata ★ 91 · Updated last year
- Annotation Management for Prodigy that supports multiple users working in many projects ★ 15 · Updated 5 years ago
- Generate reports for spaCy models. ★ 28 · Updated 2 years ago
- Extract networks of entities from journalistic reporting ★ 47 · Updated last year
- A maximum-strength name parser for record linkage. ★ 32 · Updated 3 months ago
- An open-source package for python to clean raw text data ★ 69 · Updated last year
- ★ 66 · Updated 2 years ago
- A comprehensive and scalable set of string tokenizers and similarity measures in Python ★ 137 · Updated 3 months ago
- A small python library that can clump lists of data together. ★ 147 · Updated 2 years ago
- It's a cooler way to store simple linear models. ★ 28 · Updated 3 months ago
- spaCy pipeline component for generating spaCy KnowledgeBase Alias Candidates for Entity Linking ★ 84 · Updated 2 years ago
- List of entity resolution software and resources. ★ 35 · Updated 8 months ago
- Create a Geonames gazetteer index in Elasticsearch ★ 74 · Updated last year