kerighan / eldar
Boolean text search in Python
☆45 · Updated 2 years ago
Alternatives and similar repositories for eldar
Users who are interested in eldar are comparing it to the libraries listed below:
- ☆54 · Updated last year
- RaKUn 2.0 - A fast keyword detection algorithm · ☆66 · Updated 2 months ago
- Python package for deduplication/entity resolution using active learning · ☆78 · Updated 7 months ago
- Fast and robust date extraction from web pages, with Python or on the command-line · ☆124 · Updated 3 months ago
- Generate a SQLite database from Wikipedia & Wikidata dumps. · ☆33 · Updated last year
- Asent is a Python library for performing efficient and transparent sentiment analysis using spaCy. · ☆118 · Updated last year
- Tokenization across languages. Useful as preprocessing for subword tokenization. · ☆22 · Updated 2 years ago
- A collection of network-related Python utilities. · ☆16 · Updated last year
- A Python package to simulate typographical errors. · ☆33 · Updated last year
- Efficient Trie-based regex unions for blacklist/whitelist filtering and one-pass mapping-based string replacing · ☆70 · Updated 2 months ago
- Tools for interactive visual exploration of semantic embeddings. · ☆32 · Updated 7 months ago
- An open-source package for Python to clean raw text data · ☆69 · Updated last year
- An experimental implementation of Burrows's Delta in Python 3 · ☆21 · Updated 3 years ago
- Embedding Vector Oriented Clustering · ☆134 · Updated last week
- Package to extract connotation frames · ☆84 · Updated last year
- Augmenty is an augmentation library based on spaCy for augmenting texts. · ☆153 · Updated 10 months ago
- spaCy match and replace, maintaining conjugation · ☆35 · Updated 2 years ago
- HDBSCAN Tuning for BERTopic Models · ☆45 · Updated last year
- 🔤 Measure edit distance based on keyboard layout · ☆60 · Updated last year
- Robust and fast topic models with sentence-transformers. · ☆48 · Updated last week
- In the wild extraction of entities that are found using Flair and displayed using a very elegant front-end. · ☆71 · Updated 2 years ago
- Extract networks of entities from journalistic reporting · ☆48 · Updated last year
- 🔢 Work with static vector models · ☆27 · Updated 2 months ago
- Coreference resolution for English, French, German and Polish, optimised for limited training data and easily extensible for further lang… · ☆122 · Updated 11 months ago
- Blazing fast fuzzy text search for Python. · ☆44 · Updated 2 months ago
- Source code and data for Like a Good Nearest Neighbor · ☆28 · Updated 3 months ago
- 💥 Use Hugging Face text and token classification pipelines directly in spaCy · ☆63 · Updated last year
- Clean, filter and sample URLs to optimize data collection – Python & command-line – Deduplication, spam, content and language filters · ☆136 · Updated 3 months ago
- Faster, modernized fork of the language identification tool langid.py · ☆55 · Updated 4 months ago
- ☆69 · Updated 3 years ago