A minimal, optimized Python package for deduplication that uses RapidFuzz internally to detect approximate duplicates within a dataset quickly and with minimal configuration.
☆18 · Apr 2, 2025 · Updated last year
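
As a rough illustration of what approximate duplicate detection involves, here is a minimal sketch. It does not use fast-dedupe or RapidFuzz, whose APIs are not shown on this page; it substitutes the standard library's `difflib.SequenceMatcher` as the similarity measure. The function name `dedupe`, the `threshold` parameter, and the sample data are all hypothetical.

```python
from difflib import SequenceMatcher


def dedupe(items, threshold=0.85):
    """Greedy approximate deduplication (illustrative only).

    Keeps the first occurrence of each group of near-identical strings
    and records later near-matches as duplicates of that kept string.
    `threshold` is an assumed similarity cutoff in the range 0..1.
    """
    unique = []      # representative strings, in first-seen order
    duplicates = {}  # representative -> list of near-duplicates
    for item in items:
        for kept in unique:
            # Case-insensitive similarity ratio between the two strings.
            score = SequenceMatcher(None, item.lower(), kept.lower()).ratio()
            if score >= threshold:
                duplicates.setdefault(kept, []).append(item)
                break
        else:
            # No kept string was similar enough, so this one is new.
            unique.append(item)
    return unique, duplicates


names = ["Apple iPhone 12", "Apple iPhone12", "Samsung Galaxy", "Samsung Galaxy"]
unique, duplicates = dedupe(names)
print(unique)
print(duplicates)
```

This pairwise loop is quadratic in the number of kept items, so it only suits small inputs; a library built on RapidFuzz would be expected to be considerably faster.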
Alternatives and similar repositories for fast-dedupe
Users that are interested in fast-dedupe are comparing it to the libraries listed below.
- Personal Knowledge Graph - User Memory and Personality from Digital Footprint ☆24 · Mar 12, 2026 · Updated last month
- synthetic data for ml ☆25 · Jan 30, 2025 · Updated last year
- Fast search index for SPLADE sparse retrieval models implemented in Python using Numpy and Numba