Scalable String Similarity Joins in Python
☆38 · Jul 12, 2024 · Updated last year
Alternatives and similar repositories for py_stringsimjoin
Users that are interested in py_stringsimjoin are comparing it to the libraries listed below.
- A comprehensive and scalable set of string tokenizers and similarity measures in Python ☆144 · Feb 18, 2026 · Updated 3 months ago
- ☆192 · May 29, 2024 · Updated 2 years ago
- Hidden alignment conditional random field for classifying string pairs. ☆36 · Sep 6, 2017 · Updated 8 years ago
- Python package for performing Entity and Text Matching using Deep Learning. ☆619 · Jun 18, 2024 · Updated last year
- Learned string similarity for entity names using optimal transport. ☆35 · Nov 17, 2020 · Updated 5 years ago
- Code for extracting data from a large number of PDFs, particularly FCC political ad documents ☆15 · Oct 26, 2017 · Updated 8 years ago
- Build-to-Order BLAS ☆12 · Apr 9, 2019 · Updated 7 years ago
- Approximate and vectorized versions of common mathematical functions ☆13 · Mar 1, 2017 · Updated 9 years ago
- generic extraction recipes to get you started extracting schema.org entities for your software, data, and all things ☆14 · Apr 6, 2019 · Updated 7 years ago
- ☆16 · Jan 7, 2021 · Updated 5 years ago
- This is the implementation of word aligner using Hidden Markov Model ☆10 · Jun 24, 2019 · Updated 6 years ago
- The Ethereum Canvas ☆10 · Oct 19, 2017 · Updated 8 years ago
- Code to reproduce experiments appearing in the academic paper Lost Relatives of the Gumbel Trick ☆17 · Jun 14, 2017 · Updated 9 years ago
- This repository contains the code and data download links to reproduce the experiments of the PVLDB paper "Dual-Objective Fine-Tuning of … ☆16 · Jun 7, 2021 · Updated 5 years ago
- Suite of tools for game developers building on MUD ☆12 · Mar 13, 2024 · Updated 2 years ago
- Python tools to build, do inference with, and learn undirected graphical models. ☆14 · Mar 25, 2019 · Updated 7 years ago
- Optimally-weighted herding is Bayesian Quadrature ☆17 · Jul 8, 2016 · Updated 9 years ago
- Dyna built on R-exprs (First Prototype) ☆17 · Mar 7, 2022 · Updated 4 years ago
- A GitBook about creating a GitBook for teaching ☆10 · Apr 21, 2020 · Updated 6 years ago
- linear-time dynamic programming dependency parser ☆11 · Feb 2, 2019 · Updated 7 years ago
- Visualizations of character embeddings from derived character vectors. ☆13 · Apr 4, 2017 · Updated 9 years ago
- A Deep Learning based project for colorizing and restoring old images (and video!) ☆23 · Jul 21, 2020 · Updated 5 years ago
- Geopandas and Shapely ☆10 · Jul 29, 2018 · Updated 7 years ago
- ☆14 · Feb 11, 2026 · Updated 4 months ago
- ☆24 · May 5, 2026 · Updated last month
- A list of free data matching and record linkage software. ☆406 · Feb 21, 2024 · Updated 2 years ago
- A powerful and modular toolkit for record linkage and duplicate detection in Python ☆1,052 · Feb 21, 2024 · Updated 2 years ago
- Supplementary code for "Name2Vec: Personal Names Embeddings" presented at The Canadian Conference on AI 2019. ☆18 · Jun 25, 2020 · Updated 5 years ago
- A ChatGPT plugin for Solana ☆13 · Jun 1, 2023 · Updated 3 years ago
- A CoroutineExecutor for asyncio, similar to nurseries and task groups ☆13 · Aug 20, 2022 · Updated 3 years ago
- A bunch of tools for automating parts of a Systematic Review of scientific literature ☆14 · Sep 16, 2020 · Updated 5 years ago
- Functional interface for concurrent futures, including async coroutines. ☆11 · Apr 14, 2026 · Updated last month
- Make your research data and code FAIR with the UU FAIR Cheatsheets! ☆17 · Apr 10, 2024 · Updated 2 years ago
- Split a JSON file with hierarchical data to multiple CSV files ☆28 · Apr 9, 2023 · Updated 3 years ago
- A curated list of awesome Citizen Science Projects in the Netherlands ☆20 · May 4, 2021 · Updated 5 years ago
- Template repo for a GCP-hosted REST API with automatic API versioning and custom domain mapping ☆17 · Jul 25, 2023 · Updated 2 years ago
- JSON Schema Validation for the Soccer Common Data Format ☆16 · Mar 19, 2026 · Updated 2 months ago
- Code to create and visualize demographic clusters for the US with data from the American Community Survey ☆25 · Mar 28, 2022 · Updated 4 years ago
- R package for handling, checking and enforcing data rules ☆22 · Nov 28, 2025 · Updated 6 months ago