Scalable String Similarity Joins in Python
☆39 · Jul 12, 2024 · Updated last year
Alternatives and similar repositories for py_stringsimjoin
Users that are interested in py_stringsimjoin are comparing it to the libraries listed below.
- A comprehensive and scalable set of string tokenizers and similarity measures in Python ☆144 · Feb 18, 2026 · Updated last month
- ☆192 · May 29, 2024 · Updated last year
- Hidden alignment conditional random field for classifying string pairs. ☆36 · Sep 6, 2017 · Updated 8 years ago
- Uses NLP methods to parse and classify contracts from The City of New Orleans ☆10 · Mar 23, 2015 · Updated 11 years ago
- A browser user interface for manual labeling of record pairs. ☆48 · Jun 23, 2023 · Updated 2 years ago
- Implementation of Shake-Shake by chainer (Shake-Shake regularization of 3-branch residual networks: https://openreview.net/forum?id=HkO-P… ☆10 · Aug 24, 2017 · Updated 8 years ago
- Asynchronous financial data management ☆22 · Oct 3, 2017 · Updated 8 years ago
- Approximate and vectorized versions of common mathematical functions ☆13 · Mar 1, 2017 · Updated 9 years ago
- A Rete-based, CLIPS-clone, inference engine in Python. ☆19 · Feb 4, 2013 · Updated 13 years ago
- generic extraction recipes to get you started extracting schema.org entities for your software, data, and all things ☆14 · Apr 6, 2019 · Updated 7 years ago
- ☆16 · Mar 4, 2026 · Updated last month
- This is the implementation of word aligner using Hidden Markov Model ☆10 · Jun 24, 2019 · Updated 6 years ago
- Chu-Lui-Edmonds decoding extracted from TurboParser ☆14 · May 16, 2017 · Updated 8 years ago
- Machine Learning Deployment for Kubernetes ☆19 · Dec 7, 2023 · Updated 2 years ago
- Data Scientist code test ☆19 · Jul 2, 2020 · Updated 5 years ago
- Code to reproduce experiments appearing in the academic paper Lost Relatives of the Gumbel Trick ☆17 · Jun 14, 2017 · Updated 8 years ago
- ☆13 · Dec 8, 2022 · Updated 3 years ago
- Suite of tools for game developers building on MUD ☆12 · Mar 13, 2024 · Updated 2 years ago
- Grapheme to phoneme toolkit using joint-modelling + CRFs in java ☆14 · Jul 14, 2018 · Updated 7 years ago
- Automated data extraction from U.S. state Comprehensive Annual Financial Reports (CAFR). ☆16 · Feb 27, 2022 · Updated 4 years ago
- A Python package for efficient evaluation based on OASIS (Optimal Asymptotic Sequential Importance Sampling). ☆15 · Jun 4, 2021 · Updated 4 years ago
- FlexMatcher is a schema matching package in Python which handles the problem of matching multiple schemas to a single mediated schema. ☆30 · Dec 6, 2024 · Updated last year
- A fast implementation of GloVe, with optional retrofitting ☆12 · Apr 16, 2019 · Updated 6 years ago
- A pipeline for automated mapping of aggregate racial/ancestral groups - based on a 1976 map of Chicago ☆21 · Oct 17, 2017 · Updated 8 years ago
- Learning to Prune: Exploring the Frontier of Fast and Accurate Parsing ☆22 · Sep 24, 2024 · Updated last year
- linear-time dynamic programming dependency parser ☆11 · Feb 2, 2019 · Updated 7 years ago
- Jupyter notebook on Gumbel-max and Gumbel-softmax tricks ☆40 · Nov 11, 2022 · Updated 3 years ago
- Visualizations of character embeddings from derived character vectors. ☆13 · Apr 4, 2017 · Updated 9 years ago
- Geopandas and Shapely ☆10 · Jul 29, 2018 · Updated 7 years ago
- A list of free data matching and record linkage software. ☆403 · Feb 21, 2024 · Updated 2 years ago
- ☆25 · Aug 20, 2025 · Updated 7 months ago
- A powerful and modular toolkit for record linkage and duplicate detection in Python ☆1,049 · Feb 21, 2024 · Updated 2 years ago
- A scheduler to manage a multi tool dual arm robot while avoiding arm-to-arm collisions; considering complex side constraints; and optimiz… ☆11 · Jul 6, 2021 · Updated 4 years ago
- A collection of Python scripts ☆12 · Feb 7, 2020 · Updated 6 years ago
- Supplementary code for "Name2Vec: Personal Names Embeddings" presented at The Canadian Conference on AI 2019. ☆18 · Jun 25, 2020 · Updated 5 years ago
- A CoroutineExecutor for asyncio, similar to nurseries and task groups ☆13 · Aug 20, 2022 · Updated 3 years ago
- Entitypedia is an Extended Named Entity Dictionary from Wikipedia. ☆13 · Dec 7, 2022 · Updated 3 years ago
- A javascript implementation of limited-memory BFGS ☆26 · May 25, 2017 · Updated 8 years ago
- Bart vs. Homer recognition task to spot and fix data leakage. ☆25 · Nov 22, 2018 · Updated 7 years ago