Welcome to Snowman App – a Data Matching Benchmark Platform.
☆38Feb 9, 2023Updated 3 years ago
Alternatives and similar repositories for snowman
Users who are interested in snowman are comparing it to the libraries listed below.
- Detect dominant periodicity in equidistant time series☆23Updated this week
- This repository provides the implementation of several well-known IND discovery algorithms☆13Nov 5, 2019Updated 6 years ago
- The dataset for the paper "Machamp: A Generalized Entity Matching Benchmark" published in CIKM 2021☆21Oct 18, 2021Updated 4 years ago
- LEMON: Explainable Entity Matching☆19Apr 6, 2022Updated 4 years ago
- Code for the paper "Rotom: A Meta-Learned Data Augmentation Framework for Entity Matching, Data Cleaning, Text Classification, and Beyond…☆24May 31, 2022Updated 4 years ago
- Open-source scraper for social network analysis. Creates nodes with edges that you can visualize in editors like Gephi.☆11Dec 2, 2025Updated 6 months ago
- Lab tasks for the course on "Data Engineering for Machine Learning"☆10May 1, 2023Updated 3 years ago
- A .NET library to work with Electronic Product Codes (EPC, SSCC, SGTIN)☆12Jun 25, 2020Updated 5 years ago
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution.☆67Updated this week
- Minoan ER is an Entity Resolution (ER) framework, built by researchers in Crete (the land of the ancient Minoan civilization). Entity res…☆18Nov 18, 2020Updated 5 years ago
- Federal Cloud Computing Strategy Website☆15Oct 6, 2022Updated 3 years ago
- An open source, high scalability toolkit in Java for Entity Resolution.☆224Jul 12, 2025Updated 11 months ago
- JedAI-WebApp is a GUI that facilitates the execution of JedAI. JedAI is an open source, high scalability toolkit that offers out-of-the-b…☆26Apr 14, 2023Updated 3 years ago
- Resources for tackling record linkage / deduplication / data matching problems☆127Feb 22, 2024Updated 2 years ago
- ☆192May 29, 2024Updated 2 years ago
- FairPrep is a design and evaluation framework for fairness-enhancing interventions that treats data as a first-class citizen.☆11Mar 24, 2023Updated 3 years ago
- Navigating around a grid of cells like XPath for spreadsheets; supports Python 3.5+☆49Feb 1, 2023Updated 3 years ago
- Repository for performing Blocking using Deep Learning based on the paper "Deep Learning for Blocking in Entity Matching: A Design Space …☆30Apr 5, 2023Updated 3 years ago
- Ensime integration with Sublime Text 2 for Scala development☆139Jul 8, 2015Updated 10 years ago
- Clustering documents based on LSH☆14Apr 20, 2016Updated 10 years ago
- E-commerce crawler: a Scrapy project for collecting product pictures and information.☆11May 14, 2024Updated 2 years ago
- Datasets for Hyperparameter Optimization of Neural Machine Translation☆10Aug 19, 2024Updated last year
- Iocaine2 Tool for FFXI☆10May 9, 2022Updated 4 years ago
- A web crawler that uses Firefox and js injection to interact with webpages and crawl their content, written in nodejs.☆23Aug 24, 2023Updated 2 years ago
- A Generalized Data Cleaning System☆52Apr 28, 2016Updated 10 years ago
- A list of free data matching and record linkage software.☆406Feb 21, 2024Updated 2 years ago
- PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolut…☆161Nov 18, 2022Updated 3 years ago
- ☆15Dec 28, 2023Updated 2 years ago
- TeamViewer QuickSupport Integration for .net applications☆11Jan 20, 2022Updated 4 years ago
- coloring terminal text with intensities (used for plotting probability, entropy with tokens)☆12Oct 11, 2024Updated last year
- Pattern-based table discovery in Open Data CSV files☆25Dec 8, 2022Updated 3 years ago
- [VLDB 2024] Source code for FusionQuery: On-demand Fusion Queries over Multi-source Heterogeneous Data☆11Mar 11, 2025Updated last year
- Unofficial implementation of the paper "OpenTag: Open Attribute Value Extraction from Product Profiles"☆33Aug 22, 2018Updated 7 years ago
- High-level Rust library that binds to Poppler to extract text from a PDF☆11Dec 16, 2020Updated 5 years ago
- Source code for several Metanome data profiling algorithms☆58May 15, 2023Updated 3 years ago
- Framework for visualizing the output from any text-based command-line utility☆25Mar 14, 2024Updated 2 years ago
- Code for the paper "CollaborEM: A Self-supervised Entity Matching Framework Using Multi-features Collaboration". TKDE 2021.☆41Jul 12, 2022Updated 3 years ago
- ☆13Jan 1, 2024Updated 2 years ago
- LIDA: Lightweight Interactive Dialogue Annotator (in EMNLP 2019)☆10Oct 18, 2021Updated 4 years ago