dell-research-harvard / linktransformer
A convenient way to link, deduplicate, aggregate and cluster data(frames) in Python using deep learning
☆121 Updated 2 months ago
Alternatives and similar repositories for linktransformer
Users who are interested in linktransformer are comparing it to the libraries listed below.
- Entity Matching Model solves the problem of matching company names between two possibly very large datasets. ☆72 Updated 4 months ago
- Innovation across ages ☆70 Updated 2 years ago
- Tool for probabilistically linking the records of individual entities (e.g. people) within and across datasets ☆115 Updated 6 months ago
- Code for measuring novelty in science using publication text ☆29 Updated 3 months ago
- ☆32 Updated 2 months ago
- ☆43 Updated last week
- An End-to-End Evaluation Framework for Entity Resolution Systems ☆29 Updated last year
- Name matching is a Python package for the matching of company names. This package has been developed to match the names of companies from… ☆155 Updated last month
- Evaluation and benchmarking of PatentsView disambiguation algorithms ☆13 Updated last year
- Nesta's Skills Extractor Library ☆139 Updated 2 weeks ago
- Given a job title and job description, the algorithm assigns a standard occupational classification (SOC) code to the job. ☆74 Updated last year
- Code for the paper "CAREER: Transfer Learning for Economic Prediction of Labor Sequence Data" ☆43 Updated last year
- code base for constructing narrative statements from text ☆110 Updated last year
- Noise-robust de-duplication at scale ☆20 Updated 2 years ago
- A tutorial on entity resolution (record linkage or de-duplication) ☆63 Updated 4 years ago
- Text-Based Ideal Points ☆44 Updated 2 years ago
- This repository contains code and extensive prompt examples to reproduce and extend the experiments in our papers "Using ChatGPT for Enti… ☆56 Updated 8 months ago
- Python package for text mining of time-series data ☆73 Updated last month
- A python package to enrich Twitter Data ☆75 Updated 2 years ago
- An open-source library that leverages Python’s data science ecosystem to build powerful end-to-end Entity Resolution workflows. ☆78 Updated last month
- Fast, flexible name matching for large datasets ☆72 Updated last month
- A shared repository for data cleaning scripts used for innovation data. ☆33 Updated 4 years ago
- Select, weight and analyze complex sample data ☆67 Updated last month
- This repository contains the raw data, code, and sources used to create an individual level and state municipal incorporation date datase… ☆24 Updated 3 months ago
- A Flexible Deep Learning Approach to Fuzzy String Matching ☆144 Updated 8 months ago
- Google Trends, made easy. ☆110 Updated last year
- The Harvard USPTO Patent Dataset ☆68 Updated last year
- ConfliBERT: A Pre-trained Language Model for Political Conflict and Violence (NAACL 2022) ☆34 Updated 2 months ago
- ☆22 Updated 2 years ago
- Powerful topic model visualization in Python ☆125 Updated 3 months ago