dell-research-harvard / linktransformer
A convenient way to link, deduplicate, aggregate and cluster data(frames) in Python using deep learning
☆118 Updated last month
Alternatives and similar repositories for linktransformer:
Users who are interested in linktransformer are comparing it to the libraries listed below.
- Entity Matching Model solves the problem of matching company names between two possibly very large datasets. ☆67 Updated 3 weeks ago
- Tool for probabilistically linking the records of individual entities (e.g. people) within and across datasets ☆111 Updated 3 months ago
- Innovation across ages ☆69 Updated 2 years ago
- ☆30 Updated 2 months ago
- Replication code for https://www.john-joseph-horton.com/papers/llm_ask.pdf ☆34 Updated last year
- Powerful topic model visualization in Python ☆118 Updated this week
- Tools for interactive visual exploration of semantic embeddings. ☆32 Updated 6 months ago
- ☆80 Updated 9 months ago
- code base for constructing narrative statements from text ☆107 Updated last year
- Code for measuring novelty in science using publication text ☆24 Updated 2 weeks ago
- ☆36 Updated 3 weeks ago
- Python package for text mining of time-series data ☆71 Updated 3 months ago
- A Flexible Deep Learning Approach to Fuzzy String Matching ☆144 Updated 5 months ago
- Fast, flexible name matching for large datasets ☆71 Updated last year
- Given a job title and job description, the algorithm assigns a standard occupational classification (SOC) code to the job. ☆73 Updated 8 months ago
- Nesta's Skills Extractor Library ☆129 Updated 4 months ago
- PatentSBERTa: A Deep NLP based Hybrid Model for Patent Distance and Classification using Augmented SBERT ☆81 Updated 4 months ago
- Design, conduct and analyze results of AI-powered surveys and experiments. Simulate social science and market research with large numbers… ☆225 Updated this week
- difference-in-differences in Python ☆100 Updated last year
- Noise-robust de-duplication at scale ☆18 Updated last year
- An End-to-End Evaluation Framework for Entity Resolution Systems ☆27 Updated last year
- A BERT-based application for reusable text classification at scale ☆38 Updated last year
- The official GitHub for the American Stories dataset as in {link} ☆115 Updated last year
- Partition selection, point estimation, pointwise and uniform inference, and graphical procedures using binscatter methods. ☆41 Updated 2 months ago
- A Python package to enrich Twitter Data ☆75 Updated last year
- Python libraries to call UN Comtrade APIs ☆71 Updated 11 months ago
- ☆54 Updated last year
- Google Trends, made easy. ☆104 Updated 9 months ago
- Code for the paper "CAREER: Transfer Learning for Economic Prediction of Labor Sequence Data" ☆39 Updated 9 months ago
- Embedding Vector Oriented Clustering ☆133 Updated 3 weeks ago