dell-research-harvard / linktransformer
A convenient way to link, deduplicate, aggregate and cluster data(frames) in Python using deep learning
☆127 Updated 5 months ago
Alternatives and similar repositories for linktransformer
Users who are interested in linktransformer are comparing it to the libraries listed below.
- Entity Matching Model solves the problem of matching company names between two possibly very large datasets. ☆77 Updated 6 months ago
- Tool for probabilistically linking the records of individual entities (e.g. people) within and across datasets ☆117 Updated 9 months ago
- Nesta's Skills Extractor Library ☆141 Updated 3 months ago
- Powerful topic model visualization in Python ☆133 Updated 6 months ago
- An End-to-End Evaluation Framework for Entity Resolution Systems ☆31 Updated last year
- Python package for text mining of time-series data ☆76 Updated 4 months ago
- Given a job title and job description, the algorithm assigns a standard occupational classification (SOC) code to the job. ☆74 Updated last year
- Tools for interactive visual exploration of semantic embeddings. ☆38 Updated last year
- ☆48 Updated this week
- Code for the paper "CAREER: Transfer Learning for Economic Prediction of Labor Sequence Data" ☆45 Updated last year
- Design, conduct and analyze results of AI-powered surveys and experiments. Simulate social science and market research with large numbers… ☆274 Updated this week
- Innovation across ages ☆71 Updated 2 years ago
- Concept Induction: Analyzing Unstructured Text with High-Level Concepts Using LLooM (CHI 2024 paper). LLooM automatically surfaces high-l… ☆125 Updated 3 months ago
- causal-falsify: A Python library with algorithms for falsifying the unconfoundedness assumption in a composite dataset from multiple sources. ☆35 Updated 3 weeks ago
- ☆93 Updated last year
- 🗺️ Data Cleaning and Textual Data Visualization 🗺️ ☆187 Updated 3 months ago
- A Python package to enrich Twitter Data ☆75 Updated 2 years ago
- Name matching is a Python package for the matching of company names. This package has been developed to match the names of companies from… ☆158 Updated last month
- Course repository for the session "Hands-on Transformers: Fine-Tune your own BERT and GPT" of the Data Science Summer School 2023 ☆88 Updated 2 years ago
- This repository contains the raw data, code, and sources used to create an individual level and state municipal incorporation date datase… ☆25 Updated 6 months ago
- ☆55 Updated last year
- Fast, flexible name matching for large datasets ☆72 Updated 3 weeks ago
- Prototype search engine for ONS bulletins ☆24 Updated last year
- List of entity resolution software and resources. ☆84 Updated 6 months ago
- ConfliBERT: A Pre-trained Language Model for Political Conflict and Violence (NAACL 2022) ☆38 Updated 3 weeks ago
- Embedding Vector Oriented Clustering ☆153 Updated 3 weeks ago
- Zero/few shot learning components for scikit-learn pipelines with LLMs and transformers. ☆18 Updated 9 months ago
- Google Trends, made easy. ☆113 Updated last year
- A BERT-based application for reusable text classification at scale ☆38 Updated 2 years ago
- ☆70 Updated last week