dell-research-harvard / linktransformer
A convenient way to link, deduplicate, aggregate and cluster data(frames) in Python using deep learning
☆133 Updated 2 months ago
Alternatives and similar repositories for linktransformer
Users who are interested in linktransformer are comparing it to the libraries listed below.
- Entity Matching Model solves the problem of matching company names between two possibly very large datasets. ☆89 Updated last month
- Tool for probabilistically linking the records of individual entities (e.g. people) within and across datasets ☆118 Updated 2 months ago
- An End-to-End Evaluation Framework for Entity Resolution Systems ☆36 Updated 2 years ago
- Powerful topic model visualization in Python ☆139 Updated 10 months ago
- Nesta's Skills Extractor Library ☆150 Updated 7 months ago
- Python package for text mining of time-series data ☆76 Updated 8 months ago
- ☆54 Updated last month
- Given a job title and job description, the algorithm assigns a standard occupational classification (SOC) code to the job. ☆74 Updated last year
- Zero/few shot learning components for scikit-learn pipelines with LLMs and transformers. ☆18 Updated last year
- Tools for interactive visual exploration of semantic embeddings. ☆41 Updated last year
- Embedding Vector Oriented Clustering ☆165 Updated this week
- Google Trends, made easy. ☆117 Updated last year
- Innovation across ages ☆72 Updated 2 years ago
- Concept Induction: Analyzing Unstructured Text with High-Level Concepts Using LLooM (CHI 2024 paper). LLooM automatically surfaces high-l… ☆147 Updated 7 months ago
- List of entity resolution software and resources. ☆103 Updated 10 months ago
- 🗺️ Data Cleaning and Textual Data Visualization 🗺️ ☆198 Updated 7 months ago
- Name matching is a Python package for the matching of company names. This package has been developed to match the names of companies from… ☆161 Updated last month
- Robust and fast topic models with sentence-transformers. ☆88 Updated last month
- Fast, flexible name matching for large datasets ☆71 Updated 4 months ago
- MoodCat😼 classifies the mood of English sentences. ☆14 Updated 3 years ago
- HDBSCAN Tuning for BERTopic Models ☆49 Updated 2 years ago
- Course repository for the session "Hands-on Transformers: Fine-Tune your own BERT and GPT" of the Data Science Summer School 2023 ☆89 Updated 2 years ago
- Interactive notebooks containing demonstration code of the splink library ☆40 Updated 2 years ago
- ConfliBERT: A Pre-trained Language Model for Political Conflict and Violence (NAACL 2022) ☆43 Updated 2 months ago
- The Harvard USPTO Patent Dataset ☆80 Updated 2 years ago
- A python package to enrich Twitter Data ☆75 Updated 2 years ago
- A Python client for the GDELT 2.0 Doc API ☆178 Updated 8 months ago
- code base for constructing narrative statements from text ☆116 Updated last week
- ☆71 Updated 8 months ago
- Select, weight and analyze complex sample data ☆72 Updated last week