cleanzr / fasthash
Performs unique entity estimation, following Chen, Shrivastava, and Steorts (2018).
☆14 · Updated 6 years ago
Alternatives and similar repositories for fasthash:
Users who are interested in fasthash are comparing it to the libraries listed below.
- Python wrapper for a C++ Double Metaphone ☆15 · Updated 2 years ago
- R package for Multisource Embeddings for Medical Records ☆17 · Updated 3 years ago
- ☆13 · Updated 6 years ago
- Perform Bayesian record linkage with a one-to-one matching assumption. ☆11 · Updated 4 years ago
- A maximum-strength name parser for record linkage. ☆37 · Updated this week
- Data Scientist code test ☆19 · Updated 4 years ago
- MPEDS Annotation Interface ☆18 · Updated 2 years ago
- A browser user interface for manual labeling of record pairs. ☆47 · Updated last year
- R tools to download, ingest, and analyze the Phoenix dataset from the Open Event Data Alliance ☆12 · Updated 8 years ago
- A utility for labeling clusters of text data. ☆28 · Updated 3 years ago
- This repo contains the draft, images, and code for the Medium blog post on altair themes. ☆12 · Updated 6 years ago
- Regular Expression Counts of Terms and Substrings ☆25 · Updated 3 years ago
- Compare accuracies of udpipe models and spacy models which can be used for NLP annotation ☆14 · Updated 7 years ago
- Distributed Bayesian Entity Resolution in Apache Spark ☆57 · Updated 3 years ago
- Do things with words. Scale them, mostly. ☆17 · Updated 3 years ago
- AWS IAM Client Package ☆14 · Updated 4 years ago
- Library to read a subset of Parquet files ☆43 · Updated 5 years ago
- Various functions to make bag-of-words approaches to text analysis more user-friendly ☆24 · Updated 8 years ago
- The privacy-preserving record linkage toolkit: a proof-of-concept public demo of next-gen data linkage techniques. ☆10 · Updated 11 months ago
- Encode Categorical Features (unmaintained) ☆32 · Updated 2 years ago
- Motivational website to do something special this month ☆21 · Updated last year
- qdapTools is an R package that contains tools associated with the qdap package that may be useful outside of the context of text analysis… ☆16 · Updated last year
- Visualisation for statistical models. ☆20 · Updated 6 years ago
- Datakit plugin to help manage Github integration on data projects. ☆12 · Updated 2 years ago
- A visual analysis tool for exploring multiverse outcomes ☆30 · Updated 3 years ago
- Slideshow template for Voilà based on RevealJS ☆16 · Updated 3 years ago
- R tools for GDELT and the Global Knowledge Graph ☆14 · Updated 11 years ago
- MoodCat😼 classifies the mood of English sentences. ☆14 · Updated 2 years ago
- Visualize uncertainty ☆28 · Updated 2 years ago
- Model verification, validation, and error analysis ☆58 · Updated last year