cleanzr / fasthash
Performs unique entity estimation as described in Chen, Shrivastava, and Steorts (2018).
☆14 · Updated 6 years ago
Alternatives and similar repositories for fasthash:
Users who are interested in fasthash are comparing it to the libraries listed below.
- Python wrapper for a C++ Double Metaphone ☆15 · Updated 2 years ago
- R package for Multisource Embeddings for Medical Records ☆17 · Updated 3 years ago
- MPEDS Annotation Interface ☆18 · Updated 2 years ago
- A maximum-strength name parser for record linkage. ☆36 · Updated last week
- Regular Expression Counts of Terms and Substrings ☆25 · Updated 3 years ago
- Do things with words. Scale them, mostly. ☆17 · Updated 3 years ago
- Easy, fast clustering of texts ☆18 · Updated 7 years ago
- Perform Bayesian record linkage with a one-to-one matching assumption. ☆11 · Updated 4 years ago
- qdapTools is an R package that contains tools associated with the qdap package that may be useful outside of the context of text analysis… ☆16 · Updated last year
- Summer School: Social Media and Big Data Research ☆13 · Updated 6 years ago
- A text processing pipeline for turning unstructured text data into hierarchical datasets ☆14 · Updated 4 years ago
- R tools to download, ingest, and analyze the Phoenix dataset from the Open Event Data Alliance ☆12 · Updated 8 years ago
- Compare accuracies of udpipe models and spacy models which can be used for NLP annotation ☆14 · Updated 7 years ago
- Various functions to make bag-of-words approaches to text analysis more user-friendly ☆24 · Updated 8 years ago
- Lecture Slides for Introduction to Data Science ☆25 · Updated 2 years ago
- ☆13 · Updated 5 years ago
- Encode Categorical Features (unmaintained) ☆32 · Updated 2 years ago
- Text Interchange Formats ☆37 · Updated last year
- R package to compute and visualize summary trees ☆34 · Updated 9 years ago
- Slides and resources for flexdashboard talk at UseR! 2016 ☆11 · Updated 8 years ago
- Selective Bayesian Forest Classifier - R package for simultaneous feature selection and classification. See paper: http://arxiv.org/abs/1… ☆16 · Updated 3 years ago
- R bindings to Apache Arrow ☆31 · Updated 6 years ago
- Classify names by gender, U.S. ethnicity, or leaf nationality ☆19 · Updated 6 years ago
- Implements the model described in "Identification, Interpretability, and Bayesian Word Embeddings" ☆18 · Updated 5 years ago
- MoodCat😼 classifies the mood of English sentences. ☆14 · Updated 2 years ago
- "Exploratory Data Analysis using Random Forests" ☆18 · Updated 9 years ago
- IWAAN - An interactive Jupyter Notebook collection that allows running analyses of Wikipedia article editing dynamics out-of-the-box on Bi… ☆9 · Updated 11 months ago
- Interactive Network Graph Visualization for NDTV-generated graphs using D3 animation ☆18 · Updated 9 years ago
- A collection of scripts for teaching and learning basic text mining methods in R ☆10 · Updated 6 years ago
- An R Package for Text Analysis ☆45 · Updated last year