cleanzr / fasthash
Performs unique entity estimation as described in Chen, Shrivastava, and Steorts (2018).
☆14Updated 5 years ago
Related projects:
- Python wrapper for a C++ Double Metaphone☆15Updated last year
- A maximum-strength name parser for record linkage.☆29Updated last month
- MPEDS Annotation Interface☆18Updated 2 years ago
- R package for Multisource Embeddings for Medical Records☆17Updated 3 years ago
- R tools to download, ingest, and analyze the Phoenix dataset from the Open Event Data Alliance☆12Updated 7 years ago
- A browser user interface for manual labeling of record pairs.☆41Updated last year
- Perform Bayesian record linkage with a one-to-one matching assumption.☆11Updated 4 years ago
- Egonet is a program for the collection and analysis of egocentric network data. It helps you create the questionnaire, collect data, and …☆22Updated 2 years ago
- Compare accuracies of udpipe models and spacy models which can be used for NLP annotation☆14Updated 6 years ago
- Various functions to make bag-of-words approaches to text analysis more user-friendly☆25Updated 7 years ago
- MoodCat😼 classifies the mood of English sentences.☆13Updated 2 years ago
- Regular Expression Counts of Terms and Substrings☆25Updated 2 years ago
- Visual analytics application for qualitative text analysis☆24Updated last year
- Encode Categorical Features (unmaintained)☆32Updated last year
- Do things with words. Scale them, mostly.☆17Updated 3 years ago
- Model verification, validation, and error analysis☆58Updated 8 months ago
- A text processing pipeline for turning unstructured text data into hierarchical datasets☆13Updated 4 years ago
- IWAAN - An interactive Jupyter Notebook collection that lets you run analyses of Wikipedia article editing dynamics out-of-the-box on Bi…☆9Updated 4 months ago
- Classify names by gender, U.S. ethnicity, or leaf nationality☆19Updated 5 years ago
- A method for estimating causal effects in time-series data. Uses available data to automatically find natural experiments for identifying…☆15Updated 4 years ago
- Programmatic interface to the Arxiv API☆60Updated 6 months ago
- Selective Bayesian Forest Classifier - R package for simultaneous feature selection and classification. See paper: http://arxiv.org/abs/1…☆16Updated 2 years ago
- Data Scientist code test☆19Updated 4 years ago
- DreamBank Visualized - An interactive visualization of over 26,000 dream transcriptions☆15Updated 6 years ago
- Text Interchange Formats☆35Updated 9 months ago
- An R Package for Text Analysis☆45Updated last year
- Interactive visualization for time series forecasts☆23Updated 3 years ago
- Demonstrates using esquisse to generate ggplot2 plots with drag and drop☆9Updated 5 years ago