trevorprater / serf
Stanford Entity-Resolution Framework
☆23 · Updated 6 years ago
Related projects:
- SmallK: very fast data clustering tools ☆14 · Updated 5 years ago
- variations of the record linkage model of Steorts et al. AISTATS 2014's "SMERED: A Bayesian Approach to Graphical Record Linkage and De-d… ☆27 · Updated 7 years ago
- ☆20 · Updated 7 years ago
- Vizlinc ☆14 · Updated 8 years ago
- Distributed Bayesian Entity Resolution in Apache Spark ☆57 · Updated 3 years ago
- Algorithms for "schema matching" ☆25 · Updated 8 years ago
- A Generalized Data Cleaning System ☆47 · Updated 8 years ago
- Pattern-of-Behavior Search Tool ☆11 · Updated 2 years ago
- ☆20 · Updated 7 years ago
- Collection of some algorithms for entity resolution ☆28 · Updated 9 years ago
- Library for Geo-Inferencing in Twitter Data ☆28 · Updated 8 years ago
- open source version of the Bonsai library ☆26 · Updated 8 years ago
- MetroMaps Release ☆16 · Updated 10 years ago
- System for mining Wikipedia Usage data to read our collective mind ☆21 · Updated 9 years ago
- ☆38 · Updated 8 years ago
- A book on the applications of topic models. ☆14 · Updated 7 years ago
- Solr Dictionary Annotator (Microservice for Spark) ☆70 · Updated 4 years ago
- This project contains the code to translate between Apache Spark and SFrame. ☆21 · Updated 8 years ago
- Binding the GDELT universe in a Spark environment ☆22 · Updated last year
- Elwha is a Java application for monitoring topics, sentiment and events on Twitter streams with the ability to generate notification mess… ☆14 · Updated 9 years ago
- Compiler for writing DeepDive applications in a Datalog-like language. ⚠️🚧🛑 REPO MOVED TO DEEPDIVE 👇🏿 ☆19 · Updated 7 years ago
- R tools for GDELT and the Global Knowledge Graph ☆14 · Updated 10 years ago
- [hibernating] Dynamic topic models ☆39 · Updated 9 years ago
- Tutorial code and data for the entity resolution workshops. ☆45 · Updated 9 years ago
- DBpedia Distributed Extraction Framework: Extract structured data from Wikipedia in a parallel, distributed manner ☆41 · Updated 2 years ago
- Fork of the Freely Extensible Biomedical Record Linkage program ☆23 · Updated 7 years ago
- Using Word2Vec on lists and sets ☆34 · Updated 8 years ago
- Code for Sentiment Analysis Symposium tutorial demos. ☆15 · Updated 7 years ago
- code and slides for my PyGotham 2016 talk, "Higher-level Natural Language Processing with textacy" ☆15 · Updated 8 years ago
- Library for building reproducible data pipelines to support experimentation ☆20 · Updated 8 years ago