holdenk / diversity-analytics
Analytics on Apache Projects for Diversity
☆18 Updated 5 years ago
Alternatives and similar repositories for diversity-analytics:
Users who are interested in diversity-analytics are comparing it to the libraries listed below.
- A simple introduction to using spark ml pipelines ☆26 Updated 7 years ago
- Machine Learning Pipeline Stages for Spark (exposed in Scala/Java + Python) ☆74 Updated last year
- A simple example of containerized data science with python and Docker. ☆51 Updated 7 years ago
- A couple projects using scikit-learn illustrating project decision making. ☆15 Updated 8 years ago
- Sharing interesting and noteworthy Data Engineering content ☆67 Updated 8 years ago
- Some notebook examples related to Apache Spark, IPython / Jupyter, Zeppelin ☆52 Updated 8 years ago
- ☆26 Updated last year
- Natural Language Processing with Spark's MLlib ☆62 Updated 7 years ago
- An example PySpark project with pytest ☆17 Updated 7 years ago
- ☕⛵WIP PySpark dependency management ☆22 Updated 6 years ago
- Sample repo for luigi tasks & config ☆36 Updated 8 years ago
- Spoken dialogue querying for SQL databases. ☆37 Updated 8 years ago
- A short guide for transitioning from Python to Scala ☆65 Updated 9 years ago
- My talk at Strata 2014 in Santa Clara, CA ☆73 Updated 11 years ago
- Data and code for "Fast Data Applications with Spark and Python" ☆25 Updated 8 years ago
- Supporting materials/code examples for my course in data engineering for machine learning. ☆38 Updated 2 years ago
- Materials for Strata NYC 2016 scikit-learn tutorial ☆15 Updated 8 years ago
- Supporting content (slides and exercises) for the Addison-Wesley (Pearson) video series covering best practices for developing scalable S… ☆66 Updated 9 years ago
- CLI tool to launch Spark jobs on AWS EMR ☆67 Updated last year
- PySpark phonetic and string matching algorithms ☆39 Updated last year
- Complete Pipeline Training at Big Data Scala By the Bay ☆71 Updated 9 years ago
- ☆16 Updated 4 years ago
- Conversion utility from Zeppelin notes to Jupyter notebooks. ☆44 Updated 5 years ago
- 💥 Browser-based slides or PDFs of our talks and presentations ☆94 Updated 6 years ago
- ☆11 Updated 6 years ago
- Spark Tutorial at the University of Maryland ☆38 Updated 10 years ago
- Simple Spark example of generating table stats for use of data quality checks ☆28 Updated 7 years ago
- Analyze the structure and dynamics of an open source project's developer community, using graph algorithms, etc. ☆58 Updated 4 years ago
- Pydata Seattle 2015 Trend Estimation in Time Series Signals Deck + Notebooks ☆21 Updated 9 years ago
- Snippets of code used in blog posts and other media. ☆13 Updated this week