Data and code for "Fast Data Applications with Spark and Python"
☆25 · Sep 11, 2016 · Updated 9 years ago
Alternatives and similar repositories for spark-workshop
Users that are interested in spark-workshop are comparing it to the libraries listed below.
- Code & Data for Introduction to Machine Learning with Scikit-Learn ☆81 · Sep 7, 2018 · Updated 7 years ago
- Minimum Entropy is a DDL hosted question/answer site for beginners who need answers to Data Science questions. ☆16 · Jul 11, 2016 · Updated 9 years ago
- Code and Notebooks for the Natural Language Processing with Python course. ☆64 · Dec 3, 2017 · Updated 8 years ago
- Graph extraction and NLP analysis for Baleen Corpora ☆18 · Sep 8, 2016 · Updated 9 years ago
- An Apache Spark standalone application using the Spark API in Scala. The application uses Simple Build Tool (SBT) for building the project… ☆30 · May 31, 2016 · Updated 10 years ago
- Tool for visualizing the estimate of number of TNC (Uber and Lyft) pickups and dropoffs in San Francisco, by location and by time of day. ☆18 · Apr 28, 2022 · Updated 4 years ago
- High Level Kafka Scanner ☆19 · Sep 29, 2017 · Updated 8 years ago
- Dirichlet process mixture model (DPMM) for datamicroscopes ☆14 · Oct 9, 2015 · Updated 10 years ago
- A web application that identifies party in political discourse and an example of operationalized machine learning. ☆29 · Aug 17, 2018 · Updated 7 years ago
- Code & data for Fast data processing with Spark V2 ☆14 · Feb 1, 2015 · Updated 11 years ago
- Coding exercises for Apache Spark ☆103 · Jun 4, 2015 · Updated 11 years ago
- Workshop: Python for Data Science ☆64 · Nov 24, 2014 · Updated 11 years ago
- Tribe extracts a network from an email mbox and writes it to a graphml file for visualization and analysis. ☆81 · Apr 15, 2023 · Updated 3 years ago
- Amazon access control challenge ☆25 · Jun 21, 2014 · Updated 11 years ago
- Spark 2.0 Python Machine Learning examples ☆99 · Oct 7, 2019 · Updated 6 years ago
- Code for the "Burn CPU, burn" competition at Kaggle. Uses Extreme Learning Machines and hyperopt. ☆33 · Jun 25, 2014 · Updated 11 years ago
- Assignments of CS100.1x, Introduction to Big Data with Apache Spark ☆18 · Jun 29, 2015 · Updated 10 years ago
- Scripts to Analyze Pronto's Data Release ☆22 · Nov 12, 2015 · Updated 10 years ago
- Language Modeling with Sum-Product Networks ☆20 · Jul 29, 2014 · Updated 11 years ago
- Code to accompany the paper "k-Stochastic Neighbor Embeddings for Supervised and Unsupervised Learning, ICML 2013". ☆27 · Jun 8, 2016 · Updated 10 years ago
- Awk-like tool using python ☆11 · Aug 4, 2020 · Updated 5 years ago
- Source code for 'Pro Hadoop Data Analytics' by Kerry Koitzsch ☆14 · Jul 6, 2023 · Updated 2 years ago
- Tutorial on parsing Enron email to Avro and then explore the email set using Spark. ☆52 · Mar 25, 2026 · Updated 2 months ago
- ☆41 · Jul 24, 2015 · Updated 10 years ago
- Assignments of CS190.1x, Scalable Machine Learning ☆18 · Aug 2, 2015 · Updated 10 years ago
- Multidimensional data explorer and visualization tool. ☆55 · May 23, 2017 · Updated 9 years ago
- python utilities for Open Civic Data ☆38 · Nov 11, 2024 · Updated last year
- Repo for Pivotal samples ☆35 · Mar 24, 2022 · Updated 4 years ago
- Solution code from my winning submission to Kaggle's PyCon 2015 competition ☆55 · Apr 9, 2015 · Updated 11 years ago
- A Python CLI game and library for Tic-tac-toe. ☆10 · Apr 4, 2017 · Updated 9 years ago
- Tabula Rasa Tic-Tac-Toe ☆10 · Jan 3, 2019 · Updated 7 years ago
- A follower to litable ☆15 · Jul 22, 2016 · Updated 9 years ago
- A simple command line tool that will automatically invest all cash that becomes available ☆47 · Apr 19, 2016 · Updated 10 years ago
- Cayley Dickson algebra implementation in python ☆12 · Jan 3, 2019 · Updated 7 years ago
- A Bot For EMS ☆12 · Mar 19, 2015 · Updated 11 years ago
- Oracle Data Science Bootcamp 2014 ☆24 · Apr 8, 2015 · Updated 11 years ago
- Do you even science, bro? Using RNN's to predict scientific titles. ☆14 · Jun 5, 2017 · Updated 9 years ago
- Solution to the Higgs Boson Machine Learning Challenge on Kaggle ☆32 · Sep 16, 2014 · Updated 11 years ago
- rddapp: Regression Discontinuity Design Application ☆12 · Sep 2, 2025 · Updated 9 months ago