Data and code for "Fast Data Applications with Spark and Python"
☆25 · Sep 11, 2016 · Updated 9 years ago
Alternatives and similar repositories for spark-workshop
Users who are interested in spark-workshop are comparing it to the libraries listed below.
- Play with the Spark, Spark Streaming, and DataFrame APIs. ☆12 · Jun 26, 2015 · Updated 10 years ago
- Minimum Entropy is a DDL-hosted question/answer site for beginners who need answers to Data Science questions. ☆16 · Jul 11, 2016 · Updated 9 years ago
- Code and Notebooks for the Natural Language Processing with Python course. ☆64 · Dec 3, 2017 · Updated 8 years ago
- Graph extraction and NLP analysis for Baleen Corpora ☆18 · Sep 8, 2016 · Updated 9 years ago
- An Apache Spark standalone application using the Spark API in Scala. The application uses Simple Build Tool (SBT) for building the project… ☆30 · May 31, 2016 · Updated 9 years ago
- High Level Kafka Scanner ☆19 · Sep 29, 2017 · Updated 8 years ago
- Dirichlet process mixture model (DPMM) for datamicroscopes ☆14 · Oct 9, 2015 · Updated 10 years ago
- A web application that identifies party in political discourse and an example of operationalized machine learning. ☆29 · Aug 17, 2018 · Updated 7 years ago
- Coding exercises for Apache Spark ☆104 · Jun 4, 2015 · Updated 10 years ago
- Workshop: Python for Data Science ☆64 · Nov 24, 2014 · Updated 11 years ago
- Tribe extracts a network from an email mbox and writes it to a graphml file for visualization and analysis. ☆81 · Apr 15, 2023 · Updated 3 years ago
- Parallel Genomic Analysis Toolkit ☆14 · Feb 11, 2019 · Updated 7 years ago
- Amazon access control challenge ☆25 · Jun 21, 2014 · Updated 11 years ago
- Code for the "Burn CPU, burn" competition at Kaggle. Uses Extreme Learning Machines and hyperopt. ☆33 · Jun 25, 2014 · Updated 11 years ago
- AWS, Vagrant, and Spark ☆21 · Nov 10, 2015 · Updated 10 years ago
- Language Modeling with Sum-Product Networks ☆20 · Jul 29, 2014 · Updated 11 years ago
- Code to accompany the paper "k-Stochastic Neighbor Embeddings for Supervised and Unsupervised Learning, ICML 2013". ☆27 · Jun 8, 2016 · Updated 9 years ago
- Source code for 'Pro Hadoop Data Analytics' by Kerry Koitzsch ☆14 · Jul 6, 2023 · Updated 2 years ago
- Multidimensional data explorer and visualization tool. ☆55 · May 23, 2017 · Updated 8 years ago
- Creating user interfaces for data science with Jupyter widgets ☆11 · Oct 28, 2017 · Updated 8 years ago
- rddapp: Regression Discontinuity Design Application ☆11 · Sep 2, 2025 · Updated 8 months ago
- Solution code from my winning submission to Kaggle's PyCon 2015 competition ☆55 · Apr 9, 2015 · Updated 11 years ago
- A Python CLI game and library for Tic-tac-toe. ☆10 · Apr 4, 2017 · Updated 9 years ago
- A follower to litable ☆15 · Jul 22, 2016 · Updated 9 years ago
- ☆48 · May 11, 2016 · Updated 9 years ago
- Spark example of collecting tweets and loading into HDFS/S3 ☆42 · Oct 2, 2013 · Updated 12 years ago
- Cayley-Dickson algebra implementation in Python ☆12 · Jan 3, 2019 · Updated 7 years ago
- My collection of resources related to R ☆18 · Feb 4, 2020 · Updated 6 years ago
- Generating the next read for our book club - with Data Science! ☆39 · Feb 6, 2016 · Updated 10 years ago
- Do you even science, bro? Using RNNs to predict scientific titles. ☆14 · Jun 5, 2017 · Updated 8 years ago
- My dotfiles ☆12 · Apr 21, 2026 · Updated last week
- Solution to the Higgs Boson Machine Learning Challenge on Kaggle ☆32 · Sep 16, 2014 · Updated 11 years ago
- Public Presentations ☆24 · Apr 13, 2025 · Updated last year
- Ecological Niche Modelling Using Deep Learning ☆17 · Jul 1, 2019 · Updated 6 years ago
- Makes waffle plot (GitHub style) of Reviewer's activity. ☆10 · Oct 6, 2017 · Updated 8 years ago
- Adaptive replacement cache ☆36 · Mar 13, 2018 · Updated 8 years ago
- ☆18 · Sep 7, 2014 · Updated 11 years ago
- Code for Max-Margin Deep Generative Models ☆12 · Jan 1, 2015 · Updated 11 years ago
- Some Python to process the Wikileaks Cablegate data. ☆19 · Nov 30, 2010 · Updated 15 years ago