manuparra / starting-bigdata-aws
☆25 Updated 4 years ago
Alternatives and similar repositories for starting-bigdata-aws
Users who are interested in starting-bigdata-aws are comparing it to the repositories listed below.
- Tweet Analysis with Spark ☆15 Updated 8 years ago
- Labs and data files for a full-day Spark workshop ☆24 Updated 8 months ago
- Ingest tweets with Kafka. Use Spark to track popular hashtags and trendsetters for each hashtag ☆29 Updated 9 years ago
- Apache Spark docker container image (Standalone mode) ☆35 Updated 5 years ago
- Real-world Spark pipelines examples ☆83 Updated 7 years ago
- Code examples and docker environment for Spark ☆28 Updated 9 years ago
- Reference Graph Gists ☆45 Updated 5 years ago
- Making Machine Learning Simple and Scalable with Python, Jupyter Notebook, TensorFlow, Keras, Apache Kafka and KSQL ☆97 Updated 7 years ago
- A tutorial on how to get started with Presto. ☆55 Updated 4 years ago
- Use Kafka and Apache Spark streaming to perform click stream analytics ☆76 Updated 5 years ago
- [ARCHIVED] Moved to github.com/NVIDIA/spark-xgboost-examples ☆72 Updated 5 years ago
- Instant search for and access to many datasets in Pyspark. ☆34 Updated 3 years ago
- PySpark phonetic and string matching algorithms ☆41 Updated last year
- How to do data science with Optimus, Spark and Python. ☆19 Updated 6 years ago
- Big Data Demystified meetup and blog examples ☆31 Updated last year
- ☆19 Updated 8 years ago
- Code from the book Machine Learning Systems ☆145 Updated 7 years ago
- Realtime social media data analytics with Apache Spark, Python, Kafka, Pandas, etc ☆53 Updated 9 years ago
- A simple introduction to using spark ml pipelines ☆26 Updated 7 years ago
- Weekly Data Engineering Newsletter ☆96 Updated last year
- 🚨 Simple, self-contained fraud detection system built with Apache Kafka and Python ☆89 Updated 6 years ago
- Code and setup information for Introduction to Machine Learning with Spark ☆12 Updated 10 years ago
- An example PySpark project with pytest ☆18 Updated 8 years ago
- Code for my presentation: Using PySpark to Process Boat Loads of Data ☆20 Updated 8 years ago
- Various Demos mostly based on docker environments ☆33 Updated 3 years ago
- Debussy is an opinionated Data Architecture and Engineering framework, enabling data analysts and engineers to build better platforms and… ☆28 Updated 2 years ago
- Mastering Spark for Data Science, published by Packt ☆49 Updated 3 years ago
- Analysis of City Of Chicago Taxi Trip Dataset Using AWS EMR, Spark, PySpark, Zeppelin and Airbnb's Superset ☆15 Updated 8 years ago
- ☆12 Updated 7 years ago
- Powerful rapid automatic EDA and feature engineering library with a very easy to use API 🌟 ☆52 Updated 4 years ago