manuparra / starting-bigdata-aws
☆25 Updated 4 years ago
Alternatives and similar repositories for starting-bigdata-aws
Users who are interested in starting-bigdata-aws are comparing it to the repositories listed below
- Labs and data files for a full-day Spark workshop ☆24 Updated 8 months ago
- Tweet Analysis with Spark ☆15 Updated 8 years ago
- Ingest tweets with Kafka. Use Spark to track popular hashtags and trendsetters for each hashtag ☆29 Updated 9 years ago
- Code examples and docker environment for Spark ☆28 Updated 9 years ago
- A workshop about implementing graph theory with Neo4j ☆77 Updated 8 years ago
- Apache Spark docker container image (Standalone mode) ☆35 Updated 5 years ago
- matching between unstructured and structured data sets ☆14 Updated 7 years ago
- Lighthouse is a library for data lakes built on top of Apache Spark. It provides high-level APIs in Scala to streamline data pipelines an… ☆62 Updated last year
- An umbrella project for multiple implementations of model serving ☆45 Updated 8 years ago
- Use Kafka and Apache Spark streaming to perform click stream analytics ☆76 Updated 5 years ago
- ☆19 Updated 8 years ago
- Real-world Spark pipelines examples ☆83 Updated 7 years ago
- RedRock - Mobile Application prototype using Apache Spark, Twitter and Elasticsearch ☆14 Updated 7 years ago
- ☆10 Updated 3 years ago
- Code from the book Machine Learning Systems ☆145 Updated 7 years ago
- Weekly Data Engineering Newsletter ☆96 Updated last year
- The purpose of this tiny project is to put things together with the know-how that I learned from the course big data expert from formacio… ☆64 Updated 7 years ago
- Repository of Notebooks taken from https://neo4j.com/graph-algorithms-book/ ☆26 Updated 5 years ago
- Friendly ML feature store ☆45 Updated 3 years ago
- Flink stream filtering examples ☆19 Updated 9 years ago
- Data quality control tool built on spark and deequ ☆25 Updated last week
- ☆35 Updated 9 years ago
- Making Machine Learning Simple and Scalable with Python, Jupyter Notebook, TensorFlow, Keras, Apache Kafka and KSQL ☆97 Updated 7 years ago
- Data validation library for PySpark 3.0.0 ☆33 Updated 3 years ago
- Dynamic Distributed Dimensional Data Model ☆42 Updated last year
- Large-scale Graph Mining with Spark ☆39 Updated 7 years ago
- Cheatsheet for Spark DataFrame ☆91 Updated 6 years ago
- Flowchart for debugging Spark applications ☆106 Updated last year
- Repository used for Spark Trainings ☆54 Updated 2 years ago
- MLflow App Library ☆77 Updated 7 years ago