ksbg / sparklanes
A lightweight data processing framework for Apache Spark
☆16 · Updated 2 years ago
Alternatives and similar repositories for sparklanes:
Users who are interested in sparklanes are comparing it to the libraries listed below.
- A four-day course on Python, the Scientific Python stack and PySpark, adapted from a training course I gave to one of our clients in Dece… ☆10 · Updated 9 years ago
- Scaffold of Apache Airflow executing Docker containers ☆85 · Updated 2 years ago
- A curated list of all the awesome examples, articles, tutorials and videos for Apache Airflow. ☆96 · Updated 4 years ago
- A simple introduction to using Spark ML pipelines ☆26 · Updated 7 years ago
- A simple Spark TDD example ☆26 · Updated 7 years ago
- ☆19 · Updated 4 years ago
- Code for my presentation: Using PySpark to Process Boat Loads of Data ☆20 · Updated 7 years ago
- Model management example using Polyaxon, Argo and Seldon ☆23 · Updated 6 years ago
- PySpark sample scripts ☆17 · Updated 6 years ago
- A curated list of articles, papers and tools for managing the building and deploying of machine learning models, aka machine learning eng… ☆18 · Updated 6 years ago
- Simple demonstration of how to build a complex real-time machine learning visualization tool. ☆16 · Updated 9 years ago
- ☆25 · Updated 6 years ago
- Airflow training for the crunch conf ☆105 · Updated 6 years ago
- Various data stream/batch processing demos with Apache Spark (Scala) 🚀 ☆11 · Updated 5 years ago
- Updated repository ☆157 · Updated 3 years ago
- Real-time report dashboard with Apache Kafka, Apache Spark Streaming and Node.js ☆50 · Updated last year
- Repository used for Spark Trainings ☆53 · Updated 2 years ago
- ELT Code for your Data Warehouse ☆26 · Updated last year
- Using Luigi to create a Machine Learning Pipeline using the Rossmann Sales data from Kaggle ☆33 · Updated 8 years ago
- Ingest tweets with Kafka. Use Spark to track popular hashtags and trendsetters for each hashtag ☆29 · Updated 9 years ago
- PySpark phonetic and string matching algorithms ☆39 · Updated last year
- Analysis of City Of Chicago Taxi Trip Dataset Using AWS EMR, Spark, PySpark, Zeppelin and Airbnb's Superset ☆15 · Updated 7 years ago
- Data validation library for PySpark 3.0.0 ☆33 · Updated 2 years ago
- Asynchronous actions for PySpark ☆47 · Updated 3 years ago
- Blog post on ETL pipelines with Airflow ☆23 · Updated 4 years ago
- Real-time social media data analytics with Apache Spark, Python, Kafka, Pandas, etc. ☆51 · Updated 8 years ago
- Just a boilerplate for PySpark and Flask ☆35 · Updated 6 years ago
- ☆11 · Updated 6 years ago
- PyConDE & PyData Berlin 2019 Airflow Workshop: Airflow for machine learning pipelines. ☆47 · Updated last year
- Code Repository for the EVO-ODAS ☆31 · Updated 7 years ago