A code-based tutorial for production-level data streaming with PySpark plus Optimus for data cleaning, Confluent Kafka, and Apache Drill, using Docker and Cassandra (NoSQL DB) for storage. This allows for fast feature engineering and data cleaning.
☆28Jul 8, 2019Updated 6 years ago
Alternatives and similar repositories for PySpark-Confluent-Kafka-Apache-Drill-
Users that are interested in PySpark-Confluent-Kafka-Apache-Drill- are comparing it to the libraries listed below.
- A reproducible containerized environment with CUDA X, Anaconda, TensorFlow-GPU, Keras-GPU, Dask, and PyCUDA.☆24Aug 30, 2021Updated 4 years ago
- Udacity Data Engineer Nano Degree - Project-3 (Data Warehouse)☆22Jun 20, 2019Updated 6 years ago
- ☆13Sep 3, 2020Updated 5 years ago
- Machine Learning for Industrial IoT Applications: Predict how long a part will work before performance degrades. Perfect for 5G cell phone…☆39Aug 30, 2021Updated 4 years ago
- Apache Spark Structured Streaming with Kafka using Python (PySpark)☆40May 16, 2019Updated 7 years ago
- Deploy a Flask-based microservice (along with Postgres and React) to a Kubernetes cluster☆18May 6, 2021Updated 5 years ago
- A simple POC app on Django framework☆11Feb 14, 2019Updated 7 years ago
- Collection of Databricks and Jupyter Notebooks☆22Feb 9, 2026Updated 4 months ago
- Kafka-connect telegram connector☆16Nov 21, 2025Updated 6 months ago
- Version 1 of Habaneras de Lino is an online ecommerce. This repo contains the backend API of the website using Django and Django Rest Fram…☆13Dec 16, 2022Updated 3 years ago
- MongoDB Change Streams and Kafka Example Application☆14Nov 16, 2017Updated 8 years ago
- Projects from Udacity Data Streaming Nanodegree☆15Aug 14, 2023Updated 2 years ago
- Master LaTeX typesetting in 10 hours☆12Jun 22, 2018Updated 7 years ago
- Analytics tool that applies Natural Language Processing (NLP) and Machine Learning (ML), such as concept extraction, idea classification,…☆10Dec 7, 2022Updated 3 years ago
- Slides from my talk on spaCy IRL, regarding sparse attention.☆12Jul 9, 2019Updated 6 years ago
- StyleGAN - Official TensorFlow Implementation☆11Jun 2, 2019Updated 7 years ago
- Python library for deploying models built using Python to Alteryx Promote.☆15Dec 10, 2021Updated 4 years ago
- Article for Special Edition of Information: Machine Learning with Python☆14Jan 8, 2025Updated last year
- ☆37Jul 8, 2019Updated 6 years ago
- Automate claim approval in personal insurance sector.☆20Apr 21, 2016Updated 10 years ago
- ☆16Jun 18, 2025Updated 11 months ago
- This repository focuses on providing interview scenario questions that I have encountered during interviews. The questions are designed t…☆52Feb 11, 2025Updated last year
- ☕⛵WIP PySpark dependency management☆22Jul 8, 2018Updated 7 years ago
- Starter Code (R and Python) for all CSV data sets of opendata.swiss☆14Feb 22, 2026Updated 3 months ago
- WARNING: This repository is no longer maintained ⚠️ This repository will not be updated.☆12May 31, 2022Updated 4 years ago
- AWS Big Data Certification☆25Mar 26, 2026Updated 2 months ago
- 7th place code at NFL Big Data Bowl☆12Jan 8, 2020Updated 6 years ago
- Adaptive Machine Learning for Credit Card Fraud Detection☆37Sep 4, 2017Updated 8 years ago
- DevOps for AI project using Azure Databricks, Azure DevOps and Azure Machine Learning Service☆15Jul 21, 2021Updated 4 years ago
- Recency, Frequency, and Monetary are three behavioral attributes and are quite simple, in that they can be easily computed for any databa…☆15Nov 20, 2025Updated 6 months ago
- Insurance Claim Prediction using Machine Learning - Udacity Nanodegree Capstone Project☆16Nov 1, 2016Updated 9 years ago
- This container is no longer supported, and has been deprecated in favor of: https://github.com/joehoeller/NVIDIA-GPU-Tensor-Core-Accelera…☆45Aug 30, 2021Updated 4 years ago
- A react-typescript component for Plotly.JS graphs.☆15Feb 29, 2020Updated 6 years ago
- Study Guide for AWS Big Data Specialty Certification☆19May 27, 2019Updated 7 years ago
- Classifying malignant and benign tumors using Neural Networks 🔬☆18Jun 4, 2021Updated 5 years ago
- ☆31Apr 10, 2023Updated 3 years ago
- R package 2013 google trend☆15Jan 5, 2015Updated 11 years ago
- QuasiModo: Assessing viral genomic analysis methods on HCMV strain mixture☆12Sep 22, 2022Updated 3 years ago
- A collection of my NLP projects☆19Aug 26, 2019Updated 6 years ago