A code-based tutorial for production-level data streaming with PySpark plus Optimus for data cleaning, Confluent Kafka, and Apache Drill, using Docker and Cassandra (NoSQL DB) for storage. This allows for fast feature engineering and data cleaning.
☆28Jul 8, 2019Updated 6 years ago
Alternatives and similar repositories for PySpark-Confluent-Kafka-Apache-Drill-
Users that are interested in PySpark-Confluent-Kafka-Apache-Drill- are comparing it to the libraries listed below.
- ☆13Sep 3, 2020Updated 5 years ago
- Stripe Payment Gateway integration in Django☆10May 24, 2021Updated 5 years ago
- Collection of Databricks and Jupyter Notebooks☆22Feb 9, 2026Updated 4 months ago
- Version 1 of Habaneras de Lino is an online ecommerce site. This repo contains the backend API of the website using Django and Django Rest Fram…☆13Dec 16, 2022Updated 3 years ago
- In this work, we compared the predictive capabilities of six different machine learning algorithms - linear regression, random forest, ex…☆16Sep 21, 2020Updated 5 years ago
- Projects from Udacity Data Streaming Nanodegree☆15Aug 14, 2023Updated 2 years ago
- Kaggle Human Protein Atlas Image Classification 73rd solution☆19Jan 14, 2019Updated 7 years ago
- This repository is archived. Please navigate to: https://github.com/IBM/watson-machine-learning-samples☆39Sep 3, 2020Updated 5 years ago
- Food Ordering Management System PHP & MySQL Project☆12Dec 9, 2019Updated 6 years ago
- Article for Special Edition of Information: Machine Learning with Python☆14Jan 8, 2025Updated last year
- This repo is for building Docker containers for RStudio, PostgreSQL, Hadoop, Spark, etc.☆22May 12, 2021Updated 5 years ago
- ☕⛵WIP PySpark dependency management☆22Jul 8, 2018Updated 7 years ago
- A dead-simple digital menu board display and configuration, written in Python.☆25Apr 28, 2026Updated 2 months ago
- AWS Big Data Certification☆25Mar 26, 2026Updated 3 months ago
- Adaptive Machine Learning for Credit Card Fraud Detection☆37Sep 4, 2017Updated 8 years ago
- Code repository for Big Data Analytics with R, published by Packt☆27Mar 2, 2026Updated 3 months ago
- Insurance Claim Prediction using Machine Learning - Udacity Nanodegree Capstone Project☆16Nov 1, 2016Updated 9 years ago
- Silver and Bronze medal solutions to the Kaggle challenges on Google Landmark Dataset☆18Jun 9, 2019Updated 7 years ago
- QuasiModo: Assessing viral genomic analysis methods on HCMV strain mixture☆12Sep 22, 2022Updated 3 years ago
- ☆10Feb 14, 2019Updated 7 years ago
- Set up an automated data science environment using Docker☆14Oct 2, 2018Updated 7 years ago
- MeatPy☆34Jun 22, 2026Updated last week
- Stock closing and opening forecasting using a deep neural network and LSTM (technical indicators included)☆19Oct 22, 2017Updated 8 years ago
- Web UI for running/managing/monitoring of ML jobs☆15May 19, 2019Updated 7 years ago
- Presentations & notebooks from our talks /workshops/meetups/etc☆24Mar 23, 2018Updated 8 years ago
- Developing a Lambda Architecture pipeline using Apache Kafka, Spark Structured Streaming, Redshift, S3, Python☆22Mar 8, 2020Updated 6 years ago
- My Go Solution for Leetcode☆21May 5, 2020Updated 6 years ago
- A machine learning algorithm written to predict severity of insurance claim☆19Nov 14, 2016Updated 9 years ago
- Python package for Feature-based Forecast Model Averaging (FFORMA).☆11Jul 6, 2023Updated 2 years ago
- Install micromamba, and optionally create a base conda environment.☆10Apr 5, 2025Updated last year
- Stencila for Python☆17Aug 3, 2018Updated 7 years ago
- ☆14Nov 8, 2024Updated last year
- A collection of complete apps aiming to demonstrate intermediate to advanced usage of new Shiny features.☆25May 14, 2021Updated 5 years ago
- 📈 EDA and ML on Health Insurance Claims☆17Aug 17, 2018Updated 7 years ago
- An online food ordering system can be defined as software that allows restaurant businesses to accept and manage orders placed over the i…☆33May 8, 2022Updated 4 years ago
- Alternative CLI tool and Go package for NodeMCU-based modules.☆14Jan 14, 2021Updated 5 years ago
- ☆12May 5, 2021Updated 5 years ago
- Parameter Optimization Functions for 'simmer'☆15Dec 22, 2022Updated 3 years ago
- Quora Kaggle Competition: Natural Language Processing using word2vec embeddings, scikit-learn and xgboost for training☆18Jan 13, 2019Updated 7 years ago