Playground for PySpark (RDDs, DStreams) and Apache Airflow, based on an example of parsing web server log data, including incorrectly formatted strings.
☆18 Feb 21, 2022 Updated 4 years ago
Alternatives and similar repositories for Web-Server-Log-Analysis-PySpark
Users who are interested in Web-Server-Log-Analysis-PySpark are comparing it to the repositories listed below
- Classification problem to predict loan defaulters using Lending Club Dataset ☆11 Jan 26, 2019 Updated 7 years ago
- A web application which acts as an IoT device when loaded in a smartphone browser. The data from the sensors are then used for Anomaly d… ☆11 Feb 4, 2021 Updated 5 years ago
- ☆11 Feb 11, 2020 Updated 6 years ago
- ☆10 Apr 3, 2019 Updated 6 years ago
- A simple PHP toolbox to interact with the Microsoft Azure Search Service REST API. ☆11 Feb 2, 2023 Updated 3 years ago
- ☆13 Jun 19, 2018 Updated 7 years ago
- Series on TensorFlow, starting from the basics and working our way up to more complex models ☆18 Sep 1, 2018 Updated 7 years ago
- A simple web app for memorizing multiple choice answers ☆16 Mar 19, 2021 Updated 4 years ago
- Here's how to get DataQuest's Data Engineering Track missions' content to work on your localhost. Using data from my Valenbisi ARIMA mode… ☆17 Jul 17, 2018 Updated 7 years ago
- PySpark Spotify ETL ☆17 Aug 19, 2021 Updated 4 years ago
- Build data models, data warehouses, and data lakes; automate data pipelines; and work with massive datasets. ☆13 Jul 16, 2019 Updated 6 years ago
- ☆19 Feb 7, 2017 Updated 9 years ago
- Anomaly detection training suite ☆119 Nov 10, 2015 Updated 10 years ago
- A Python script that uses the Tweepy library to pull Tweets from Twitter's Streaming API, and then stores the important fields in a Mongo… ☆23 Aug 19, 2014 Updated 11 years ago
- ☆25 Oct 13, 2019 Updated 6 years ago
- Learning from multiple companies in Silicon Valley. Netflix, Facebook, Google, Startups ☆18 Sep 17, 2018 Updated 7 years ago
- ☆32 Oct 14, 2024 Updated last year
- Data Quest - Data Engineer Learning and Projects ☆24 May 29, 2019 Updated 6 years ago
- Project analyzing Airbnb Rental data ☆33 Dec 13, 2018 Updated 7 years ago
- This repo is for the LinkedIn Learning course: End-to-End Data Engineering Project ☆29 Nov 9, 2023 Updated 2 years ago
- A simple Spark TDD example ☆26 Sep 19, 2017 Updated 8 years ago
- Market Basket Analysis with Recommendation Algorithms & Shiny App Implementation of a Product Recommendation System for an Online Retaile… ☆32 Oct 29, 2019 Updated 6 years ago
- ☆32 Jul 18, 2018 Updated 7 years ago
- Apache Spark-based data flow (ETL) framework that supports multiple read and write destinations of different types and also supports multiple… ☆26 Jun 7, 2021 Updated 4 years ago
- WARNING: This repository is no longer maintained. This repository will not be updated. The repository will be kept available in read-onl… ☆30 May 10, 2021 Updated 4 years ago
- Project - Data Processing and Analysis in Python Course ☆39 Oct 10, 2018 Updated 7 years ago
- AnyAPI is a library that helps you to write any API wrapper with ease and in a Pythonic way. ☆131 Nov 7, 2021 Updated 4 years ago
- A series of Jupyter Notebooks that build a simple CNN model trained on the Fashion MNIST dataset ☆37 Apr 5, 2018 Updated 7 years ago
- Flask app to push/pull on Kafka over HTTP ☆41 Feb 27, 2015 Updated 11 years ago
- Live stream tweets based on keywords to database using SQLAlchemy. Tweets are assigned a sentiment score and data is presented via stream … ☆43 Nov 28, 2020 Updated 5 years ago
- A simple web crawler, using Abot, that indexes page contents into Azure Search. ☆49 Oct 7, 2024 Updated last year
- A highly customisable Intelligent Personal Assistant ☆43 Mar 12, 2019 Updated 6 years ago
- A collection of Medium posts ☆55 Oct 26, 2018 Updated 7 years ago
- Real-time social media data analytics with Apache Spark, Python, Kafka, Pandas, etc. ☆53 Aug 25, 2016 Updated 9 years ago
- End-to-end data engineering project ☆58 Oct 27, 2022 Updated 3 years ago
- This is the code for "Kaggle Challenge (LIVE)" by Siraj Raval on YouTube ☆65 Sep 21, 2018 Updated 7 years ago
- A sample app for automated phone surveys with Twilio, TwiML, Python and Django ☆55 Jul 13, 2023 Updated 2 years ago
- Examples on data science as reference to blog posts. ☆57 Oct 5, 2018 Updated 7 years ago
- Prediction of loan defaulters based on more than 5L records using Python, NumPy, pandas and XGBoost ☆66 Sep 4, 2022 Updated 3 years ago