Playground for PySpark (RDDs, DStreams) and Apache Airflow, based on the example of parsing web server log data (including incorrectly formatted strings).
☆18 · Feb 21, 2022 · Updated 4 years ago
Alternatives and similar repositories for Web-Server-Log-Analysis-PySpark
Users who are interested in Web-Server-Log-Analysis-PySpark are comparing it to the libraries listed below.
- Data warehouse implementation for an e-commerce website “Infibeam” that sells digital and consumer electronics. ☆22 · Jan 28, 2018 · Updated 8 years ago
- A complete search engine experience built on top of a 75 GB Wikipedia corpus with subsecond latency for searches. Results contain wiki page… ☆19 · Oct 16, 2019 · Updated 6 years ago
- Hands-on: Udemy course, Advanced SQL: MySQL Data Analytics & Business Intelligence, Maven Analytics. Requirements - MySQL Workbench ☆20 · Sep 13, 2023 · Updated 2 years ago
- Project submission for data engineering zoomcamp 2023 - https://github.com/DataTalksClub/data-engineering-zoomcamp ☆10 · Apr 27, 2023 · Updated 2 years ago
- Classification problem to predict loan defaulters using Lending Club Dataset ☆11 · Jan 26, 2019 · Updated 7 years ago
- A simple PHP toolbox to interact with the Microsoft Azure Search Service REST API. ☆11 · Feb 2, 2023 · Updated 3 years ago
- End-to-end data engineering pipeline with various technologies to ingest real-time data. ☆25 · Nov 3, 2023 · Updated 2 years ago
- Learning from multiple companies in Silicon Valley. Netflix, Facebook, Google, Startups ☆18 · Sep 17, 2018 · Updated 7 years ago
- Data Engineering Project at Insight ☆15 · Nov 17, 2015 · Updated 10 years ago
- ☆10 · Apr 3, 2019 · Updated 7 years ago
- Deep learning model of depression detection from activity sensor data ☆14 · Dec 17, 2021 · Updated 4 years ago
- 📚🧪 Traffic Sentinel is a learning-focused POC that explores a scalable IoT architecture using Fog nodes and Apache Flink to process 📷 … ☆28 · Dec 29, 2025 · Updated 3 months ago
- Flask-based web application for predicting the income of a person ☆13 · Dec 23, 2018 · Updated 7 years ago
- Used Yolact++ to implement social distance monitoring. ☆18 · Feb 16, 2023 · Updated 3 years ago
- This repository contains different ways to do sentiment analysis ☆20 · Sep 17, 2022 · Updated 3 years ago
- This project consists of advanced phishing detection using the BERT masked language model. ☆28 · Jan 31, 2024 · Updated 2 years ago
- Case Studies and Projects in Machine Learning/EDA/DL ☆24 · Jun 18, 2024 · Updated last year
- This repo is for the LinkedIn Learning course: Complete Guide to SQL for Data Engineering: from Beginner to Advanced ☆46 · Mar 20, 2025 · Updated last year
- A Data Visualization project on the French traffic accidents database ☆19 · Aug 27, 2019 · Updated 6 years ago
- ☆13 · Jun 19, 2018 · Updated 7 years ago
- Scrape Facebook post comments with all comment replies ☆15 · Mar 20, 2021 · Updated 5 years ago
- The code for the paper "AnomalyLLM: Few-shot Anomaly Edge Detection for Dynamic Graphs using Large Language Models" ☆23 · Aug 28, 2024 · Updated last year
- This repository applies Deep Learning techniques for depression detection in text, using LSTM, GRU, BiLSTM, BERT models, and a baseline F… ☆19 · Jul 14, 2023 · Updated 2 years ago
- This repository contains data analysis on Black Friday sales data using various regression ML algorithms ☆21 · Apr 8, 2025 · Updated last year
- ☆19 · Feb 7, 2017 · Updated 9 years ago
- Here's how to get DataQuest's Data Engineering Track missions' content to work on your localhost. Using data from my Valenbisi ARIMA mode… ☆17 · Jul 17, 2018 · Updated 7 years ago
- A simple webapp for memorizing multiple choice answers ☆16 · Mar 19, 2021 · Updated 5 years ago
- iTASK - Intelligent Traffic Analysis Software Kit ☆29 · Dec 8, 2022 · Updated 3 years ago
- RealTime StockStream is a streamlined, simulation system for processing live stock market data. It uses Apache Kafka for data input, Apac… ☆31 · Feb 18, 2025 · Updated last year
- ☆30 · Jan 17, 2023 · Updated 3 years ago
- Data visualisations in Power BI ☆31 · Nov 14, 2021 · Updated 4 years ago
- Build data models, data warehouses, and data lakes; automate data pipelines; and work with massive datasets. ☆12 · Jul 16, 2019 · Updated 6 years ago
- A project to detect accidents and send notifications to hospitals whenever an accident happens. ☆20 · Mar 22, 2023 · Updated 3 years ago
- (These solutions were tested on a 4-node Hortonworks cluster on my laptop. Do not test on your production environment until you test... :) ☆20 · Apr 18, 2020 · Updated 5 years ago
- Building a real-time environment using webcam frame division in OpenCV and classify cropped images using a fine-tuned vision transformers… ☆20 · Apr 24, 2025 · Updated 11 months ago
- Welcome to my data engineering projects repository! Here you will find a collection of data engineering projects that I have worked on. ☆24 · Apr 27, 2023 · Updated 2 years ago
- This repo is for the LinkedIn Learning course: End-to-End Data Engineering Project ☆31 · Nov 9, 2023 · Updated 2 years ago
- A UI state management library to build js apps against Azure Search ☆23 · Oct 30, 2018 · Updated 7 years ago
- Spark application for analyzing Apache access logs and detecting anomalies, along with a Medium article. ☆21 · Jan 30, 2019 · Updated 7 years ago