This project describes how to write a full ETL data pipeline using Spark.
☆15Oct 15, 2022Updated 3 years ago
Alternatives and similar repositories for spark-data-pipeline
Users that are interested in spark-data-pipeline are comparing it to the libraries listed below.
- ☆14Jan 1, 2020Updated 6 years ago
- My documents for self-learning the fundamentals of data engineering skills☆14Aug 5, 2023Updated 2 years ago
- ☆64Nov 8, 2019Updated 6 years ago
- This project provides valuable customer sentiment insights for Zomato by tracking and analyzing tweets related to their brand and service…☆14Aug 27, 2023Updated 2 years ago
- An Apache Spark app for making data movement between Apache Hive and Apache Phoenix/HBase☆14Mar 23, 2016Updated 10 years ago
- low-level helpers for Apache Spark libraries and tests☆16Dec 29, 2018Updated 7 years ago
- ☆21Jan 13, 2024Updated 2 years ago
- Simulation of job offers and CVs with real-time processing, classification, and analytics using Kafka, Ray, Spark, and Databricks. Includ…☆14Dec 25, 2024Updated last year
- Create Kafka-Connect clusters with Docker. You put the Kafka, we put the Connect.☆25Mar 27, 2019Updated 7 years ago
- Ansible scripts for deploying Kafka on EC2☆10Oct 7, 2016Updated 9 years ago
- ☆10May 25, 2017Updated 9 years ago
- Kafka Sink Connect OrientDB https://www.confluent.io/hub/sanjuthomas/kafka-connect-orientdb☆10Jan 26, 2026Updated 4 months ago
- Kafka Connect connector for CDC data from postgres☆11Aug 27, 2017Updated 8 years ago
- Docker example with kafka connect and sink☆12Feb 12, 2018Updated 8 years ago
- Code for my talk "Stateful & Reactive Streaming Applications Without a Database" at WeAreDevelopers 2018☆11May 20, 2018Updated 8 years ago
- My HackerRank Solutions : https://www.hackerrank.com/RohanKhude☆12Jul 13, 2016Updated 9 years ago
- ☆20Apr 27, 2012Updated 14 years ago
- ☆14May 21, 2026Updated 3 weeks ago
- Template for a DuckDB-based, Codespace-oriented sandbox project that is also dbt Cloud compatible, and includes code-first BI tooling via…☆17Apr 7, 2023Updated 3 years ago
- https://github.com/uavorg/uavstack☆10Sep 11, 2017Updated 8 years ago
- A description of the processes and techniques required to migrate a relational schema to a Cassandra database using Spark and SparkSQL☆11Jan 27, 2018Updated 8 years ago
- Measurement control based on mapboxgl, mapboxgl-draw, and turf☆12Nov 22, 2022Updated 3 years ago
- free bike for everyone☆15Aug 20, 2019Updated 6 years ago
- Event ticketing system with Next.js and Appwrite☆10Jun 22, 2023Updated 2 years ago
- ☆12Mar 15, 2022Updated 4 years ago
- Leveraging Hortonworks' HDP 3.1.0 and HDF 3.4.0 components, this tutorial guides the user through steps to stream data from a REST API in…☆19Aug 16, 2019Updated 6 years ago
- A module extracting the data from PostGIS to mbtiles by using tippecanoe.☆16Jan 31, 2026Updated 4 months ago
- Context-aware AI dictionary for books, manga & comics. Neural TTS (Piper), IPA generation, PaddleOCR, multi-word lookup. Supports cloud &…☆19Apr 20, 2026Updated last month
- ☆15Jan 17, 2022Updated 4 years ago
- Legoo: A collection of automation modules to build analytics infrastructure☆20Jul 24, 2020Updated 5 years ago
- Code for Tutorial on designing clickstream analytics application using Hadoop☆54May 20, 2015Updated 11 years ago
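- NewsApp includes client source code, server source code, and database files. Built on Recognition Emotion from Microsoft's AI project ProjectOxford, it mainly pushes different categories of news based on the user's facial expressions. For the Emotion API, see: https://www.p…☆10Mar 2, 2016Updated 10 years ago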
- Reading list of the topic about utilizing vehicle generated GPS data to update road networks☆14Jul 18, 2018Updated 7 years ago
- Spark stream from kafka(json) to s3(parquet)☆15Nov 8, 2018Updated 7 years ago
- Playground for programming your self driving car 🚙☆12Jun 4, 2024Updated 2 years ago
- Algorithms and Data Structures implemented in Java☆12Jul 28, 2019Updated 6 years ago
- Realtime Geofencing using Spark streaming for a vehicle tracking / fleet management use case☆12Jul 22, 2019Updated 6 years ago
- ArangoDB Connector for Apache Spark, using the Spark DataSource API☆12May 11, 2026Updated last month
- Data Quality Monitoring Tool☆15Dec 5, 2017Updated 8 years ago