This project describes how to write a full ETL data pipeline using Spark.
☆15Oct 15, 2022Updated 3 years ago
Alternatives and similar repositories for spark-data-pipeline
Users that are interested in spark-data-pipeline are comparing it to the libraries listed below.
- Kafka Connect connector for receiving data and writing data to Splunk.☆25Nov 7, 2017Updated 8 years ago
- This is an activator project for showcasing how to read & write data from Kafka-cluster using Scala Producer & Consumer API.☆11May 28, 2017Updated 8 years ago
- This is an activator project for showcasing how to read & write data from Kafka-cluster using Java Producer & Consumer API.☆11May 24, 2017Updated 9 years ago
- An ETL framework in Scala for Data Engineers☆23Aug 30, 2022Updated 3 years ago
- This is an example of real time stream processing using Spark Streaming, Kafka & Elasticsearch.☆40Aug 31, 2016Updated 9 years ago
- This is an activator project for providing a seed for starting with Akka-Http and Slick.☆14May 28, 2017Updated 8 years ago
- Creating Data Pipelines with Apache Airflow to manage ETL from Amazon S3 into Amazon Redshift☆14Jun 12, 2019Updated 6 years ago
- This is an activator project for showcasing best practices, writing unit tests, and providing a seed for starting with Slick.☆13May 28, 2017Updated 8 years ago
- This is a Play activator project. It describes how to build autocomplete search on Elasticsearch.☆15May 24, 2017Updated 9 years ago
- This is an activator project providing a seed for starting with Play & Slick using AngularJS☆14May 24, 2017Updated 9 years ago
- This project provides valuable customer sentiment insights for Zomato by tracking and analyzing tweets related to their brand and service…☆14Aug 27, 2023Updated 2 years ago
- An Apache Spark app for making data movement between Apache Hive and Apache Phoenix/HBase☆14Mar 23, 2016Updated 10 years ago
- low-level helpers for Apache Spark libraries and tests☆16Dec 29, 2018Updated 7 years ago
- ☆21Jan 13, 2024Updated 2 years ago
- Create Kafka-Connect clusters with Docker. You put the Kafka, we put the Connect.☆25Mar 27, 2019Updated 7 years ago
- Ansible scripts for deploying Kafka on EC2☆10Oct 7, 2016Updated 9 years ago
- ☆10May 25, 2017Updated 9 years ago
- Kafka Sink Connect OrientDB https://www.confluent.io/hub/sanjuthomas/kafka-connect-orientdb☆10Jan 26, 2026Updated 3 months ago
- Real-time motion planner and autonomous vehicle simulator in the browser, built with WebGL and Three.js.☆13Mar 3, 2023Updated 3 years ago
- With this library, you can embed Python in your Java or Scala project. The main purpose of this library is to use Python libraries from J…☆12Aug 25, 2024Updated last year
- Docker example with kafka connect and sink☆12Feb 12, 2018Updated 8 years ago
- ☆20Apr 27, 2012Updated 14 years ago
- 🎁 Shows recommended files in Nextcloud☆15Updated this week
- ☆14Aug 22, 2025Updated 9 months ago
- Reading comprehension for opinion-type questions, challenger.ai☆10Nov 14, 2018Updated 7 years ago
- Materials for various Hadoop & Nifi related workshops☆51Mar 20, 2019Updated 7 years ago
- https://github.com/uavorg/uavstack☆10Sep 11, 2017Updated 8 years ago
- Measurement control based on mapboxgl, mapboxgl-draw, and turf☆12Nov 22, 2022Updated 3 years ago
- free bike for everyone☆15Aug 20, 2019Updated 6 years ago
- ☆12Mar 15, 2022Updated 4 years ago
- iPython Notebook of the Guide to Data Mining☆20Apr 7, 2013Updated 13 years ago
- A comprehensive list of PyTorch-related content on GitHub, such as different models, implementations, helper libraries, tutorials, etc.☆12Apr 1, 2017Updated 9 years ago
- Backend of a CIM base development platform, built on the RuoYi framework, BIM+GIS☆11May 25, 2022Updated 4 years ago
- Leveraging Hortonworks' HDP 3.1.0 and HDF 3.4.0 components, this tutorial guides the user through steps to stream data from a REST API in…☆19Aug 16, 2019Updated 6 years ago
- A module extracting the data from PostGIS to mbtiles by using tippecanoe.☆16Jan 31, 2026Updated 3 months ago
- ☆15Jan 17, 2022Updated 4 years ago
- This is an activator project providing a seed for starting with Play & Slick, how to write unit test and how to use mocking for unit test…☆27Sep 4, 2017Updated 8 years ago
- Legoo: A collection of automation modules to build analytics infrastructure☆20Jul 24, 2020Updated 5 years ago
- Code for Tutorial on designing clickstream analytics application using Hadoop☆55May 20, 2015Updated 11 years ago