Apache Spark based ETL Engine
☆71Oct 18, 2016Updated 9 years ago
Alternatives and similar repositories for spark-etl
Users who are interested in spark-etl are comparing it to the libraries listed below.
- A light Kafka to HDFS/S3 ETL library based on Apache Spark☆40Jun 29, 2017Updated 8 years ago
- Bottoku, Micro Framework for Chat/Messenger Bots☆10Sep 21, 2016Updated 9 years ago
- Serviceframework, a simple but flexible module engine☆31Jun 29, 2017Updated 8 years ago
- Fast-Data-Processing-with-Spark-2☆22Jan 18, 2023Updated 3 years ago
- 蜜蜂牧场 (Bee Farm) is a data collection and cleaning tool, an ETL tool, and also a scripting language.☆14Jul 1, 2018Updated 7 years ago
- Big Data ETL and Utilities for Hadoop Map Reduce, Spark and Storm☆106Jan 22, 2024Updated 2 years ago
- Distributed SQL-based Realtime Streaming Computation Framework on Apache Storm, Spark☆12Mar 14, 2016Updated 10 years ago
- Set of ETL utils for Spark☆15May 4, 2020Updated 6 years ago
- This project is a unified ETL platform that supports various data processing technologies, including Spark, Hive, Hadoop, Python, Linux Sh…☆17Oct 16, 2015Updated 10 years ago
- Scalable CDC Pattern Implemented using PySpark☆18Oct 8, 2025Updated 7 months ago
- ☆50May 21, 2026Updated last week
- This project mainly aims to let people familiar with SQL query Elasticsearch data easily, lowering the learning cost.☆47Jan 13, 2015Updated 11 years ago
- Spark NLP for Streamlit☆15Sep 12, 2021Updated 4 years ago
- an open source dataworks platform☆20Jun 4, 2021Updated 4 years ago
- Few things we've met during our etl project based on spark☆24Mar 22, 2018Updated 8 years ago
- A demo repository for "streaming etl" with Apache Flink☆44Jun 8, 2016Updated 9 years ago
- Second generation of the ICGC DCC release ETL built on Spark☆10Apr 8, 2019Updated 7 years ago
- HiveQL Parser. Parse HiveQL code and print AST in JSON format if success, else print well formed syntax error message.☆25May 1, 2017Updated 9 years ago
- Code for KDD 2014 paper "Mining Topics in Documents: Standing on the Shoulders of Big Data"☆21Oct 6, 2015Updated 10 years ago
- SQLAlchemy dialect for Dremio☆16Feb 8, 2023Updated 3 years ago
- Mirror kept for legacy. Moved to https://github.com/llvm/llvm-project☆17Dec 14, 2016Updated 9 years ago
- Scalable recommendation system written in Scala using the Apache Spark framework☆105Jan 30, 2015Updated 11 years ago
- Demonstrates how to submit a job to Spark on HDP directly via YARN's REST API from any workstation☆23Apr 18, 2016Updated 10 years ago
- It is a kind of big data computing platform which is driven by the Flink SQL. In particular, it provides the SQL programming.☆21Jan 5, 2023Updated 3 years ago
- DuckDB extension to allow quacking with PostgreSQL protocol☆28Nov 20, 2024Updated last year
- Quick Akka Micro Dag Prototype☆13Apr 8, 2016Updated 10 years ago
- Open source task scheduler with dependency management☆15Jul 1, 2018Updated 7 years ago
- A Spark WordCountJob example as a standalone SBT project with Specs2 tests, runnable on Amazon EMR☆120Mar 28, 2016Updated 10 years ago
- Apache Spark Web Monitor Tool, varOne☆36Aug 26, 2016Updated 9 years ago
- customer visualization for splunk using echarts☆15May 11, 2017Updated 9 years ago
- A simple Spark-powered ETL framework that just works 🍺☆186Oct 2, 2025Updated 7 months ago
- Kafka River Plugin for ElasticSearch☆88Jun 19, 2013Updated 12 years ago
- Implement a complete data warehouse etl using spark SQL☆14Sep 8, 2022Updated 3 years ago
- Build configuration-driven ETL pipelines on Apache Spark☆162Oct 4, 2022Updated 3 years ago
- Redis search and indexing in Java☆16Sep 26, 2016Updated 9 years ago
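- flink-sql: a platform for running SQL and building data flows on Flink, based on Apache Flink 1.10.0☆113Jun 21, 2022Updated 3 years ago
- Provides clear, practical guidance on Akka applications☆31Jan 17, 2022Updated 4 years ago
- Converts JSON or SQL into Flink or Spark streaming/batch jobs☆12Jun 21, 2022Updated 3 years ago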
- ☆14Oct 5, 2022Updated 3 years ago