Apache Spark based ETL Engine
☆71Oct 18, 2016Updated 9 years ago
Alternatives and similar repositories for spark-etl
Users that are interested in spark-etl are comparing it to the libraries listed below
- A light Kafka to HDFS/S3 ETL library based on Apache Spark☆40Jun 29, 2017Updated 8 years ago
- Serviceframework: a simple but flexible module engine☆31Jun 29, 2017Updated 8 years ago
- Distributed SQL-based real-time streaming computation framework on Apache Storm and Spark☆12Mar 14, 2016Updated 9 years ago
- Set of ETL utils for Spark☆15May 4, 2020Updated 5 years ago
- Big Data ETL and Utilities for Hadoop Map Reduce, Spark and Storm☆104Jan 22, 2024Updated 2 years ago
- ☆50Feb 11, 2020Updated 6 years ago
- This project mainly aims to let people who are familiar with SQL query Elasticsearch data easily, lowering the learning cost.☆47Jan 13, 2015Updated 11 years ago
- Converts JSON or SQL into Flink or Spark streaming/batch jobs☆12Jun 21, 2022Updated 3 years ago
- A big data computing platform driven by Flink SQL. In particular, it provides SQL programming.☆21Jan 5, 2023Updated 3 years ago
- Data exchange☆10Jun 5, 2024Updated last year
- Kafka River Plugin for ElasticSearch☆88Jun 19, 2013Updated 12 years ago
- DataSphere product documentation☆12Sep 25, 2019Updated 6 years ago
- Open source task scheduler with dependency management☆15Jul 1, 2018Updated 7 years ago
- Java Client of the Spark Job Server implementing the arranged Rest APIs☆51Jun 4, 2021Updated 4 years ago
- Spark, Spark Streaming and Spark SQL unit testing strategies☆215Oct 12, 2016Updated 9 years ago
- A few things we've encountered during our Spark-based ETL project☆24Mar 22, 2018Updated 7 years ago
- Demonstrates how to submit a job to Spark on HDP directly via YARN's REST API from any workstation☆23Apr 18, 2016Updated 9 years ago
- Quick Akka Micro Dag Prototype☆13Apr 8, 2016Updated 9 years ago
- Mirror kept for legacy. Moved to https://github.com/llvm/llvm-project☆17Dec 14, 2016Updated 9 years ago
- Scala API for Apache Spark SQL high-order functions☆14Aug 4, 2023Updated 2 years ago
- A Lightweight Graph Processing Framework for Multi-GPUs☆14Apr 15, 2015Updated 10 years ago
- SQLAlchemy dialect for Dremio☆16Feb 8, 2023Updated 3 years ago
- Build configuration-driven ETL pipelines on Apache Spark☆161Oct 4, 2022Updated 3 years ago
- Spark package to "plug" holes in data using SQL based rules ⚡️ 🔌☆29May 15, 2020Updated 5 years ago
- Provides clear, practical guidance for Akka applications☆31Jan 17, 2022Updated 4 years ago
- ☆14May 23, 2017Updated 8 years ago
- Custom visualization for Splunk using ECharts☆15May 11, 2017Updated 8 years ago
- Trident State implementation on top of Elasticsearch☆21May 18, 2015Updated 10 years ago
- Hive, Pig, HBase, Sqoop examples☆15Apr 24, 2017Updated 8 years ago
- flink-sql: a platform for running SQL and building data flows on Flink, based on Apache Flink 1.10.0☆112Jun 21, 2022Updated 3 years ago
- Kettle Online Business Intelligence Platform -- Pentaho Data Integration ( ETL ) a.k.a Kettle☆18May 26, 2017Updated 8 years ago
- Spark NLP for Streamlit☆15Sep 12, 2021Updated 4 years ago
- PowerSwitch: an adaptive mode switch engine for distributed parallel graph computation☆16Dec 23, 2013Updated 12 years ago
- Klunge is a platform for Event Sourcing and CQRS to build scalable, event-driven and eventually consistent systems☆13Mar 6, 2021Updated 5 years ago
- Data exchange middleware based on ActiveMQ☆14Aug 17, 2014Updated 11 years ago
- Redis search and indexing in Java☆16Sep 26, 2016Updated 9 years ago
- Examples of using SparklingPandas and Pandas with PySpark☆16Aug 6, 2015Updated 10 years ago
- Infinify (public ver.) is an AI-powered SaaS application that lets you enjoy various features such as AI chat, image generation, image ed…☆11Apr 15, 2025Updated 10 months ago
- Optimizes Flink multi-stream operations (such as join); the optimizations include, but are not limited to, data loss and performance issues☆11Apr 8, 2019Updated 6 years ago