pentaho / big-data-plugin
Kettle plugin that provides support for interacting with many "big data" projects including Hadoop, Hive, HBase, Cassandra, MongoDB, and others.
☆238Updated this week
Alternatives and similar repositories for big-data-plugin
Users that are interested in big-data-plugin are comparing it to the libraries listed below
- Hadoop Configurations☆50Updated this week
- Apache Kafka consumer step plug-in for Pentaho Kettle☆65Updated 4 years ago
- Flume Source to import data from SQL Databases☆266Updated 4 years ago
- ☆67Updated this week
- Mirror of Apache Phoenix☆73Updated 6 years ago
- Some information about Apache Kylin interaction with Pentaho Mondrian☆328Updated 9 years ago
- Adds remote multi-language invocation (ThriftServer, HttpServer) and distributed execution (DataX on YARN) to DataX (https://github.com/alibaba/DataX)☆144Updated last week
- FTP network server is source of events for Apache-flume☆80Updated 7 years ago
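- Presto HBase connector, implemented against the Presto Connector interface specification, that adds HBase querying to Presto. Compared with other open-source HBase connectors, our performance is more than 10 to 100 times faster.☆241Updated 2 years ago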
- ☆127Updated last week
- This repository tracks the code and files for building a Docker image with Apache Kylin.☆126Updated 3 years ago
- Apache Kafka producer step plug-in for Pentaho Kettle.☆45Updated 7 years ago
- Chinese edition of the official Atlas documentation☆69Updated 5 years ago
- Plugins for Azkaban.☆131Updated 6 years ago
- Ambari service for Apache Flink☆126Updated 4 years ago
- flink-parcel compiler tool☆48Updated 5 years ago
- ☆42Updated 6 years ago
- Mirror of Apache Sentry☆119Updated 4 years ago
- work flow schedule☆90Updated 7 years ago
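- An ad hoc query service based on the Spark SQL engine.☆381Updated last year
- DataX is an offline data synchronization tool/platform widely used within Alibaba Group. It provides efficient data synchronization between heterogeneous data sources including MySQL, Oracle, HDFS, Hive, OceanBase, HBase, OTS, ODPS, and others.☆138Updated 3 years ago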
- Apache HBase Operator Tools☆179Updated 6 months ago
- Moonbox is a DVtaaS (Data Virtualization as a Service) Platform☆506Updated 2 years ago
- Full Database Migration Tool based on Alibaba DataX 3.0☆98Updated 5 years ago
- Bireme is an incremental synchronization tool for the Greenplum / HashData data warehouse☆137Updated 3 years ago
- A data-centric integration platform☆48Updated 3 years ago
- Unified SQL Analytics Engine Based on SparkSQL☆210Updated 2 years ago
- DataX web. The web configuration interface for DataX was not open-sourced together with it; this project provides the web-side configuration.☆100Updated 6 years ago
- Real-time ETL developed by Flink, data from MySQL to Greenplum. Use canal to parse the MySQL binlog, put it into kafka, use Flink to cons…☆79Updated last year
- A Maven-based example of using Cloudera Impala's JDBC driver☆118Updated 9 years ago