LinkedInAttic / apache-incubator-gobblin
Gobblin is a distributed big data integration framework (ingestion, replication, compliance, retention) for batch and streaming systems. Gobblin features integrations with Apache Hadoop, Apache Kafka, Salesforce, S3, MySQL, Google, etc.
☆11Updated 7 years ago
Alternatives and similar repositories for apache-incubator-gobblin:
Users who are interested in apache-incubator-gobblin are comparing it to the libraries listed below.
- Temporal_Graph_library☆25Updated 6 years ago
- Jumbune, an open source BigData APM & Data Quality Management Platform for Data Clouds. Enterprise feature offering is available at http:…☆70Updated 2 years ago
- A template-based cluster provisioning system☆61Updated 2 years ago
- Bullet is a streaming query engine that can be plugged into any singular data stream using a Stream Processing framework like Apache Stor…☆41Updated 2 years ago
- Preliminary Solr DQ / Data Quality experiments and prototype, and SolrJ wrapper utilities☆26Updated 3 months ago
- UberScriptQuery, a SQL-like DSL to make writing Spark jobs super easy☆61Updated last year
- Fast and scalable timeseries database☆25Updated 4 years ago
- An example of building a Kubernetes operator (Flink) using Abstract operator's framework☆26Updated 5 years ago
- Demonstration of a Hive Input Format for Iceberg☆26Updated 4 years ago
- Node.js Kafka Connect connector for Prometheus☆12Updated 2 years ago
- A distributed generic query layer for Apache Kafka Interactive Queries☆26Updated 7 years ago
- Demo querying counts of an event stream with Apache Flink☆23Updated 6 years ago
- An application that records stats about consumer group offset commits and reports them as Prometheus metrics☆14Updated 6 years ago
- Example using Grafana with Druid☆11Updated 10 years ago
- ☆26Updated 5 years ago
- Using the Parquet file format (with Avro) to process data with Apache Flink☆14Updated 9 years ago
- An Operator for scheduling and executing NiFi Flows as Jobs on Kubernetes☆53Updated 4 years ago
- Cloudbreak Deployer Tool☆34Updated last year
- Cascading on Apache Flink®☆54Updated last year
- Flink Examples☆39Updated 9 years ago
- Ambari stack service for easily installing and managing Solr on HDP cluster☆19Updated 6 years ago
- A shim for using Cassandra as a backend for OpenTSDB. Not to be used as a general Cassandra client.☆7Updated 6 years ago
- Apiary provides modules which can be combined to create a federated cloud data lake☆36Updated last year
- Stocks -> NiFi -> Kafka -> Profit☆14Updated 6 years ago
- A Real-Time Analytical Processing (RTAP) example using Spark/Shark☆51Updated 11 years ago
- Reactive Outlier Detection Engine☆11Updated 10 years ago
- Serializing / deserializing library for AWS objects☆46Updated 4 months ago
- Spooker is a dynamic framework for processing high volume data streams via processing pipelines☆29Updated 9 years ago
- Provides a Pythonic interface for reading and writing Avro schemas☆27Updated 2 years ago
- A bridge to Apache Atlas for provenance metadata created in course of using Apache NiFi☆15Updated 2 years ago