linkedin / data-integration-library
The Data Integration Library project provides a library of generic components based on a multi-stage architecture for data ingress and egress.
☆32 Updated 2 months ago
Alternatives and similar repositories for data-integration-library:
Users who are interested in data-integration-library are comparing it to the libraries listed below.
- An example Dremio ARP-driven connector that supports SQLite ☆19 Updated last year
- Data abstraction, storage, discovery, and serving system ☆32 Updated last month
- LinkedIn's version of Apache Calcite ☆22 Updated 5 months ago
- An Operator for scheduling and executing NiFi Flows as Jobs on Kubernetes ☆53 Updated 4 years ago
- Apache Daffodil ☆93 Updated 2 weeks ago
- ☆14 Updated 2 months ago
- A Python Client for Hive Metastore ☆12 Updated last year
- ☆18 Updated 2 years ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake. ☆29 Updated this week
- Dione - a Spark and HDFS indexing library ☆52 Updated last year
- Apache NLPCraft - API to convert natural language into actions. ☆79 Updated 2 months ago
- Spooker is a dynamic framework for processing high volume data streams via processing pipelines ☆29 Updated 9 years ago
- Mirror of Apache Arrow ☆32 Updated last month
- Mirror of Apache NiFi Flow Design System ☆45 Updated last year
- Transporter for integrating OpenLineage with OpenMetadata ☆13 Updated this week
- Drools processor for Apache NiFi ☆38 Updated 5 years ago
- Dremio Flight connector. Access Dremio using Arrow Flight ☆40 Updated 4 years ago
- A library for strong, schema-based conversion between 'natural' JSON documents and Avro ☆18 Updated last year
- Pipeline library for StreamSets Data Collector and Transformer ☆33 Updated 2 years ago
- Connect DBVisualizer to Hortonworks HiveServer2 ☆9 Updated 10 years ago
- Apache Flagon is a suite of comprehensive, thin-client behavioral logging tools ☆25 Updated this week
- A curated list of awesome lakehouse frameworks, applications, etc. ☆27 Updated 2 months ago
- Serializable ACID transactions on streaming data ☆24 Updated 2 years ago
- Profiles the data, validates the schema, runs data quality checks, and produces a report ☆20 Updated 5 years ago
- WASP is a framework to build complex real time big data applications. It relies on a kind of Kappa/Lambda architecture mainly leveraging … ☆31 Updated this week
- Demonstration of a Hive Input Format for Iceberg ☆26 Updated 4 years ago
- Set of tools for creating backups, compaction and restoration of Apache Kafka® Clusters ☆21 Updated last week
- CDAP UI ☆20 Updated last month
- Jumbune, an open source BigData APM & Data Quality Management Platform for Data Clouds. Enterprise feature offering is available at http:… ☆70 Updated 2 years ago
- A Spark-based data comparison tool at scale which facilitates software development engineers to compare a plethora of pair combinations o… ☆51 Updated last year