linkedin / data-integration-library
The Data Integration Library project provides a library of generic components based on a multi-stage architecture for data ingress and egress.
☆31 Updated 2 weeks ago
Alternatives and similar repositories for data-integration-library:
Users who are interested in data-integration-library are comparing it to the libraries listed below.
- LinkedIn's version of Apache Calcite ☆22 Updated 4 months ago
- Demonstration of a Hive Input Format for Iceberg ☆26 Updated 4 years ago
- Jumbune, an open source BigData APM & Data Quality Management Platform for Data Clouds. Enterprise feature offering is available at http:… ☆71 Updated 2 years ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake. ☆28 Updated this week
- CDAP Kubernetes Operator ☆19 Updated last week
- Data abstraction, storage, discovery, and serving system ☆31 Updated 5 months ago
- Wrangler Transform: A DMD system for transforming Big Data ☆92 Updated this week
- Transporter for integrating OpenLineage with OpenMetadata ☆12 Updated last year
- CDAP UI ☆20 Updated last week
- Cloud Storage Connector integrates Apache Pulsar with cloud storage. ☆27 Updated this week
- ☆39 Updated 6 years ago
- An example Dremio ARP-driven connector that supports SQLite ☆19 Updated 11 months ago
- Apache-Spark based Data Flow (ETL) Framework which supports multiple read, write destinations of different types and also support multiple… ☆26 Updated 3 years ago
- Pipeline library for StreamSets Data Collector and Transformer ☆33 Updated 2 years ago
- ☆14 Updated last month
- Amundsen Gremlin ☆21 Updated 2 years ago
- calcite-arrow-sample(WIP) ☆13 Updated 7 years ago
- Drools processor for Apache NiFi ☆38 Updated 5 years ago
- Dremio Flight connector. Access Dremio using Arrow flight ☆40 Updated 4 years ago
- Mirror of Apache DataFu ☆120 Updated last month
- Profiles the data, validates the schema and runs data quality checks and produces a report ☆20 Updated 5 years ago
- Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pi… ☆94 Updated 2 weeks ago
- A library for strong, schema based conversion between 'natural' JSON documents and Avro ☆18 Updated last year
- Bullet is a streaming query engine that can be plugged into any singular data stream using a Stream Processing framework like Apache Stor… ☆41 Updated 2 years ago
- Mirror of Apache Calcite ☆11 Updated last month
- Serializable ACID transactions on streaming data ☆23 Updated 2 years ago
- An Operator for scheduling and executing NiFi Flows as Jobs on Kubernetes ☆53 Updated 4 years ago
- Trino connectors for accessing APIs with an OpenAPI spec ☆31 Updated this week
- Unity Catalog UI ☆40 Updated 6 months ago
- Scalable CDC Pattern Implemented using PySpark ☆18 Updated 5 years ago