cdapio / cdap
An open source framework for building data analytic applications.
☆759 Updated this week
Related projects
Alternatives and complementary repositories for cdap
- Iceberg is a table format for large, slow-moving tabular data ☆478 Updated last year
- Sparkline BI Accelerator provides fast ad-hoc query capability over Logical Cubes. This has been folded into our SNAP Platform(http://bit… ☆284 Updated 6 years ago
- A load balancer / proxy / gateway for prestodb ☆358 Updated 3 months ago
- Project SnappyData - memory optimized analytics database, based on Apache Spark™ and Apache Geode™. Stream, Transact, Analyze, Predict in… ☆1,040 Updated last year
- Dremio - the missing link in modern data ☆1,376 Updated 2 weeks ago
- Tranquility helps you send real-time event streams to Druid and handles partitioning, replication, service discovery, and schema rollover… ☆517 Updated 4 years ago
- Data Lineage Tracking And Visualization Solution ☆604 Updated this week
- A framework for writing performant user-defined functions (UDFs) that are portable across a variety of engines including Apache Spark, Ap… ☆297 Updated 9 months ago
- Kylo is a data lake management software platform and framework for enabling scalable enterprise-class data lakes on big data technologies… ☆1,110 Updated last year
- Cask Hydrator Plugins Repository ☆66 Updated 2 weeks ago
- Qubole Sparklens tool for performance tuning Apache Spark ☆568 Updated 4 months ago
- Apache Livy is an open source REST interface for interacting with Apache Spark from anywhere. ☆886 Updated last month
- Libraries and tools for interoperability between Hadoop-related open-source software and Google Cloud Platform. ☆280 Updated this week
- ☆1,609 Updated this week
- Apache Drill is a distributed MPP query layer for self describing data ☆1,943 Updated 2 weeks ago
- Generic Data Ingestion & Dispersal Library for Hadoop ☆479 Updated last year
- ReAir is a collection of easy-to-use tools for replicating tables and partitions between Hive data warehouses. ☆280 Updated 5 years ago
- StreamLine - Streaming Analytics ☆164 Updated last year
- SQL-based streaming analytics platform at scale ☆1,224 Updated 4 years ago
- Apache Tez ☆479 Updated this week
- A simplified, lightweight ETL Framework based on Apache Spark ☆584 Updated 9 months ago
- A Spark UI and Spark History Server alternative with CPU and Memory metrics! Delight is free, cross-platform, and open-source. ☆342 Updated 5 months ago
- Build configuration-driven ETL pipelines on Apache Spark ☆158 Updated 2 years ago
- A Spark Atlas connector to track data lineage in Apache Atlas ☆264 Updated last year
- Mirror of Apache Apex core ☆350 Updated 3 years ago
- Mirror of Apache Bahir ☆337 Updated last year
- Nessie: Transactional Catalog for Data Lakes with Git-like semantics ☆1,031 Updated this week
- Schema Registry ☆15 Updated 4 months ago