Kite SDK
☆393 · updated Nov 1, 2022
Alternatives and similar repositories for kite
Users that are interested in kite are comparing it to the libraries listed below.
- Kite SDK Examples · ☆99 · updated May 8, 2021
- A distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, orga… · ☆2,260 · updated this week
- Oryx 2: Lambda architecture on Apache Spark, Apache Kafka for real-time large scale machine learning · ☆1,783 · updated Aug 16, 2021
- Mirror of Apache Eagle · ☆410 · updated Aug 22, 2020
- Fixed-width data source for Spark SQL and DataFrames · ☆10 · updated Oct 25, 2016
- Mirror of Apache Crunch (Incubating) · ☆109 · updated Feb 2, 2021
- Command line tools for the parquet project · ☆44 · updated Jul 10, 2018
- Spooker is a dynamic framework for processing high volume data streams via processing pipelines · ☆30 · updated Feb 1, 2016
- Twitter's collection of LZO and Protocol Buffer-related Hadoop, Pig, Hive, and HBase code. · ☆1,132 · updated Apr 10, 2023
- Project SnappyData - memory optimized analytics database, based on Apache Spark™ and Apache Geode™. Stream, Transact, Analyze, Predict in… · ☆1,035 · updated Nov 21, 2022
- Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more. · ☆6,607 · updated this week
- Livy is an open source REST interface for interacting with Apache Spark from anywhere · ☆1,007 · updated Oct 5, 2022
- Apache Parquet Java · ☆3,040 · updated this week
- Sparkling Water provides H2O functionality inside Spark cluster · ☆977 · updated Nov 5, 2025
- Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark · ☆1,369 · updated Aug 22, 2023
- Mirror of Apache Slider · ☆78 · updated Dec 11, 2018
- Elasticsearch real-time search and analytics natively integrated with Hadoop · ☆2,035 · updated Mar 12, 2026
- Tranquility helps you send real-time event streams to Druid and handles partitioning, replication, service discovery, and schema rollover… · ☆517 · updated Jan 13, 2020
- A small library to add some convenience methods to Scala encompassing predicate logic · ☆21 · updated Mar 16, 2016
- Low level integration of Spark and Kafka · ☆130 · updated Mar 15, 2018
- REST job server for Apache Spark · ☆2,844 · updated Mar 3, 2026
- Distributed version restore tool for S3 · ☆12 · updated Jan 5, 2015
- Base classes to use when writing tests with Spark · ☆1,549 · updated Dec 22, 2025
- Kylo is a data lake management software platform and framework for enabling scalable enterprise-class data lakes on big data technologies… · ☆1,113 · updated Jan 12, 2023
- Apache Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log-l… · ☆2,558 · updated Oct 10, 2024
- ReAir is a collection of easy-to-use tools for replicating tables and partitions between Hive data warehouses. · ☆282 · updated Feb 27, 2019
- Simplifying robust end-to-end machine learning on Apache Spark. · ☆475 · updated Apr 18, 2017
- Mirror of Apache DataFu · ☆122 · updated May 20, 2025
- Sparkline BI Accelerator provides fast ad-hoc query capability over Logical Cubes. This has been folded into our SNAP Platform(http://bit… · ☆280 · updated Aug 3, 2018
- Scripts for generating Grafana dashboards for monitoring Spark jobs · ☆242 · updated Mar 26, 2015
- Alluxio, data orchestration for analytics and machine learning in the cloud · ☆7,167 · updated Apr 29, 2025
- source examples to support the "Cascading for the Impatient" blog post series · ☆80 · updated Aug 30, 2016
- Cascading is a feature rich API for defining and executing complex and fault tolerant data processing flows locally or on a cluster. · ☆352 · updated Apr 8, 2025
- Streaming MapReduce with Scalding and Storm · ☆2,125 · updated Jan 19, 2022
- CMAK is a tool for managing Apache Kafka clusters · ☆11,942 · updated Aug 2, 2023
- Serverless proxy for Spark cluster · ☆325 · updated Oct 29, 2020
- Mirror of Apache Toree (Incubating) · ☆749 · updated Mar 9, 2026