kite-sdk/kite

Kite SDK

☆393

Alternatives and similar repositories for kite

Users that are interested in kite are comparing it to the libraries listed below.

kite-sdk / kite-examples
Kite SDK Examples
☆99 · May 8, 2021 · Updated 5 years ago
apache / gobblin
A distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, orga…
☆2,270 · Jun 24, 2026 · Updated last month
apache / eagle
Mirror of Apache Eagle
☆411 · Aug 22, 2020 · Updated 5 years ago
quartethealth / spark-fixedwidth
Fixed-width data source for Spark SQL and DataFrames
☆10 · Oct 25, 2016 · Updated 9 years ago
OryxProject / oryx
Oryx 2: Lambda architecture on Apache Spark, Apache Kafka for real-time large scale machine learning
☆1,783 · Aug 16, 2021 · Updated 4 years ago
apache / crunch
Mirror of Apache Crunch (Incubating)
☆110 · Feb 2, 2021 · Updated 5 years ago
twitter / elephant-bird
Twitter's collection of LZO and Protocol Buffer-related Hadoop, Pig, Hive, and HBase code.
☆1,134 · Apr 10, 2023 · Updated 3 years ago
wesleypeck / parquet-tools
Command line tools for the parquet project
☆44 · Jul 10, 2018 · Updated 8 years ago
josephxsxn / moya
Memcached on YARN
☆19 · Jun 2, 2014 · Updated 12 years ago
lucidworks / solr-for-datascience
☆24 · Oct 19, 2015 · Updated 10 years ago
ccsevers / scalding-linalg
Linear algebra routines for Scalding.
☆21 · May 23, 2013 · Updated 13 years ago
ottogroup / SPQR
Spooker is a dynamic framework for processing high volume data streams via processing pipelines
☆30 · Feb 1, 2016 · Updated 10 years ago
BertrandDechoux / cascading.learn
Test driven learning of Cascading.
☆40 · Feb 11, 2020 · Updated 6 years ago
apache / zeppelin
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
☆6,645 · Updated this week
TIBCOSoftware / snappydata
Project SnappyData - memory optimized analytics database, based on Apache Spark™ and Apache Geode™. Stream, Transact, Analyze, Predict in…
☆1,032 · Nov 21, 2022 · Updated 3 years ago
cloudera / livy
Livy is an open source REST interface for interacting with Apache Spark from anywhere
☆1,007 · Oct 5, 2022 · Updated 3 years ago
h2oai / sparkling-water
Sparkling Water provides H2O functionality inside Spark cluster
☆979 · Nov 5, 2025 · Updated 8 months ago
apache / parquet-java
Apache Parquet Java
☆3,069 · Updated this week
twitter / scalding
A Scala API for Cascading
☆3,523 · May 28, 2023 · Updated 3 years ago
linkedin / dr-elephant
Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark
☆1,370 · Aug 22, 2023 · Updated 2 years ago
apache / incubator-retired-slider
Mirror of Apache Slider
☆79 · Dec 11, 2018 · Updated 7 years ago
thehydroimpulse / storm-kafka-starter
Example using Storm + Kafka.
☆18 · Mar 18, 2013 · Updated 13 years ago
elastic / elasticsearch-hadoop
Elasticsearch real-time search and analytics natively integrated with Hadoop
☆1,975 · Updated this week
druid-io / tranquility
Tranquility helps you send real-time event streams to Druid and handles partitioning, replication, service discovery, and schema rollover…
☆519 · Jan 13, 2020 · Updated 6 years ago
twitter / summingbird
Streaming MapReduce with Scalding and Storm
☆2,123 · Jan 19, 2022 · Updated 4 years ago
tresata / spark-kafka
Low level integration of Spark and Kafka
☆129 · Mar 15, 2018 · Updated 8 years ago
spark-jobserver / spark-jobserver
REST job server for Apache Spark
☆2,837 · Mar 3, 2026 · Updated 4 months ago
alekseyig / spark-submit-deps
☆14 · Jan 12, 2017 · Updated 9 years ago
holdenk / spark-testing-base
Base classes to use when writing tests with Spark
☆1,553 · Apr 20, 2026 · Updated 3 months ago
apache / datafu
Mirror of Apache DataFu
☆124 · Jul 9, 2026 · Updated 2 weeks ago
databricks / spark-csv
CSV Data Source for Apache Spark 1.x
☆1,057 · Dec 13, 2018 · Updated 7 years ago
Teradata / kylo
Kylo is a data lake management software platform and framework for enabling scalable enterprise-class data lakes on big data technologies…
☆1,111 · Jan 12, 2023 · Updated 3 years ago
TheClimateCorporation / S3DistVersions
Distributed version restore tool for S3
☆12 · Jan 5, 2015 · Updated 11 years ago
apache / logging-flume
Apache Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log-l…
☆2,566 · Updated this week
LinkedInAttic / camus
LinkedIn's previous generation Kafka to HDFS pipeline.
☆881 · Aug 27, 2020 · Updated 5 years ago
hbutani / spark-druid-olap
Sparkline BI Accelerator provides fast ad-hoc query capability over Logical Cubes. This has been folded into our SNAP Platform(http://bit…
☆281 · Aug 3, 2018 · Updated 7 years ago
Netflix / Lipstick
Pig Visualization framework
☆466 · Mar 24, 2023 · Updated 3 years ago
Alluxio / alluxio
Alluxio, data orchestration for analytics and machine learning in the cloud
☆7,211 · Apr 29, 2025 · Updated last year
amplab / keystone
Simplifying robust end-to-end machine learning on Apache Spark.
☆473 · Apr 18, 2017 · Updated 9 years ago