pranab/chombo


Big Data ETL and Utilities for Hadoop Map Reduce, Spark and Storm

☆106

Alternatives and similar repositories for chombo

Users who are interested in chombo compare it to the repositories listed below.

vngrs / spark-etl
Apache Spark based ETL Engine
☆71 · Oct 18, 2016 · Updated 9 years ago
AbsaOSS / spark-hofs
Scala API for Apache Spark SQL high-order functions
☆15 · Aug 4, 2023 · Updated 2 years ago
caroljmcdonald / sparkdataframeexample
☆21 · Oct 1, 2015 · Updated 10 years ago
ExNexu / drools-scala-example
☆10 · Apr 10, 2014 · Updated 12 years ago
amient / affinity
Library and a Framework for building fast, scalable, fault-tolerant Data APIs based on Akka, Avro, ZooKeeper and Kafka
☆25 · Oct 16, 2020 · Updated 5 years ago
isaac-taylor / BloomFilter
A simple Bloom Filter implementation in Java
☆16 · Oct 21, 2012 · Updated 13 years ago
shengjk / flinksql-platform
flinksql-platform
☆19 · Mar 22, 2021 · Updated 5 years ago
tellapart / TellApart-Hadoop-Utils
Utilities for working with Hadoop and Cascading
☆19 · Feb 8, 2011 · Updated 15 years ago
GoogleCloudPlatform / spark-examples
Spark pipelines that correspond to a series of Dataflow examples.
☆27 · May 5, 2019 · Updated 7 years ago
AbsaOSS / hyperdrive
Extensible streaming ingestion pipeline on top of Apache Spark
☆47 · Jul 17, 2025 · Updated last year
bluecolor / octopus
Open source task scheduler with dependency management
☆15 · Jul 1, 2018 · Updated 8 years ago
spoddutur / spark-streaming-monitoring-with-lightning
Plot live-stats as graph from ApacheSpark application using Lightning-viz
☆18 · Jul 3, 2017 · Updated 9 years ago
mshtelma / spark-structured-streaming-jdbc-sink
Spark Structured Streaming JDBC Sink
☆16 · Apr 26, 2021 · Updated 5 years ago
indix / sparkplug
Spark package to "plug" holes in data using SQL based rules ⚡️ 🔌
☆28 · May 15, 2020 · Updated 6 years ago
tomasonjo / zeppelin-graph-algo
Repository of Notebooks taken from https://neo4j.com/graph-algorithms-book/
☆26 · Feb 21, 2020 · Updated 6 years ago
CoxAutomotiveDataSolutions / waimak
Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.
☆76 · Apr 24, 2024 · Updated 2 years ago
pranab / ruscello
Real time and offline time series analysis with Spark, Spark Streaming and Storm
☆21 · Oct 20, 2020 · Updated 5 years ago
godatadriven / scala-spark-application
☆32 · Mar 21, 2018 · Updated 8 years ago
leandrohmvieira / databricks-crt020-notes
docs, codes and resources to prepare for the CRT020: Databricks Certified Associate Developer for Apache Spark 2.4 with Python 3 certific…
☆10 · Sep 25, 2019 · Updated 6 years ago
smart-data-lake / smart-data-lake
Smart Automation Tool for building modern Data Lakes and Data Pipelines
☆129 · Updated this week
Glympse / terraform-provider-nifi
Terraform provider for interacting with NiFi cluster
☆51 · May 29, 2019 · Updated 7 years ago
Azure-Samples / hdinsight-java-hive-jdbc
An example of how to use the JDBC to issue Hive queries from a Java client application.
☆11 · Apr 5, 2018 · Updated 8 years ago
avensolutions / spark-sql-etl-framework
Multi-stage, config driven, SQL based ETL framework using PySpark
☆26 · Sep 16, 2019 · Updated 6 years ago
hseagle / presto
Distributed SQL query engine for running interactive analytic queries against big data sources.
☆10 · Jul 1, 2016 · Updated 10 years ago
swoop-inc / spark-records
Bulletproof Apache Spark jobs with fast root cause analysis of failures.
☆73 · Mar 14, 2021 · Updated 5 years ago
zengxiaosen / flinkMultiStreamOptimization
Optimizes Flink multi-stream operations (such as join); the optimizations cover, but are not limited to, data loss and performance issues
☆11 · Apr 8, 2019 · Updated 7 years ago
richardanaya / spark_delta_lake
☆16 · Jun 27, 2020 · Updated 6 years ago
tmalaska / CopybookInputFormat
Using JRecord to build a mapred and mapreduce inputformat for HDFS, MAPREDUCE, PIG, HIVE, Spark, ...
☆19 · Dec 7, 2017 · Updated 8 years ago
maropu / spark-sql-server
Yet Another Spark SQL JDBC/ODBC server based on the PostgreSQL V3 protocol
☆34 · Sep 8, 2022 · Updated 3 years ago
FRosner / drunken-data-quality
Spark package for checking data quality
☆220 · Feb 28, 2020 · Updated 6 years ago
criteo / garmadon
Java event logs collector for hadoop and frameworks
☆42 · Mar 25, 2025 · Updated last year
rbheemana / Cobol-to-Hive
Serde for Cobol Layout to Hive table
☆24 · Feb 23, 2019 · Updated 7 years ago
flux-project / flux
Machine Learning Stack for Big Data, Big Cluster and Big Challenges
☆22 · Sep 6, 2018 · Updated 7 years ago
varunpant / AroundMe
Spatial search using Elastic Search
☆12 · Dec 27, 2014 · Updated 11 years ago
xieenze / SparkOnKudu
Examples of using Spark + Kudu
☆15 · Sep 13, 2017 · Updated 8 years ago
rubanm / ignite-scala
Scala API for distributed closures on Apache Ignite
☆11 · Jun 6, 2015 · Updated 11 years ago
yamrcraft / etl-light
A light Kafka to HDFS/S3 ETL library based on Apache Spark
☆40 · Jun 29, 2017 · Updated 9 years ago
Azure-Samples / aihlsignited-medindexer
Indexing framework designed for the automated creation of structured knowledge bases in Azure AI Search
☆15 · Jul 17, 2026 · Updated last week
homeaway / datapull
Cloud based Data Platform based on Apache Spark
☆28 · Jun 30, 2026 · Updated 3 weeks ago