michal-harish/kafka-hadoop-loader

michal-harish / kafka-hadoop-loader

Hadoop job for schemaless incremental loading of messages from Kafka topics onto HDFS with configurable output partitioning.

☆91

Alternatives and similar repositories for kafka-hadoop-loader

Users that are interested in kafka-hadoop-loader are comparing it to the libraries listed below.

blackberry / KaBoom
A high-performance cluster consumer for Kafka that creates Avro (boom) files in Hadoop in time-based directory paths
☆41 · Jun 3, 2016 · Updated 10 years ago
LiuGuH / kafka-hadoop-loader-my
Loads messages from Kafka 0.8.2 into HDFS using the simple consumer and a custom MapReduce job
☆12 · Aug 12, 2015 · Updated 10 years ago
LinkedInAttic / camus
LinkedIn's previous generation Kafka to HDFS pipeline.
☆881 · Aug 27, 2020 · Updated 5 years ago
miniway / kafka-hadoop-consumer
Another kafka-hadoop-consumer
☆26 · Apr 17, 2013 · Updated 13 years ago
xstevens / syslog-kafka
INACTIVE: A daemon to transfer syslog messages to Apache Kafka.
☆24 · Mar 30, 2017 · Updated 9 years ago
TopSpoofer / hbrdd
A library for bulk-importing data into HBase from Spark
☆43 · Nov 18, 2016 · Updated 9 years ago
RedisLabs / ReSearch
Redis search and indexing in Java
☆16 · Sep 26, 2016 · Updated 9 years ago
twitter / hraven
hRaven collects run time data and statistics from MapReduce jobs in an easily queryable format
☆129 · Jan 14, 2022 · Updated 4 years ago
sinodzh / HadoopExample
Examples made while playing around with Hadoop.
☆10 · Feb 15, 2017 · Updated 9 years ago
razvan / kafka-s3-consumer
Store batched Kafka messages in S3.
☆39 · Apr 13, 2022 · Updated 4 years ago
brndnmtthws / kafka-on-marathon
Scripts for running Apache Kafka on Mesosphere's Marathon
☆14 · Dec 6, 2015 · Updated 10 years ago
divolte / divolte-kafka-consumer
Helper for consuming Divolte events from Kafka queues and deserializing Avro records into Java objects using Avro's generated code.
☆15 · Nov 6, 2014 · Updated 11 years ago
kijiproject / kiji-express
☆16 · Sep 26, 2014 · Updated 11 years ago
ganghuawang / java-redis-rdb
Parse Redis dump.rdb file
☆31 · Aug 30, 2016 · Updated 9 years ago
anjuke / hwi
Hive Web Interface
☆30 · Apr 29, 2014 · Updated 12 years ago
edwardcapriolo / filecrush
Remedy small files by combining them into larger ones.
☆196 · Jul 1, 2022 · Updated 4 years ago
kijiproject / kiji
The Kiji project suite
☆33 · Jun 18, 2015 · Updated 11 years ago
netmelody / clj-statsd-svr
A StatsD server implemented in Clojure
☆20 · Dec 3, 2015 · Updated 10 years ago
LinkedInAttic / white-elephant
Hadoop log aggregator and dashboard
☆190 · Oct 29, 2013 · Updated 12 years ago
LinkedInAttic / Cubert
Fast and efficient batch computation engine for complex analysis and reporting of massive datasets on Hadoop
☆245 · Aug 24, 2015 · Updated 10 years ago
confluentinc / kafka-connect-hdfs
Kafka Connect HDFS connector
☆27 · Updated this week
apache / gobblin
A distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, orga…
☆2,270 · Jun 24, 2026 · Updated 3 weeks ago
dstreev / hdp-data-gen
Hortonworks Data Platform Data Generation Tool
☆13 · Nov 30, 2017 · Updated 8 years ago
sigmoidanalytics / spork
Pig on Apache Spark
☆82 · Mar 23, 2015 · Updated 11 years ago
harelba / hadoop-job-analyzer
☆29 · Nov 17, 2014 · Updated 11 years ago
mozilla-metrics / akela
A bunch of utility classes for Java, Hadoop, HBase, Pig, etc.
☆77 · Mar 31, 2014 · Updated 12 years ago
researchgate / azkaban-ldap-usermanager
LDAP authentication for Azkaban
☆24 · Apr 7, 2023 · Updated 3 years ago
twitter-archive / elephant-twin
Elephant Twin is a framework for creating indexes in Hadoop
☆99 · Oct 12, 2020 · Updated 5 years ago
xianglei / phpHiveAdmin
An Apache Hive management system
☆84 · Jul 21, 2015 · Updated 11 years ago
akkumar / maven-hadoop
Maven plugin to submit Hadoop jobs
☆22 · Dec 17, 2023 · Updated 2 years ago
sriksun / Ivory
Data Management + Feed Processing Platform over Hadoop
☆27 · May 8, 2013 · Updated 13 years ago
opskeleton / storm-sandbox
A sandbox for running Storm
☆17 · Nov 26, 2013 · Updated 12 years ago
gbraccialli / SparkUtils
☆11 · Dec 10, 2015 · Updated 10 years ago
EmberAds / trifle
A GeoIP country lookup in Redis
☆16 · Jan 22, 2013 · Updated 13 years ago
facebookarchive / presto-odbc
Presto ODBC Driver
☆27 · Oct 8, 2014 · Updated 11 years ago
tuplejump / snackfs
HDFS-compatible distributed filesystem backed by Cassandra
☆25 · Sep 17, 2015 · Updated 10 years ago
leonchen83 / redis-cluster-watchdog
Pretends to be a Redis Cluster node that accepts RCmb (Redis Cluster message bus) messages and interacts with a Redis cluster
☆13 · Sep 4, 2025 · Updated 10 months ago
Banno / druid-docker
Docker containers for Druid nodes
☆28 · Jun 30, 2016 · Updated 10 years ago
forward / flume-zeromq
A ZeroMQ sink for Flume
☆15 · Jun 16, 2011 · Updated 15 years ago