hadoopecosystemtable / hadoopecosystemtable.github.io

This page is a summary to keep track of Hadoop-related projects and other relevant projects in the Big Data scene, focused on the open source, free software environment.

☆690

Alternatives and similar repositories for hadoopecosystemtable.github.io

Users who are interested in hadoopecosystemtable.github.io are comparing it to the repositories listed below.

kite-sdk / kite
Kite SDK
☆393 Updated 3 years ago
ercoppa / HadoopInternals
Diagrams describing Apache Hadoop internals (2.3.0 or later).
☆430 Updated 6 years ago
TIBCOSoftware / snappydata
Project SnappyData - memory optimized analytics database, based on Apache Spark™ and Apache Geode™. Stream, Transact, Analyze, Predict in…
☆1,035 Updated 3 years ago
zenkay / bigdata-ecosystem
BigData Ecosystem Dataset
☆577 Updated 4 years ago
haifengl / bigdata
Introduction to Big Data
☆396 Updated last year
sameeragarwal / blinkdb
BlinkDB: Sub-Second Approximate Queries on Very Large Data.
☆659 Updated 12 years ago
mhausenblas / lambda-architecture.net
A repository of information, examples and good practices around the Lambda Architecture
☆369 Updated 8 years ago
apache / apex-core
Mirror of Apache Apex core
☆350 Updated 4 years ago
Big-Data-Manning / big-data-code
Source code for Big Data: Principles and best practices of scalable realtime data systems
☆333 Updated last year
hbutani / spark-druid-olap
Sparkline BI Accelerator provides fast ad-hoc query capability over Logical Cubes. This has been folded into our SNAP Platform(http://bit…
☆281 Updated 7 years ago
airbnb / reair
ReAir is a collection of easy-to-use tools for replicating tables and partitions between Hive data warehouses.
☆282 Updated 6 years ago
twitter-archive / ambrose
A platform for visualization and real-time monitoring of data workflows
☆1,171 Updated 6 years ago
KylinOLAP / Kylin
This code base is retained for historical interest only; please visit the Apache Incubator repo for the latest one.
☆560 Updated 3 years ago
bigdatafoundation / docker-hadoop
Dockerfile for running Hadoop on Ubuntu
☆93 Updated 2 years ago
romainr / hadoop-tutorials-examples
Source, data and tutorials of the blog post video series of Hue, the Web UI for Hadoop.
☆235 Updated 9 years ago
cloudera / livy
Livy is an open source REST interface for interacting with Apache Spark from anywhere
☆1,007 Updated 3 years ago
hortonworks-gallery / zeppelin-notebooks
Gallery of Apache Zeppelin notebooks
☆216 Updated 6 years ago
Netflix / Lipstick
Pig Visualization framework
☆466 Updated 2 years ago
mesos / myriad
https://github.com/apache/incubator-myriad is our new home. See
☆253 Updated 10 years ago
yahoo / streaming-benchmarks
Benchmarks for Low Latency (Streaming) solutions including Apache Storm, Apache Spark, Apache Flink, ...
☆646 Updated 2 years ago
hadooparchitecturebook / hadoop-arch-book
Code repository for O'Reilly Hadoop Application Architectures book
☆163 Updated 10 years ago
LinkedInAttic / camus
LinkedIn's previous generation Kafka to HDFS pipeline.
☆883 Updated 5 years ago
holdenk / learning-spark-examples
Examples for learning Spark
☆332 Updated 10 years ago
ZEPL / zeppelin
DEPRECATED. Zeppelin has moved to Apache. Please make pull requests there.
☆406 Updated 8 years ago
dnafrance / vagrant-hadoop-spark-cluster
Vagrant project to spin up a cluster of 4 32-bit CentOS6.5 Linux virtual machines with Hadoop v2.6.0 and Spark v1.1.1
☆124 Updated 10 years ago
Netflix / aegisthus
A Bulk Data Pipeline out of Cassandra
☆324 Updated 6 years ago
twitter / summingbird
Streaming MapReduce with Scalding and Storm
☆2,131 Updated 4 years ago
GoogleCloudPlatform / DataflowJavaSDK
Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.
☆851 Updated 5 years ago
miguno / kafka-storm-starter
[PROJECT IS NO LONGER MAINTAINED] Code examples that show how to integrate Apache Kafka 0.8+ with Apache Storm 0.9+ and Apache Spark Streamin…
☆723 Updated 3 years ago
DonDebonair / virtual-hadoop-cluster
A virtual Hadoop cluster running CDH5
☆103 Updated 10 years ago