netxillon / HadoopLinks
Hadoop Cluster Configurations
☆33 Updated 3 years ago
Alternatives and similar repositories for Hadoop
Users that are interested in Hadoop are comparing it to the libraries listed below
- ☆54 Updated 10 years ago
- ☆105 Updated 5 years ago
- InputFormat that can split multi-line JSON ☆49 Updated 10 years ago
- Code to index HDFS to Solr using MapReduce ☆52 Updated 6 years ago
- Reads an HBase table and writes it out as Text, Seq, Avro, or Parquet ☆28 Updated 11 years ago
- HDF masterclass materials ☆28 Updated 9 years ago
- Code for Tutorial on designing clickstream analytics application using Hadoop ☆54 Updated 10 years ago
- Workshops on how to set up security on Hadoop using HDP sandboxes ☆100 Updated 7 years ago
- Visualize your HDFS cluster usage ☆229 Updated 4 years ago
- Collection of tools for bootstrapping Apache Ambari & deploying clusters ☆83 Updated 6 years ago
- An Apache access log parser written in Scala ☆72 Updated 4 years ago
- Code repository for O'Reilly Hadoop Application Architectures book ☆165 Updated 10 years ago
- UberScriptQuery, a SQL-like DSL to make writing Spark jobs super easy ☆61 Updated last year
- A plugin to Apache Airflow to allow you to run Spark Submit Commands as an Operator ☆73 Updated 5 years ago
- Testbench for experimenting with Apache Hive at any data scale. ☆64 Updated 8 years ago
- Ansible playbook to deploy Cloudera Hadoop components to the cluster ☆52 Updated 6 years ago
- An example Apache Beam project. ☆111 Updated 8 years ago
- ☆23 Updated 7 years ago
- Collection of Pig scripts that I use for my talks and workshops ☆39 Updated 12 years ago
- Integrate Grafana with Ambari Metrics System ☆27 Updated last month
- Sample Spark Streaming application for secure consumption from Kafka ☆33 Updated 8 years ago
- Big Data ETL and Utilities for Hadoop Map Reduce, Spark and Storm ☆102 Updated last year
- ☆48 Updated 7 years ago
- Simple example for Spark Streaming over Kafka topic ☆106 Updated 4 years ago
- Ambari Service definition for deploying R & RHadoop libraries ☆18 Updated 9 years ago
- ☆14 Updated last month
- Spark structured streaming with Kafka data source and writing to Cassandra ☆62 Updated 5 years ago
- Docker Image and Kubernetes Configurations for Spark 2.x ☆41 Updated 5 years ago
- Lightweight proxy to expose the UI of an Apache Spark cluster that is behind a firewall ☆98 Updated 5 years ago
- Reference architecture for real-time stream processing with Apache Flink on Amazon EMR, Amazon Kinesis, and Amazon Elasticsearch Service. ☆72 Updated last year