hadooparchitecturebook/hadoop-arch-book

Code repository for O'Reilly Hadoop Application Architectures book

☆160

Alternatives and similar repositories for hadoop-arch-book

Users who are interested in hadoop-arch-book are comparing it to the repositories listed below.

hadooparchitecturebook / clickstream-tutorial
Code for the tutorial on designing a clickstream analytics application using Hadoop
☆54 · May 20, 2015 · Updated 11 years ago
hadooparchitecturebook / SparkStreaming.Sessionization
NRT Sessionization with Spark Streaming landing on HDFS and putting live stats in HBase
☆16 · Oct 31, 2014 · Updated 11 years ago
hadooparchitecturebook / fraud-detection-tutorial
☆47 · May 11, 2016 · Updated 10 years ago
alexholmes / hadoop-book
Source code to accompany the book "Hadoop in Practice", published by Manning.
☆202 · Feb 11, 2020 · Updated 6 years ago
ArchitectingHBase / examples
Will come later...
☆20 · Jul 1, 2022 · Updated 4 years ago
hortonworks-gallery / ambari-vnc-service
An Ambari Stack service package for VNC Server with the ability to install developer tools like Eclipse/IntelliJ/Maven as well to 'remote…
☆28 · Aug 18, 2016 · Updated 9 years ago
tomwhite / hadoop-book
Example source code accompanying O'Reilly's "Hadoop: The Definitive Guide" by Tom White
☆3,500 · Mar 17, 2020 · Updated 6 years ago
alexholmes / hiped2
Source code that accompanies the book "Hadoop in Practice, Second Edition".
☆80 · Sep 10, 2014 · Updated 11 years ago
tmalaska / Spark.TableStatsExample
Simple Spark example of generating table stats for use in data quality checks
☆27 · Apr 28, 2017 · Updated 9 years ago
wypb / spark-summit-east-2017
☆30 · Jun 18, 2017 · Updated 9 years ago
gwenshap / SparkStreamingExample
☆55 · Aug 21, 2014 · Updated 11 years ago
PacktPublishing / Practical-Real-time-Processing-and-Analytics
Practical Real-Time Data Processing and Analytics, published by Packt
☆13 · Jan 14, 2021 · Updated 5 years ago
larsgeorge / maven-archetype-hadoop
Provides a simple archetype to create MapReduce jobs with Maven.
☆24 · Dec 3, 2010 · Updated 15 years ago
alexvk / ml-in-scala
☆11 · Jun 22, 2016 · Updated 10 years ago
alexholmes / hadoop-utils
A set of Hadoop utilities to make working with Hadoop a little easier.
☆26 · Feb 11, 2020 · Updated 6 years ago
randerzander / HiveToPhoenix
An Apache Spark app for moving data between Apache Hive and Apache Phoenix/HBase
☆14 · Mar 23, 2016 · Updated 10 years ago
tmalaska / HBase-ToHDFS
Reads an HBase table and writes it out as Text, Seq, Avro, or Parquet
☆28 · May 15, 2014 · Updated 12 years ago
hadooparchitecturebook / Taxi360
☆21 · Apr 17, 2023 · Updated 3 years ago
seanorama / workshop-hadoop-ops
Workshop for Hadoop Operations Best Practices
☆10 · Feb 24, 2015 · Updated 11 years ago
alienrobotwizard / varaha
Machine learning and natural language processing with Apache Pig
☆53 · Dec 17, 2013 · Updated 12 years ago
ZubairNabi / prosparkstreaming
Code used in "Pro Spark Streaming: The Zen of Real-time Analytics using Apache Spark" published by Apress Publishing.
☆48 · Mar 27, 2016 · Updated 10 years ago
pradeep-pasupuleti / pig-design-patterns
This repository contains the Pig Latin scripts, UDFs and datasets used in the book Pig Design Patterns by Pradeep Pasupuleti, published b…
☆23 · Apr 9, 2014 · Updated 12 years ago
abajwa-hw / security-workshops
Workshops on how to set up security on Hadoop using HDP sandboxes
☆99 · Apr 11, 2018 · Updated 8 years ago
jwills / geojson
Scala library for working with GeoJSON records using Esri's Geometry API for Java
☆29 · Dec 1, 2014 · Updated 11 years ago
prodriguezdefino / content-dicovery-platform-gcp
A content discovery platform powered by LLMs
☆12 · Jul 7, 2025 · Updated last year
databricks / spark-knowledgebase
Spark Knowledge Base
☆333 · Oct 1, 2020 · Updated 5 years ago
wypb / spark-summit-2017-Europe
Spark Summit 2017 Europe slides (PPT) download
☆43 · Nov 2, 2017 · Updated 8 years ago
cloudera / emailarchive
Hadoop for archiving email
☆23 · Sep 29, 2011 · Updated 14 years ago
rojopolis / Take-Terraform-to-the-Next-Level
Examples and demos for the course 'Take Terraform to the Next Level' (https://learning.oreilly.com/search/?query=take%20terraform%20to%20…
☆23 · Apr 12, 2024 · Updated 2 years ago
Cascading / CoPA
Cascading plus City of Palo Alto open data
☆29 · Mar 3, 2013 · Updated 13 years ago
mahmoudparsian / data-algorithms-book
MapReduce, Spark, Java, and Scala for Data Algorithms Book
☆1,081 · Oct 14, 2024 · Updated last year
t3rmin4t0r / notes
Random implementation notes
☆34 · Apr 23, 2013 · Updated 13 years ago
ThinkBigAnalytics / Hive-Extensions-from-Think-Big-Analytics
Reusable code for Hive
☆16 · Aug 19, 2014 · Updated 11 years ago
mraad / hex-trips
Spark, Cassandra, Tessellation and ArcGIS
☆10 · Jan 18, 2015 · Updated 11 years ago
randerzander / jupyter-service
Ambari Service definition for a Jupyter (IPython3) Notebook service
☆41 · Jul 1, 2016 · Updated 10 years ago
high-performance-spark / high-performance-spark-examples
Examples for High Performance Spark
☆532 · May 3, 2026 · Updated 2 months ago
elephantscale / hadoop-book
'Hadoop illuminated' hadoop book
☆172 · Aug 12, 2022 · Updated 3 years ago
jmarkham / yarn-book
Code samples for the book
☆39 · Sep 10, 2013 · Updated 12 years ago
sequenceiq / sequenceiq-samples
SequenceIQ Hadoop examples
☆114 · Oct 26, 2015 · Updated 10 years ago