walmartlabs/mupd8

Muppet

☆128

Alternatives and similar repositories for mupd8

Users that are interested in mupd8 are comparing it to the libraries listed below.

apache / incubator-retired-s4
Mirror of Apache S4
☆44 · Dec 10, 2018 · Updated 7 years ago
ottogroup / SPQR
Spooker is a dynamic framework for processing high volume data streams via processing pipelines
☆30 · Feb 1, 2016 · Updated 10 years ago
JoshuaFox / Spark-Cassandra-Collabfiltering
Collaborative filtering with MLLib on Spark based on data in Cassandra
☆21 · Mar 11, 2022 · Updated 4 years ago
YahooArchive / samoa
SAMOA (Scalable Advanced Massive Online Analysis) is an open-source platform for mining big data streams.
☆427 · Mar 28, 2016 · Updated 10 years ago
project-z / mutton
The core bitmapping indexing code for project-z
☆21 · Jul 16, 2013 · Updated 13 years ago
facebookarchive / hive-io-experimental
Hive I/O Library
☆67 · Oct 28, 2021 · Updated 4 years ago
brightcove-archive / ooyala_scamr
A Hadoop map reduce framework for Scala.
☆15 · Apr 21, 2016 · Updated 10 years ago
sujitpal / hia-examples
Hadoop In Action Examples
☆40 · Apr 26, 2021 · Updated 5 years ago
alienrobotwizard / sounder
A grouping of Apache Pig examples.
☆65 · Oct 13, 2020 · Updated 5 years ago
cdapio / tigon
High Throughput Real-time Stream Processing Framework
☆284 · Apr 5, 2017 · Updated 9 years ago
sritchie / summingbird-workshop
Summingbird Workshop at Lambda Jam 2013.
☆24 · Aug 21, 2018 · Updated 7 years ago
spotify / crunch-lib
Useful reusable pipeline components for Crunch jobs
☆27 · Feb 10, 2015 · Updated 11 years ago
amplab / shark
Development in Shark has been ended.
☆992 · Aug 11, 2015 · Updated 10 years ago
LinkedInAttic / datafu
Hadoop library for large-scale data processing, now an Apache Incubator project
☆581 · Jul 8, 2014 · Updated 12 years ago
cutting / trevni
a column file format
☆133 · Sep 25, 2012 · Updated 13 years ago
lintool / Cloud9
Cloud9 is a Hadoop toolkit for working with big data
☆237 · Dec 15, 2015 · Updated 10 years ago
yahoo / storm-yarn
Storm-yarn enables Storm clusters to be deployed into machines managed by Hadoop YARN.
☆417 · Jul 21, 2023 · Updated 3 years ago
sriksun / Ivory
Data Management + Feed Processing Platform over Hadoop
☆27 · May 8, 2013 · Updated 13 years ago
tdunning / pig-vector
Mahout vector encoding for pig
☆53 · Nov 20, 2022 · Updated 3 years ago
facebookarchive / hadoop-20
Facebook's Realtime Distributed FS based on Apache Hadoop 0.20-append
☆874 · Oct 10, 2014 · Updated 11 years ago
forcedotcom / phoenix
☆558 · Feb 12, 2022 · Updated 4 years ago
nathanmarz / elephantdb
Distributed database specialized in exporting key/value data from Hadoop
☆558 · Jun 27, 2014 · Updated 12 years ago
ThinkBigAnalytics / Hive-Extensions-from-Think-Big-Analytics
Reusable code for Hive
☆16 · Aug 19, 2014 · Updated 11 years ago
LinkedInAttic / Cubert
Fast and efficient batch computation engine for complex analysis and reporting of massive datasets on Hadoop
☆245 · Aug 24, 2015 · Updated 10 years ago
sujitpal / mia-scala-examples
Mahout Examples
☆26 · Aug 2, 2016 · Updated 9 years ago
trulia / thoth
Thoth is a real-time solr monitor and search analysis engine. It's a set of tools that can help you collect, visualize and leverage data …
☆71 · Dec 17, 2014 · Updated 11 years ago
tdunning / Plume
Explorations relative to cloning FlumeJava
☆94 · Oct 13, 2020 · Updated 5 years ago
jpatanooga / Caduceus
Set of example algorithm implementations focused on statistics and machine learning
☆31 · Apr 11, 2011 · Updated 15 years ago
twitter-archive / elephant-twin
Elephant Twin is a framework for creating indexes in Hadoop
☆99 · Oct 12, 2020 · Updated 5 years ago
twitter-archive / clockworkraven
Human-Powered Data Analysis with Mechanical Turk
☆300 · Nov 28, 2012 · Updated 13 years ago
square / cascading-helpers
A whole bunch of functions, filters, and other tools that make writing Cascading flows a joy
☆55 · Mar 19, 2023 · Updated 3 years ago
LinkedInAttic / white-elephant
Hadoop log aggregator and dashboard
☆190 · Oct 29, 2013 · Updated 12 years ago
cloudera / kitten
The fast and fun way to write YARN applications.
☆136 · Nov 14, 2018 · Updated 7 years ago
palantir / Sysmon
A lightweight platform monitoring tool for Java VMs
☆161 · Dec 6, 2016 · Updated 9 years ago
kj-ki / tpc-h-impala
TPC-H Benchmark on Cloudera Impala
☆19 · Apr 25, 2013 · Updated 13 years ago
twitter-archive / ambrose
A platform for visualization and real-time monitoring of data workflows
☆1,170 · Jan 22, 2020 · Updated 6 years ago
kijiproject / kiji-bento
Kiji BentoBox: Developer SDK for Kiji including a standalone zero-configuration HBase micro-cluster
☆25 · Sep 26, 2014 · Updated 11 years ago
Netflix / zeno
Netflix's In-Memory Data Propagation Framework
☆200 · Mar 4, 2024 · Updated 2 years ago
stripe-archive / herringbone
Tools for working with parquet, impala, and hive
☆135 · Jan 4, 2021 · Updated 5 years ago