YahooArchive/howl



Common metadata layer for Hadoop's Map Reduce, Pig, and Hive

☆77

Alternatives and similar repositories for howl

Users who are interested in howl are comparing it to the libraries listed below.


anthonyu / Sizzle
A compiler and runtime for Google's Sawzall language, optimized for Hadoop
☆41 · Apr 26, 2013 · Updated 13 years ago
YahooArchive / oozie
Oozie - workflow engine for Hadoop
☆373 · Jun 8, 2017 · Updated 9 years ago
rjurney / Cloud-Stenography
Main Repo
☆15 · Jun 24, 2010 · Updated 16 years ago
cloudera / bigtop
Bigtop is a project for the development of packaging and tests of the Apache Hadoop ecosystem. The primary goal of Bigtop is to build a …
☆51 · Jul 4, 2011 · Updated 15 years ago
tdunning / Plume
Explorations relative to cloning FlumeJava
☆94 · Oct 13, 2020 · Updated 5 years ago
twitter / elephant-bird
Twitter's collection of LZO and Protocol Buffer-related Hadoop, Pig, Hive, and HBase code.
☆1,134 · Apr 10, 2023 · Updated 3 years ago
spullara / havrobase
Use Avro to store all your values in HBase instead of regular columns
☆76 · Dec 1, 2017 · Updated 8 years ago
Cascading / cascading-dbmigrate
Tool to help users migrate large relational databases into Hadoop clusters.
☆67 · Mar 23, 2012 · Updated 14 years ago
s4 / core
S4 is a general-purpose, distributed, scalable, partially fault-tolerant, pluggable platform that allows programmers to easily develop ap…
☆233 · Mar 4, 2011 · Updated 15 years ago
tdunning / pig-vector
Mahout vector encoding for pig
☆53 · Nov 20, 2022 · Updated 3 years ago
romainr / PigEditor
Eclipse plugin for Apache Pig
☆33 · Jul 22, 2013 · Updated 13 years ago
iconara / piglet
Piglet is a DSL for writing Pig scripts in Ruby
☆83 · Jul 21, 2010 · Updated 16 years ago
ogrisel / pignlproc
Apache Pig utilities to build training corpora for machine learning / NLP out of public Wikipedia and DBpedia dumps.
☆163 · Nov 8, 2022 · Updated 3 years ago
dvryaboy / pig
Mirror of Apache Pig
☆18 · Jul 9, 2013 · Updated 13 years ago
cloudera / flume
WE HAVE MOVED to Apache Incubator. https://cwiki.apache.org/FLUME/ . Flume is a distributed, reliable, and available service for effici…
☆943 · May 26, 2021 · Updated 5 years ago
jaxlaw / fairy
esper made easy
☆15 · Jul 6, 2022 · Updated 4 years ago
akkumar / hbasene
HBase as the backing store for the TF-IDF representations for Lucene
☆110 · May 14, 2010 · Updated 16 years ago
mattb / pig-redis
Redis bulk-loader for Apache Pig
☆40 · Apr 21, 2012 · Updated 14 years ago
julienledem / Pig-scripting-examples
Examples of using Pig's scripting language capabilities
☆39 · Aug 1, 2016 · Updated 9 years ago
toddlipcon / gremlins
Gremlins is a python framework for fault-testing distributed systems
☆123 · May 12, 2014 · Updated 12 years ago
basho / innostore
Innostore is a simple Erlang API to Embedded InnoDB.
☆32 · Feb 17, 2012 · Updated 14 years ago
wilbur / Piggybank
A repository of user-defined functions for Apache Pig
☆16 · Sep 20, 2010 · Updated 15 years ago
mesos / mesos
PLEASE NOTE: Mesos is now hosted in Apache git! Get it using git clone https://git-wip-us.apache.org/repos/asf/mesos.git
☆416 · Jan 22, 2018 · Updated 8 years ago
kevinweil / stream-to-hdfs
A simple utility for streaming stdin to a file in HDFS
☆25 · Feb 4, 2010 · Updated 16 years ago
twitter-archive / pycascading
A Python wrapper for Cascading
☆220 · Dec 30, 2019 · Updated 6 years ago
membase / libconflate
A library for managing configuration of clustered applications -- Bringing it all together.
☆26 · Mar 20, 2014 · Updated 12 years ago
codahale / shore
[ABANDONED] What makes Jersey fun.
☆16 · Aug 20, 2010 · Updated 15 years ago
twitter / hadoop-lzo
Refactored version of code.google.com/hadoop-gpl-compression for hadoop 0.20
☆548 · Apr 24, 2024 · Updated 2 years ago
flumebase / flumebase
Continuous Streaming SQL Queries for Flume
☆96 · Dec 30, 2011 · Updated 14 years ago
LinkedInAttic / datafu
Hadoop library for large-scale data processing, now an Apache Incubator project
☆581 · Jul 8, 2014 · Updated 12 years ago
LinkedInAttic / kamikaze
DocId set compression and set operation library
☆22 · Mar 7, 2014 · Updated 12 years ago
jzachr / goldenorb
GoldenOrb is an open-source implementation of Pregel, Google's graph processing framework
☆293 · Jun 29, 2022 · Updated 4 years ago
sonalgoyal / hiho
Hadoop Data Integration with various databases, ftp servers, salesforce. Incremental update, dedup, append, merge your data on Hadoop.
☆92 · Apr 11, 2013 · Updated 13 years ago
Cascading / cascading
All development now happens over here: https://github.com/cwensel/cascading. Cascading is a feature rich API for defining and executing c…
☆332 · Nov 29, 2018 · Updated 7 years ago
lintool / Cloud9
Cloud9 is a Hadoop toolkit for working with big data
☆237 · Dec 15, 2015 · Updated 10 years ago
ning / Arecibo
Real-time Monitoring
☆29 · May 14, 2012 · Updated 14 years ago
maw / ostrich
stats collector & reporter for scala servers
☆19 · Jan 6, 2010 · Updated 16 years ago
apache / whirr
Mirror of Apache Whirr
☆96 · Apr 28, 2017 · Updated 9 years ago
YahooArchive / messenger-sdk
Yahoo! Messenger API SDK
☆31 · Sep 1, 2010 · Updated 15 years ago