Data and example code for Programming Pig, by Alan F. Gates
☆186Oct 15, 2016Updated 9 years ago
Alternatives and similar repositories for programmingpig
Users that are interested in programmingpig are comparing it to the libraries listed below.
- Collection of Pig scripts that I use for my talks and workshops☆39Apr 30, 2013Updated 13 years ago
- This repository contains the Pig Latin scripts, UDFs and datasets used in the book Pig Design Patterns by Pradeep Pasupuleti, published b…☆23Apr 9, 2014Updated 12 years ago
- Hadoop library for large-scale data processing, now an Apache Incubator project☆581Jul 8, 2014Updated 11 years ago
- Few scripts to automate daily data loads from RDBMS to Partitioned Avro Hive table☆30Sep 25, 2014Updated 11 years ago
- ☆26Mar 18, 2016Updated 10 years ago
- Apache Pig plugin for Eclipse☆12Feb 28, 2017Updated 9 years ago
- Bigtop is a project for the development of packaging and tests of the Apache Hadoop ecosystem. The primary goal of Bigtop is to build a …☆50Jul 4, 2011Updated 14 years ago
- Oozie Samples☆51Jan 11, 2014Updated 12 years ago
- SQL Windowing Functions for Hadoop☆65Jun 20, 2022Updated 3 years ago
- Hadoop Data Integration with various databases, ftp servers, salesforce. Incremental update, dedup, append, merge your data on Hadoop.☆92Apr 11, 2013Updated 13 years ago
- Examples of using the Pig scripting language's capabilities☆39Aug 1, 2016Updated 9 years ago
- All Certification and preparation, examples & others☆11Oct 18, 2018Updated 7 years ago
- Python Client for WebHDFS REST API☆43May 8, 2015Updated 11 years ago
- Machine learning and natural language processing with Apache Pig☆53Dec 17, 2013Updated 12 years ago
- Twitter's collection of LZO and Protocol Buffer-related Hadoop, Pig, Hive, and HBase code.☆1,134Apr 10, 2023Updated 3 years ago
- Phoenix database adapter for Python☆16Oct 15, 2019Updated 6 years ago
- Piglet is a DSL for writing Pig scripts in Ruby☆83Jul 21, 2010Updated 15 years ago
- A simple Scala Based Project Template for Apache Spark☆21Oct 21, 2016Updated 9 years ago
- Spider/Parser for gathering the election data from Russian Election Committee website☆16Aug 31, 2015Updated 10 years ago
- Apache Sqoop Cookbook☆36Dec 30, 2013Updated 12 years ago
- diff large files without running out of memory; only unified format; probably buggy, but ~no memory usage☆14Mar 6, 2014Updated 12 years ago
- Code samples for the book☆39Sep 10, 2013Updated 12 years ago
- Spring Design Patterns and Best Practices [video], published by Packt☆13Jan 30, 2023Updated 3 years ago
- Real-time analysis and visualization with Storm-AMQ-Camel-Websockets-Highcharts integration.☆26Jan 4, 2015Updated 11 years ago
- Fureteur is a simple, configurable, fault-tolerant web crawler written in Scala☆29Oct 14, 2014Updated 11 years ago
- All artifacts related to the Hortonworks Data Platform☆19Dec 16, 2022Updated 3 years ago
- Computes and visualizes the sentiment analysis of tweets of US States in real-time using Storm.☆26Jan 8, 2015Updated 11 years ago
- Real-time analytics in Apache Flume☆51Feb 2, 2016Updated 10 years ago
- Java version of D.J. Bernstein's constant database (cdb) library.☆17Jan 30, 2026Updated 3 months ago
- sample oozie workflows☆17Jun 13, 2017Updated 8 years ago
- Clojure core.async patterns☆12Jan 30, 2019Updated 7 years ago
- An analysis of adverse drug event data using Hadoop, R, and Gephi☆44Jan 28, 2016Updated 10 years ago
- Tail a log file and send log lines automatically to a kafka topic☆19Jun 8, 2015Updated 10 years ago
- A JRuby DSL for Cascading☆41Sep 23, 2015Updated 10 years ago
- Read druid segments from hadoop☆10Jan 18, 2017Updated 9 years ago
- A set of tools for working with Omniture daily data files (hit_data.tsv) in big or small tools like Spark, Hadoop or just Python.☆37May 14, 2019Updated 7 years ago
- Spark Streaming HBase Example☆22Updated this week
- read and write JSON-stat with R☆32Sep 4, 2023Updated 2 years ago
- NCPR storage and api☆22Mar 30, 2018Updated 8 years ago