Data and example code for Programming Pig, by Alan F. Gates
☆187 Oct 15, 2016 Updated 9 years ago
Alternatives and similar repositories for programmingpig
Users that are interested in programmingpig are comparing it to the libraries listed below.
- Collection of Pig scripts that I use for my talks and workshops ☆39 Apr 30, 2013 Updated 13 years ago
- ☆44 Jul 24, 2017 Updated 8 years ago
- Hadoop library for large-scale data processing, now an Apache Incubator project ☆581 Jul 8, 2014 Updated 11 years ago
- Few scripts to automate daily data loads from RDBMS to Partitioned Avro Hive table ☆30 Sep 25, 2014 Updated 11 years ago
- ☆26 Mar 18, 2016 Updated 10 years ago
- Meta-repository of big data tools -- source and essential plugins for hadoop, pig, wukong, storm, kafka etc. ☆30 Jun 29, 2014 Updated 11 years ago
- Eclipse plugin for Apache Pig ☆33 Jul 22, 2013 Updated 12 years ago
- This is a HOWTO for collecting data in Ruby and Python applications and sending it to S3 via Kafka. ☆31 Sep 3, 2012 Updated 13 years ago
- Tools for analysing and visualising activity around Twitter backchannels ☆26 Nov 10, 2012 Updated 13 years ago
- Apache Pig plugin for Eclipse ☆12 Feb 28, 2017 Updated 9 years ago
- Bigtop is a project for the development of packaging and tests of the Apache Hadoop ecosystem. The primary goal of Bigtop is to build a … ☆51 Jul 4, 2011 Updated 14 years ago
- Oozie Samples ☆51 Jan 11, 2014 Updated 12 years ago
- Hadoop Data Integration with various databases, ftp servers, salesforce. Incremental update, dedup, append, merge your data on Hadoop. ☆92 Apr 11, 2013 Updated 13 years ago
- Examples of use of pig scripting languages capabilities ☆39 Aug 1, 2016 Updated 9 years ago
- A generator for synthetic streams of financial transactions. ☆16 Feb 3, 2014 Updated 12 years ago
- AI Class Wiki ☆11 Oct 11, 2011 Updated 14 years ago
- All Certification and preparation, examples & others ☆11 Oct 18, 2018 Updated 7 years ago
- Python Client for WebHDFS REST API ☆43 May 8, 2015 Updated 11 years ago
- Machine learning and natural language processing with Apache Pig ☆53 Dec 17, 2013 Updated 12 years ago
- Twitter's collection of LZO and Protocol Buffer-related Hadoop, Pig, Hive, and HBase code. ☆1,134 Apr 10, 2023 Updated 3 years ago
- Piglet is a DSL for writing Pig scripts in Ruby ☆83 Jul 21, 2010 Updated 15 years ago
- Tool to help users migrate large relational databases into Hadoop clusters. ☆67 Mar 23, 2012 Updated 14 years ago
- A simple Scala Based Project Template for Apache Spark ☆21 Oct 21, 2016 Updated 9 years ago
- Code repositories for the book "Introducción a Apache Spark para empezar a programar el Big Data" ☆14 Nov 22, 2015 Updated 10 years ago
- Spider/Parser for gathering the election data from Russian Election Committee website ☆16 Aug 31, 2015 Updated 10 years ago
- Apache Sqoop Cookbook ☆36 Dec 30, 2013 Updated 12 years ago
- Asakusa Framework Examples ☆24 Jan 7, 2021 Updated 5 years ago
- diff large files without running out of memory; only unified format; probably buggy, but ~no memory usage ☆14 Mar 6, 2014 Updated 12 years ago
- Code samples for the book ☆39 Sep 10, 2013 Updated 12 years ago
- Introductory sample scala app using Apache Spark Streaming to accept data from Kafka and write a summary to Cassandra. ☆22 Dec 5, 2018 Updated 7 years ago
- gathering point for open source OCR scripts and diffs ☆43 Jun 27, 2014 Updated 11 years ago
- Functional testing framework for Big Data pipelines. ☆59 Jul 6, 2023 Updated 2 years ago
- All artifacts related to the Hortonworks Data Platform ☆19 Dec 16, 2022 Updated 3 years ago
- Real-time analytics in Apache Flume ☆51 Feb 2, 2016 Updated 10 years ago
- DKPro WSD: A Java framework for word sense disambiguation ☆21 Nov 16, 2022 Updated 3 years ago
- ☆195 Jun 21, 2022 Updated 3 years ago
- A subscriber endpoint for Flickr's real-time PuSH feed ☆28 Oct 6, 2018 Updated 7 years ago
- sample oozie workflows ☆17 Jun 13, 2017 Updated 9 years ago
- Clojure core.async patterns ☆12 Jan 30, 2019 Updated 7 years ago