Data-Intensive Text Processing with MapReduce
☆628Mar 3, 2021Updated 5 years ago
Alternatives and similar repositories for MapReduceAlgorithms
Users that are interested in MapReduceAlgorithms are comparing it to the libraries listed below.
- Repository for MapReduce Design Patterns (O'Reilly 2012) example source code☆234Jul 5, 2015Updated 10 years ago
- SparkR workshop for the R Users Conference (Jornadas de Usuarios de R)☆13Nov 21, 2016Updated 9 years ago
- Spark Tutorial at the University of Maryland☆38Oct 24, 2014Updated 11 years ago
- MapReduce, Spark, Java, and Scala for Data Algorithms Book☆1,081Oct 14, 2024Updated last year
- Mirror of Apache Crunch (Incubating)☆110Feb 2, 2021Updated 5 years ago
- HSAIL (BRIG) frontend for gcc☆11May 7, 2018Updated 7 years ago
- Example application for analyzing Twitter data using CDH - Flume, Oozie, Hive☆289Aug 25, 2016Updated 9 years ago
- Examples for learning spark☆19Aug 19, 2015Updated 10 years ago
- Useful reusable pipeline components for Crunch jobs☆27Feb 10, 2015Updated 11 years ago
- Provides a simple archetype to create MapReduce jobs with Maven.☆24Dec 3, 2010Updated 15 years ago
- Spark + Jupyter + Hive☆12Sep 24, 2015Updated 10 years ago
- ☆22Sep 20, 2016Updated 9 years ago
- Notes from the http://www.ml-class.org/ course taught in fall 2011☆27Feb 11, 2020Updated 6 years ago
- Fork of Microsoft/LightGBM to include support for the CEGB (Cost Efficient Gradient Boosting) algorithm. Original repository at https://g…☆13Jun 30, 2017Updated 8 years ago
- Sparking Using Java8☆17Feb 28, 2015Updated 11 years ago
- Crumbling large graphs into connected components☆12Jan 8, 2018Updated 8 years ago
- Twitter's collection of LZO and Protocol Buffer-related Hadoop, Pig, Hive, and HBase code.☆1,134Apr 10, 2023Updated 3 years ago
- A playground for market-basket analysis.☆13Dec 8, 2022Updated 3 years ago
- A Java library for computing and comparing Nilsimsa string similarity hashes.☆11May 24, 2022Updated 3 years ago
- ☆37Mar 31, 2017Updated 9 years ago
- ☆15Aug 5, 2016Updated 9 years ago
- Generate word-word similarities from Gensim's latent semantic indexing (Python)☆11Jan 10, 2017Updated 9 years ago
- This is a port of the Google+ iPad app timeline purely done with CSS3☆88Aug 2, 2012Updated 13 years ago
- Example source code accompanying O'Reilly's "Hadoop: The Definitive Guide" by Tom White☆3,507Mar 17, 2020Updated 6 years ago
- Workshop for Hadoop Operations Best Practices☆10Feb 24, 2015Updated 11 years ago
- Frontera backend to guide a crawl using PageRank, HITS or other ranking algorithms based on the link structure of the web graph, even whe…☆55May 21, 2024Updated last year
- A web scraping tutorial with Illinois Department of Revenue tax data☆26May 8, 2013Updated 12 years ago
- R library for common information retrieval metrics☆14Jun 5, 2023Updated 2 years ago
- CS 489/698 Big Data Infrastructure (Winter 2016) at the University of Waterloo☆39Mar 31, 2016Updated 10 years ago
- Simple and unified PHP abstraction library for payment gateway integration☆23Oct 6, 2011Updated 14 years ago
- Source code that accompanies the book "Hadoop in Practice, Second Edition".☆80Sep 10, 2014Updated 11 years ago
- Flask-Style URL Patterns for Django☆14Oct 9, 2015Updated 10 years ago
- SAT Live! web site☆11Apr 3, 2026Updated 2 weeks ago
- A collection of efficient utilities for a data scientist.☆41May 7, 2015Updated 10 years ago
- ☆29Jan 23, 2019Updated 7 years ago
- This repository contains code files specifically IPython notebooks for the assignments in the course "Introduction to Big Data with Apach…☆116Aug 8, 2024Updated last year
- NoSQL y_serial Python module – warehouse compressed objects with SQLite☆17Jun 24, 2015Updated 10 years ago
- Code repository for O'Reilly Hadoop Application Architectures book☆163May 26, 2015Updated 10 years ago
- Vector Space Model Framework developed for InPhO☆39May 9, 2025Updated 11 months ago