Python module that allows one to easily write and run Hadoop programs.
☆1,031Jan 9, 2018Updated 8 years ago
Alternatives and similar repositories for dumbo
Users that are interested in dumbo are comparing it to the libraries listed below.
- Python MapReduce library written in Cython. Visit us in #hadoopy on freenode. See the link below for documentation and tutorials.☆242Jan 8, 2016Updated 10 years ago
- Run MapReduce jobs on Hadoop or Amazon Web Services☆2,616Apr 2, 2026Updated 2 weeks ago
- Utilities to use Avro files from Hadoop Map/Reduce jobs and Streaming☆26Sep 10, 2013Updated 12 years ago
- Example code for "Web-Scale Computer Vision using MapReduce for Multimedia Data Mining"☆48Aug 2, 2010Updated 15 years ago
- Parallel Algorithms in Python for Hadoop/Mapreduce☆55Aug 10, 2012Updated 13 years ago
- Fast binary [de]serialization of native python types☆33Jun 15, 2010Updated 15 years ago
- WE HAVE MOVED to Apache Incubator. https://cwiki.apache.org/FLUME/ . Flume is a distributed, reliable, and available service for effici…☆944May 26, 2021Updated 4 years ago
- John Langford's original release of Vowpal Wabbit -- a fast online learning algorithm☆57Aug 1, 2024Updated last year
- A Python MapReduce and HDFS API for Hadoop☆242Jan 19, 2026Updated 3 months ago
- Fork of flaxcode htmltotext module☆13Jul 30, 2011Updated 14 years ago
- Distributed and fault-tolerant realtime computation: stream processing, continuous computation, distributed RPC, and more☆8,787Aug 16, 2017Updated 8 years ago
- Python Thrift driver for Apache Cassandra☆500May 29, 2019Updated 6 years ago
- Twitter's collection of LZO and Protocol Buffer-related Hadoop, Pig, Hive, and HBase code.☆1,134Apr 10, 2023Updated 3 years ago
- url shortener using bottle, redis and gevent☆79Jun 30, 2012Updated 13 years ago
- Mirror of Apache MRUnit☆39Dec 10, 2018Updated 7 years ago
- Zohmg is a data store for aggregation of multi-dimensional time series data, built on top of Hadoop, Dumbo and HBase.☆173Oct 16, 2012Updated 13 years ago
- A small collection of useful utilities for the Tornado Webserver☆38Jan 8, 2013Updated 13 years ago
- A light-weight queue server in python tornado, it uses memcache protocol and store queues persistently.☆46Jun 22, 2017Updated 8 years ago
- Python Client for WebHDFS REST API☆43May 8, 2015Updated 10 years ago
- a pastebin clone written in python, using bottle and mongodb☆19Jun 3, 2010Updated 15 years ago
- Tornado Hub for Eventlet☆38Mar 5, 2013Updated 13 years ago
- An asynchronous client for Amazon SES☆40Oct 16, 2012Updated 13 years ago
- Asynchronous Redis client that works within Tornado IO loop.☆77May 20, 2011Updated 14 years ago
- Distributed database specialized in exporting key/value data from Hadoop☆559Jun 27, 2014Updated 11 years ago
- Lightning-fast cluster computing in Java, Scala and Python.☆1,421Apr 8, 2014Updated 12 years ago
- A small bit of code to make the Boto library for Amazon's AWS services work in an asynchronous (and extremely hacky) manner with Tornado.…☆28Jun 20, 2011Updated 14 years ago
- Redis Sharding is a multiplexed proxy-server, designed to work with the database divided to several servers. It's a temporary substitutio…☆110Dec 1, 2016Updated 9 years ago
- Deliverance stitches together HTTP responses to theme your content☆22Apr 8, 2013Updated 13 years ago
- Scribe is a server for aggregating log data streamed in real time from a large number of servers.☆3,910Aug 27, 2020Updated 5 years ago
- A Python web crawler using Tornado and ZeroMQ☆138May 9, 2012Updated 13 years ago
- example code for "Large-scale social media analysis with Hadoop" tutorial presented at ICWSM 2010☆42Jul 16, 2010Updated 15 years ago
- non-blocking HTTP. All further development will be at mnot/thor.☆233Sep 30, 2014Updated 11 years ago
- A starting point for developers who want to build applications using the Hunch API☆18Jan 21, 2011Updated 15 years ago
- Chrome Remote Shell library for Python (including evaluations)☆15Feb 16, 2011Updated 15 years ago
- Oozie - workflow engine for Hadoop☆375Jun 8, 2017Updated 8 years ago
- Crab is a flexible, fast recommender engine for Python that integrates classic information filtering recommendation algorithms in the world…☆1,175Dec 30, 2020Updated 5 years ago
- Bluemix sample app written in Python that uses the Klout and Twitter API's to analyze the influence of individual twitter usernames☆18Jan 16, 2017Updated 9 years ago
- Bigtop is a project for the development of packaging and tests of the Apache Hadoop ecosystem. The primary goal of Bigtop is to build a …☆50Jul 4, 2011Updated 14 years ago
- HiiDef web spider framework, powers http://flavors.me☆18Apr 20, 2012Updated 13 years ago