Python module that allows one to easily write and run Hadoop programs.
☆1,031 · Jan 9, 2018 · Updated 8 years ago
Alternatives and similar repositories for dumbo
Users that are interested in dumbo are comparing it to the libraries listed below.
- Python MapReduce library written in Cython. Visit us in #hadoopy on freenode. See the link below for documentation and tutorials. ☆242 · Jan 8, 2016 · Updated 10 years ago
- Run MapReduce jobs on Hadoop or Amazon Web Services ☆2,615 · Mar 24, 2023 · Updated 3 years ago
- Utilities to use Avro files from Hadoop Map/Reduce jobs and Streaming ☆26 · Sep 10, 2013 · Updated 12 years ago
- Twitter on Tornado ☆47 · Sep 28, 2009 · Updated 16 years ago
- Fast binary [de]serialization of native python types ☆33 · Jun 15, 2010 · Updated 15 years ago
- WE HAVE MOVED to Apache Incubator. https://cwiki.apache.org/FLUME/ . Flume is a distributed, reliable, and available service for effici… ☆944 · May 26, 2021 · Updated 4 years ago
- John Langford's original release of Vowpal Wabbit -- a fast online learning algorithm ☆57 · Aug 1, 2024 · Updated last year
- Refactored version of code.google.com/hadoop-gpl-compression for hadoop 0.20 ☆551 · Apr 24, 2024 · Updated last year
- A Python MapReduce and HDFS API for Hadoop ☆241 · Jan 19, 2026 · Updated 2 months ago
- RHadoop ☆762 · Nov 24, 2015 · Updated 10 years ago
- Ruby on Hadoop: Efficient, effective Hadoop streaming & bulk data processing. Write micro scripts for terabyte-scale data ☆494 · Jun 19, 2014 · Updated 11 years ago
- This Project aims to implement an unofficial Android client for the service at https://getamen.com ☆15 · Aug 11, 2012 · Updated 13 years ago
- Distributed and fault-tolerant realtime computation: stream processing, continuous computation, distributed RPC, and more ☆8,789 · Aug 16, 2017 · Updated 8 years ago
- Fork of flaxcode htmltotext module ☆13 · Jul 30, 2011 · Updated 14 years ago
- Yahoo!'s topic modelling framework using Latent Dirichlet Allocation ☆337 · Sep 21, 2011 · Updated 14 years ago
- Python Thrift driver for Apache Cassandra ☆500 · May 29, 2019 · Updated 6 years ago
- Twitter's collection of LZO and Protocol Buffer-related Hadoop, Pig, Hive, and HBase code. ☆1,132 · Apr 10, 2023 · Updated 2 years ago
- url shortener using bottle, redis and gevent ☆80 · Jun 30, 2012 · Updated 13 years ago
- Mirror of Apache MRUnit ☆38 · Dec 10, 2018 · Updated 7 years ago
- Zohmg is a data store for aggregation of multi-dimensional time series data, built on top of Hadoop, Dumbo and HBase. ☆173 · Oct 16, 2012 · Updated 13 years ago
- A small collection of useful utilities for the Tornado Webserver ☆38 · Jan 8, 2013 · Updated 13 years ago
- A light-weight queue server in python tornado, it uses memcache protocol and store queues persistently. ☆46 · Jun 22, 2017 · Updated 8 years ago
- Collection of functions 4 R and CouchDB interaction ☆31 · Mar 5, 2017 · Updated 9 years ago
- Tornado Hub for Eventlet ☆38 · Mar 5, 2013 · Updated 13 years ago
- Scribe logging module for nginx ☆27 · Apr 7, 2011 · Updated 14 years ago
- An asynchronous client for Amazon SES ☆41 · Oct 16, 2012 · Updated 13 years ago
- Asynchronous Redis client that works within Tornado IO loop. ☆77 · May 20, 2011 · Updated 14 years ago
- Distributed database specialized in exporting key/value data from Hadoop ☆558 · Jun 27, 2014 · Updated 11 years ago
- Lightning-fast cluster computing in Java, Scala and Python. ☆1,425 · Apr 8, 2014 · Updated 11 years ago
- A small bit of code to make the Boto library for Amazon's AWS services work in an asynchronous (and extremely hacky) manner with Tornado.… ☆28 · Jun 20, 2011 · Updated 14 years ago
- Redis Sharding is a multiplexed proxy-server, designed to work with the database divided to several servers. It's a temporary substitutio… ☆110 · Dec 1, 2016 · Updated 9 years ago
- Deliverance stitches together HTTP responses to theme your content ☆22 · Apr 8, 2013 · Updated 12 years ago
- Scribe is a server for aggregating log data streamed in real time from a large number of servers. ☆3,914 · Aug 27, 2020 · Updated 5 years ago
- A Python web crawler using Tornado and ZeroMQ ☆139 · May 9, 2012 · Updated 13 years ago
- non-blocking HTTP. All further development will be at mnot/thor. ☆233 · Sep 30, 2014 · Updated 11 years ago
- Chrome Remote Shell library for Python (including evaluations) ☆15 · Feb 16, 2011 · Updated 15 years ago
- Oozie - workflow engine for Hadoop ☆374 · Jun 8, 2017 · Updated 8 years ago
- Crab is a flexible, fast recommender engine for Python that integrates classic information filtering recommendation algorithms in the world… ☆1,176 · Dec 30, 2020 · Updated 5 years ago
- Bigtop is a project for the development of packaging and tests of the Apache Hadoop ecosystem. The primary goal of Bigtop is to build a … ☆50 · Jul 4, 2011 · Updated 14 years ago