booz-allen-hamilton / culvert

Secondary indexing for structured and unstructured data in Big Table style databases.

☆44

Alternatives and similar repositories for culvert

Users who are interested in culvert are comparing it to the libraries listed below.

LinkedInAttic / datafu
Hadoop library for large-scale data processing, now an Apache Incubator project
☆582 Updated 11 years ago
etsy / Sahale
A Cascading Workflow Visualizer
☆83 Updated 2 years ago
cwensel / cascading
Cascading is a feature rich API for defining and executing complex and fault tolerant data processing flows locally or on a cluster.
☆352 Updated 8 months ago
collectivemedia / spark-hyperloglog
Interactive Audience Analytics with Spark and HyperLogLog
☆55 Updated 10 years ago
deanwampler / scala-hadoop
Using Hadoop with Scala
☆70 Updated 12 years ago
scalanlp / chalk
Chalk is a natural language processing library.
☆260 Updated 8 years ago
Cascading / cascading
All development now happens over here: https://github.com/cwensel/cascading. Cascading is a feature rich API for defining and executing c…
☆332 Updated 7 years ago
mesos / spark
Lightning-fast cluster computing in Java, Scala and Python.
☆1,427 Updated 11 years ago
nathanmarz / storm-deploy
One click deploy for Storm clusters on AWS
☆516 Updated 10 years ago
velvia / ScalaStorm
Harness the power and elegance of Scala with nathanmarz's Storm real-time system
☆249 Updated 9 years ago
spotify / hdfs2cass
Hadoop mapreduce job to bulk load data into Cassandra
☆75 Updated 3 years ago
intentmedia / mario
Functional, Typesafe, Declarative Data Pipelines
☆139 Updated 7 years ago
hmsonline / storm-cassandra
Storm Cassandra Integration
☆181 Updated 2 years ago
sonalgoyal / crux
Crux is a reporting application for HBase. Crux provides a simple web based graphical interface to access HBase, query data and create re…
☆100 Updated 12 years ago
adobe-research / spindle
Next-generation web analytics processing with Scala, Spark, and Parquet.
☆331 Updated 10 years ago
YahooArchive / oozie
Oozie - workflow engine for Hadoop
☆374 Updated 8 years ago
sonalgoyal / hiho
Hadoop Data Integration with various databases, ftp servers, salesforce. Incremental update, dedup, append, merge your data on Hadoop.
☆90 Updated 12 years ago
tresata / spark-sorted
Secondary sort and streaming reduce for Apache Spark
☆78 Updated 2 years ago
twitter / elephant-bird
Twitter's collection of LZO and Protocol Buffer-related Hadoop, Pig, Hive, and HBase code.
☆1,133 Updated 2 years ago
SinghAsDev / pankh
☆76 Updated 10 years ago
dlyubimov / HBase-Lattice
HBase-based BI "OLAP-ish" solution
☆59 Updated 12 years ago
ddf-project / DDF
Distributed DataFrame: Productivity = Power x Simplicity For Scientists & Engineers, on any Data Engine
☆166 Updated 4 years ago
massie / spark-parquet-example
Example project to show how to use Spark to read and write Avro/Parquet files
☆50 Updated 12 years ago
acrosa / scala-redis
A scala library for connecting to a redis server, or a cluster of redis nodes using consistent hashing on the client side.
☆146 Updated 9 years ago
medale / spark-mail
Tutorial on parsing Enron email to Avro and then explore the email set using Spark.
☆52 Updated last year
tresata / spark-scalding
Use Cascading Taps and Scalding DSL with Spark
☆49 Updated 9 years ago
alexanderdean / Unified-Log-Processing
Supporting material (code, schemas etc) for Unified Log Processing (Manning Publications)
☆98 Updated 3 years ago
zinniasystems / Nectar
Open source framework for predictive modeling on Apache Hadoop
☆34 Updated 11 years ago
scalding-io / ProgrammingWithScalding
Programming MapReduce with Scalding
☆82 Updated 10 years ago
VeritoneAlpha / jaws-spark-sql-rest
☆92 Updated 8 years ago