Parquet / parquet-cpp
☆20 Updated this week
Related projects:
- The Musketeer workflow manager. ☆41 Updated 5 years ago
- Compilation and rule-based optimization framework for relational algebra. Raco is the language, optimization, and query translation layer… ☆72 Updated 6 years ago
- ☆12 Updated this week
- Myria is a scalable Analytics-as-a-Service platform based on relational algebra. ☆112 Updated 2 years ago
- libhdfs++ is a modern implementation of an HDFS client in C++11, designed for Massively Parallel Processing (MPP) applications. ☆27 Updated 9 years ago
- Quickstep Project ☆27 Updated 5 years ago
- Cascading on Apache Flink® ☆54 Updated 7 months ago
- Albis: High-Performance File Format for Big Data Systems ☆21 Updated 6 years ago
- Streaming estimation of percentiles, especially high percentiles. ☆63 Updated 11 years ago
- A composable framework for fast and scalable data analytics ☆57 Updated last year
- ☆154 Updated this week
- Ductile DB is a graph database based on Hadoop/HBase which provides a vast set of features. ☆13 Updated 6 years ago
- Scriptable scheduler for periodic Hadoop workflows ☆22 Updated 6 years ago
- Secondary index on HBase ☆18 Updated 8 years ago
- Apache Quickstep Incubator - This project is retired ☆94 Updated 5 years ago
- Deprecated, please use https://github.com/jcrist/skein or https://github.com/dask/dask-yarn instead ☆53 Updated 6 years ago
- S-Store Transactional Streaming Data Management System ☆22 Updated 4 years ago
- GlusterFS plugin for Hadoop HCFS ☆69 Updated 2 years ago
- Benchmarks of artificial neural network library for Spark MLlib ☆11 Updated 8 years ago
- ☆27 Updated 3 weeks ago
- Fast I/O plugins for Spark ☆41 Updated 3 years ago
- ☆18 Updated this week
- ByteBuffer collection classes for Java and JVM-based languages. ☆33 Updated 6 years ago
- ☆16 Updated 3 years ago
- Hadoop Profiler, or hprofiler, is a tool that analyzes on- and off-CPU workloads in distributed computing environments. ☆24 Updated 8 years ago
- Moments Sketch Code ☆40 Updated 5 years ago
- ☆28 Updated 7 years ago
- Dynamic Distributed Dimensional Data Model ☆40 Updated 4 months ago
- CUDA kernel and JNI code called by Apache Spark's MLlib. ☆19 Updated 8 years ago
- Graphulo: Accumulo library of matrix math primitives and graph algorithms ☆78 Updated 4 months ago