apache / crunch
Mirror of Apache Crunch (Incubating)
☆104 Updated 3 years ago
Alternatives and similar repositories for crunch:
Users who are interested in crunch are comparing it to the libraries listed below.
- Apache Tephra: Transactions for HBase. ☆157 Updated 4 months ago
- Mirror of Apache HCatalog ☆60 Updated last year
- Mirror of Apache Atlas (Incubating) ☆95 Updated last year
- Mirror of Apache Slider ☆78 Updated 6 years ago
- Fast JVM collection ☆59 Updated 9 years ago
- Mirror of Apache Falcon ☆103 Updated 5 years ago
- Fast and efficient batch computation engine for complex analysis and reporting of massive datasets on Hadoop ☆244 Updated 9 years ago
- Cascading on Apache Flink® ☆54 Updated 11 months ago
- Mirror of Apache Lens ☆60 Updated 5 years ago
- Mirror of Apache Twill ☆69 Updated 4 years ago
- Mirror of Apache Apex malhar ☆132 Updated 5 years ago
- The Apache Gora open source framework provides an in-memory data model and persistence for big data. ☆120 Updated 11 months ago
- hRaven collects run time data and statistics from MapReduce jobs in an easily queryable format ☆126 Updated 3 years ago
- All development now happens over here: https://github.com/cwensel/cascading. Cascading is a feature rich API for defining and executing c… ☆331 Updated 6 years ago
- XPath likeness for Avro ☆35 Updated last year
- The SpliceSQL Engine ☆167 Updated last year
- Hadoop log aggregator and dashboard ☆191 Updated 11 years ago
- A Real-Time Analytical Processing (RTAP) example using Spark/Shark ☆51 Updated 10 years ago
- Multidimensional data storage with rollups for numerical data ☆266 Updated last year
- A simple storm performance/stress test ☆74 Updated last year
- Enabling Spark Optimization through Cross-stack Monitoring and Visualization ☆47 Updated 7 years ago
- Mirror of Apache Sentry ☆34 Updated 5 years ago
- SamzaSQL: Streaming SQL implementation on top of Apache Samza and Apache Kafka ☆29 Updated 8 years ago
- Cascading is a feature rich API for defining and executing complex and fault tolerant data processing flows locally or on a cluster. ☆348 Updated 7 months ago
- Obsolete - superseded by Apache Calcite ☆235 Updated 4 years ago
- Google Dataflow Runner for Apache Flink™ (deprecated; please use the up-to-date Beam Runner) ☆88 Updated 8 years ago
- Schedoscope is a scheduling framework for painfree agile development, testing, (re)loading, and monitoring of your datahub, lake, or what… ☆96 Updated 5 years ago
- Mirror of Apache Hama ☆131 Updated 4 years ago
- A library to expose more of Apache Spark's metrics system ☆146 Updated 5 years ago
- Mirror of Apache Spark ☆57 Updated 9 years ago