A Cascading Workflow Visualizer
☆83 May 9, 2023 Updated 2 years ago
Alternatives and similar repositories for Sahale
Users that are interested in Sahale are comparing it to the libraries listed below
- Integrate the GA4GH schemas and probably a scala impl of the service. ☆14 May 20, 2016 Updated 9 years ago
- Set up tools for running a few DL libraries on CDH and CDSW ☆17 Jul 23, 2020 Updated 5 years ago
- Structured output benchmarks comparing DSPy and BAML with different LLMs ☆27 Dec 23, 2025 Updated 2 months ago
- Document classification with Apache Spark on an American Classic ☆10 Sep 25, 2015 Updated 10 years ago
- A utility for generating Oozie workflows from a YAML definition ☆49 Mar 4, 2019 Updated 7 years ago
- Cassandra Dataset Manager ☆14 Sep 1, 2017 Updated 8 years ago
- Scala stuff ☆18 Jun 13, 2019 Updated 6 years ago
- Refactor utilities for Java code ☆13 Nov 2, 2017 Updated 8 years ago
- Cascading on Apache Flink® ☆54 Feb 5, 2024 Updated 2 years ago
- Generate scrape files for Prometheus from PuppetDB ☆16 Jul 18, 2025 Updated 7 months ago
- Utility for benchmarking changes in Spark using TPC-DS workloads ☆16 Jun 3, 2021 Updated 4 years ago
- Programming MapReduce with Scalding ☆82 Dec 5, 2015 Updated 10 years ago
- The Scalding tutorial as a standalone SBT project ☆51 Oct 16, 2017 Updated 8 years ago
- Mockito for Kotlin ☆14 Feb 16, 2019 Updated 7 years ago
- Library for per-file client-side encryption in Hadoop FileSystems such as HDFS or S3. ☆48 Updated this week
- An AWS SDK-backed FileSystem driver for Hadoop ☆64 Oct 13, 2020 Updated 5 years ago
- Relay.js-compatible GraphQL adaptor for ES4J ☆24 Nov 25, 2016 Updated 9 years ago
- ☆22 Jun 10, 2018 Updated 7 years ago
- Pinball is a scalable workflow manager ☆1,043 Dec 10, 2019 Updated 6 years ago
- This is the example code repository for Getting Started with Impala by John Russell (O'Reilly Media) ☆22 Aug 20, 2017 Updated 8 years ago
- Tutorial on parsing Enron email to Avro and then exploring the email set using Spark. ☆52 Jul 11, 2024 Updated last year
- PonySDK is an open source project and application that uses open source tools built on the Java platform to help you develop Web applicat… ☆23 Updated this week
- Timberlake is a Job Tracker for Hadoop. ☆177 Jan 24, 2020 Updated 6 years ago
- Hive + Avro. Serde for working with Avro in Hive ☆59 Dec 16, 2023 Updated 2 years ago
- example of using RDFlib to take a CSV and make triples from it ☆26 Apr 12, 2018 Updated 7 years ago
- A sane date/time python interface #hubspot-open-source ☆58 Feb 6, 2019 Updated 7 years ago
- A Finagle based image processor. ☆39 Jan 13, 2016 Updated 10 years ago
- ⛅ Run OpenVSCode Server in Google Cloud Shell ☆11 Dec 22, 2023 Updated 2 years ago
- Offline Hadoop Elasticsearch Index Building and Tools For Lambda Architectures ☆31 Feb 13, 2024 Updated 2 years ago
- Interactive notebooks for trying analyses and exploring datasets ☆32 Aug 10, 2015 Updated 10 years ago
- Enabling queries on compressed data. ☆282 Dec 16, 2023 Updated 2 years ago
- Supporting material (code, schemas etc) for Unified Log Processing (Manning Publications) ☆98 Jul 22, 2022 Updated 3 years ago
- Opera Logo ☆28 Jun 25, 2019 Updated 6 years ago
- Python wrappers for the FirecREST API ☆12 Dec 23, 2025 Updated 2 months ago
- Use Solr clients/tools with ElasticSearch ☆77 Feb 25, 2013 Updated 13 years ago
- personalized collection of books ☆11 Jan 24, 2021 Updated 5 years ago
- A collection of Apache Parquet add-on modules ☆30 Updated this week
- Architecture Wars: MVC strikes back ☆11 Mar 18, 2018 Updated 7 years ago
- An API that wraps around the Tor control port to create ad-hoc hidden services ☆10 Sep 22, 2019 Updated 6 years ago