Herd is a managed data lake for the cloud. The Herd unified data catalog helps separate storage from compute in the cloud. Manage petabytes of data and make it accessible for data processing and analytical purposes by any cloud compute platform.
☆138 · Oct 1, 2022 · Updated 3 years ago
Alternatives and similar repositories for herd
Users that are interested in herd are comparing it to the libraries listed below
- Herd-UI is a search and discovery tool for business and technical users. Everyone in your organization can use Herd-UI to browse and unde… ☆16 · Oct 1, 2022 · Updated 3 years ago
- Herd-MDL, a turnkey managed data lake in the cloud. See https://finraos.github.io/herd-mdl/ for more information. ☆15 · Jul 17, 2024 · Updated last year
- Apache Zeppelin Service for Apache Ambari Service. Installation and management of Zeppelin via Ambari. ☆14 · Jan 23, 2016 · Updated 10 years ago
- Hortonworks Data Platform Data Generation Tool ☆13 · Nov 30, 2017 · Updated 8 years ago
- A curated list of awesome PrestoDB / Trino software, libraries, tools and resources ☆18 · Jun 28, 2021 · Updated 4 years ago
- Scala API for Apache Spark SQL high-order functions ☆14 · Aug 4, 2023 · Updated 2 years ago
- Gatekeeper is a self-serviced web application allowing users to make requests for temporary access to EC2 & RDS instances running in AWS … ☆28 · Dec 16, 2023 · Updated 2 years ago
- Hadoop YARN & MapReduce Memory Calculator ☆13 · Nov 9, 2015 · Updated 10 years ago
- Ambari Service definition for deploying R & RHadoop libraries ☆18 · Aug 3, 2015 · Updated 10 years ago
- An AWS SDK-backed FileSystem driver for Hadoop ☆64 · Oct 13, 2020 · Updated 5 years ago
- HashCats Auto Clicker is a versatile tool that enhances your gaming experience by automating various actions within the HashCats game ☆18 · Updated this week
- Maelstrom is an open source Kafka integration with Spark that is designed to be developer friendly, high performance (millisecond stream … ☆22 · Feb 6, 2017 · Updated 9 years ago
- Library and a Framework for building fast, scalable, fault-tolerant Data APIs based on Akka, Avro, ZooKeeper and Kafka ☆25 · Oct 16, 2020 · Updated 5 years ago
- Marquez Web UI ☆21 · Nov 13, 2020 · Updated 5 years ago
- Quickly deploy Hadoop with the help of Ansible and Apache Ambari ☆38 · Jul 15, 2015 · Updated 10 years ago
- Build configuration-driven ETL pipelines on Apache Spark ☆161 · Oct 4, 2022 · Updated 3 years ago
- docker image to deploy rabbitmq cluster on mesos with one marathon app ☆10 · Oct 12, 2017 · Updated 8 years ago
- Dockerfile and artifacts for running a self-contained HDP 2.3 "cluster" in a docker container ☆10 · Aug 30, 2016 · Updated 9 years ago
- Combination of Dockerized Hortonworks projects and other Hadoop ecosystem components ☆10 · Oct 11, 2019 · Updated 6 years ago
- kafka-connect-s3 : Ingest data from Kafka to Object Stores(s3) ☆95 · Apr 4, 2019 · Updated 6 years ago
- Building custom data sources for Apache Spark, in Java. ☆12 · Oct 12, 2020 · Updated 5 years ago
- Elastic Discovery - Help applications that don't quite work in the cloud better handle autoscaling and other cloud events. ☆10 · Oct 26, 2015 · Updated 10 years ago
- Giter8 template for a simple project that uses sbt-crossproject. ☆11 · Jul 16, 2018 · Updated 7 years ago
- Reusable code for Hive ☆16 · Aug 19, 2014 · Updated 11 years ago
- Spawns JupyterHub single user servers in Marathon ☆11 · Oct 8, 2017 · Updated 8 years ago
- HDFS Automatic Snapshot Service for Linux ☆11 · Oct 17, 2016 · Updated 9 years ago
- Workshop for Hadoop Operations Best Practices ☆10 · Feb 24, 2015 · Updated 11 years ago
- Implementation of a Big Data (batch and stream) distributed processing engine in Java using Akka actors. ☆12 · Feb 20, 2023 · Updated 3 years ago
- Java 8's core concepts are explained by examples. ☆12 · Oct 12, 2018 · Updated 7 years ago
- Open source task scheduler with dependency management ☆15 · Jul 1, 2018 · Updated 7 years ago
- Prescriptive Applications over Kite and Hadoop ☆12 · Oct 14, 2015 · Updated 10 years ago
- Content Data Store (HDFS/HBase) ☆13 · Dec 1, 2016 · Updated 9 years ago
- Data sets and Vagrant script to provision a virtual machine for Apache Calcite development ☆30 · Mar 24, 2023 · Updated 2 years ago
- Convert a CSV file to ORCFile ☆26 · Apr 10, 2019 · Updated 6 years ago
- Embed any webapp/website as Ambari view! ☆25 · Feb 26, 2016 · Updated 10 years ago
- Apache DataLab (incubating) ☆152 · Oct 3, 2023 · Updated 2 years ago
- SamzaSQL: Streaming SQL implementation on top of Apache Samza and Apache Kafka ☆29 · Jun 8, 2016 · Updated 9 years ago
- REDstack - Hadoop as a service on OpenStack ☆15 · Oct 8, 2018 · Updated 7 years ago
- Showing the relationship between ImageNet ID and labels and pytorch pre-trained model output ID and labels ☆10 · Oct 11, 2020 · Updated 5 years ago