FINRAOS / herd
Herd is a managed data lake for the cloud. The Herd unified data catalog helps separate storage from compute in the cloud. Manage petabytes of data and make it accessible for data processing and analytical purposes by any cloud compute platform.
☆138 Updated 3 years ago
Alternatives and similar repositories for herd
Users that are interested in herd are comparing it to the libraries listed below
- Autoscaling EMR clusters and Kinesis streams on Amazon Web Services (AWS) ☆47 Updated last year
- kinesis-kafka-connector is a connector based on Kafka Connect to publish messages to Amazon Kinesis streams or Amazon Kinesis Firehose. ☆158 Updated 2 years ago
- Apache Spark AWS Lambda Executor (SAMBA) ☆44 Updated 7 years ago
- Apache Spark on AWS Lambda ☆156 Updated 3 years ago
- Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds. ☆91 Updated last year
- Implementations of open source Apache Hadoop/Hive interfaces which allow for ingesting data from Amazon DynamoDB ☆228 Updated 7 months ago
- Bender - Serverless ETL Framework ☆188 Updated last year
- Reference architecture for real-time stream processing with Apache Flink on Amazon EMR, Amazon Kinesis, and Amazon Elasticsearch Service. ☆70 Updated last year
- Amazon Elastic MapReduce code samples ☆63 Updated 10 years ago
- DataPipeline for humans. ☆250 Updated 3 years ago
- Ferry lets you define, run, and deploy big data applications on AWS, OpenStack, and your local machine using Docker ☆254 Updated 10 years ago
- DynamoDB data source for Apache Spark ☆95 Updated 4 years ago
- A visual ETL development and debugging tool for big data ☆154 Updated 3 years ago
- ☆327 Updated 4 years ago
- Tool to generate a Hive schema from a JSON example doc ☆227 Updated 6 years ago
- Demonstrates NiFi template deployment and configuration via a REST API ☆70 Updated 8 years ago
- Amazon Kinesis Aggregators provides a simple way to create real-time aggregations of data on Amazon Kinesis. ☆151 Updated 4 years ago
- kafka-connect-s3: Ingest data from Kafka to object stores (S3) ☆95 Updated 6 years ago
- Redshift Ops Console ☆92 Updated 10 years ago
- Kinesis spout for Storm ☆107 Updated 7 years ago
- This repository is to help with the Partner Demonstration of the Apache Atlas project. ☆30 Updated 10 years ago
- A Spark WordCountJob example as a standalone SBT project with Specs2 tests, runnable on Amazon EMR ☆119 Updated 9 years ago
- An open-source, vendor-neutral data context service. ☆160 Updated 7 years ago
- AWS bootstrap scripts for Mozilla's flavoured Spark setup. ☆47 Updated 5 years ago
- ☆76 Updated 10 years ago
- Cloudera Director sample code ☆61 Updated 6 years ago
- CloudFormation templates for deploying Airflow in ECS ☆40 Updated 7 years ago
- Simplify getting Zeppelin up and running ☆56 Updated 9 years ago
- The AWS Glue Data Catalog is a fully managed, Apache Hive Metastore compatible, metadata repository. Customers can use the Data Catalog a… ☆226 Updated 8 months ago
- This code demonstrates the architecture featured on the AWS Big Data blog (https://aws.amazon.com/blogs/big-data/) which creates a concu… ☆77 Updated 7 years ago