ExpediaGroup / insights-explorer
Insights Explorer is a tool to catalogue and present analytical & research work.
☆13 Updated 4 months ago
Alternatives and similar repositories for insights-explorer:
Users who are interested in insights-explorer are comparing it to the repositories listed below.
- Apiary provides modules which can be combined to create a federated cloud data lake ☆36 Updated last year
- A service which allows Hive Metastore Listeners to be deployed outside of the Hive Metastore Service ☆12 Updated 5 months ago
- Service for automatically managing and cleaning up unreferenced data ☆46 Updated 2 weeks ago
- Scala SDK for working with Snowplow enriched events in Spark, AWS Lambda, Flink et al. ☆21 Updated 5 months ago
- Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds. ☆88 Updated last year
- Amundsen Gremlin ☆21 Updated 2 years ago
- A component which takes nifi flow xml file as input and converts it into terraform script for creating/updating a flow on nifi ☆28 Updated 3 years ago
- A library for strong, schema based conversion between 'natural' JSON documents and Avro ☆18 Updated last year
- Shunting Yard is a real-time data replication tool that copies data between Hive Metastores. ☆20 Updated 3 years ago
- Jumbune, an open source BigData APM & Data Quality Management Platform for Data Clouds. Enterprise feature offering is available at http:… ☆71 Updated 2 years ago
- Terraform scripts for deploying Apiary Data Lake ☆19 Updated last week
- A Spark-based data comparison tool at scale which facilitates software development engineers to compare a plethora of pair combinations o… ☆50 Updated last year
- Bullet is a streaming query engine that can be plugged into any singular data stream using a Stream Processing framework like Apache Stor… ☆41 Updated 2 years ago
- Extensions available for use in Apiary ☆11 Updated this week
- ETLy is an add-on dashboard service on top of Apache Airflow. ☆69 Updated last year
- Automatically loads new partitions in AWS Athena ☆18 Updated 4 years ago
- Using the Parquet file format (with Avro) to process data with Apache Flink ☆14 Updated 9 years ago
- Demonstration of a Hive Input Format for Iceberg ☆26 Updated 4 years ago
- Mutation testing framework and code coverage for Hive SQL ☆24 Updated 3 years ago
- An implementation of the DatasourceV2 interface of Apache Spark™ for writing Spark Datasets to Apache Druid™. ☆41 Updated 6 months ago
- Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pi… ☆94 Updated this week
- Herd-UI is a search and discovery tool for business and technical users. Everyone in your organization can use Herd-UI to browse and unde… ☆16 Updated 2 years ago
- Rokku project. This project acts as a proxy on top of any S3 storage solution providing services like authentication, authorization, shor… ☆66 Updated 2 months ago
- Apache-Spark based Data Flow(ETL) Framework which supports multiple read, write destinations of different types and also support multiple… ☆26 Updated 3 years ago
- Dione - a Spark and HDFS indexing library ☆52 Updated last year
- Kafka Connect Vespa sink connector ☆14 Updated last week
- Dashboard for operating Flink jobs and deployments. ☆33 Updated 5 months ago
- ⚠️ MAINTENANCE-ONLY MODE: Snowplow maintained SQL data models for working with Snowplow web and mobile behavioral data. ☆41 Updated 3 months ago
- ☆21 Updated 2 years ago
- This code is used to build & run a Docker container for performing predictions against a Spark ML Pipeline. ☆53 Updated last year