Breaka84 / Spooq
☆9Updated 9 months ago
Alternatives and similar repositories for Spooq
Users who are interested in Spooq are comparing it to the libraries listed below
- Presentation and notebook sources for Scala IO and Scale by the Bay 2018 Spark and Frameless talk☆11Updated 6 years ago
- Shunting Yard is a real-time data replication tool that copies data between Hive Metastores.☆20Updated 3 years ago
- Connect DBVisualizer to Hortonworks HiveServer2☆9Updated 10 years ago
- Software tool to manage your notes, scripts, code examples, configs,... to publish them as gists or snippets☆39Updated last month
- Data quality control tool built on spark and deequ☆24Updated 2 months ago
- Nested Data (JSON/AVRO/XML) Parsing and Flattening in Spark☆16Updated last year
- Some Avro operations in Scala☆10Updated 6 months ago
- Extensible streaming ingestion pipeline on top of Apache Spark☆44Updated last week
- A curated list of awesome Databricks resources, including Spark☆19Updated 11 months ago
- Scala API for Apache Spark SQL high-order functions☆14Updated last year
- Sample processing code using Spark 2.1+ and Scala☆52Updated 4 years ago
- Dataset for training ML ranking models☆20Updated 2 years ago
- Code snippets used in demos recorded for the blog.☆37Updated last month
- Dione - a Spark and HDFS indexing library☆52Updated last year
- A sbt plugin for creating NiFi Archive bundles to support the classloader isolation model of NiFi.☆10Updated 2 years ago
- Standalone alternatives to Kafka Connect Connectors☆45Updated last week
- Scalable CDC Pattern Implemented using PySpark☆18Updated 5 years ago
- Recipes of publicly-available Jupyter images☆8Updated 2 months ago
- Lambdas covering supporter operations, mostly in life operations☆11Updated this week
- Skeleton project for Apache Airflow training participants to work on.☆16Updated 4 years ago
- Basic framework utilities to quickly start writing production ready Apache Spark applications☆36Updated 5 months ago
- ☆13Updated last week
- NiFi Processor for Apache Pulsar☆10Updated 7 months ago
- Lighthouse is a library for data lakes built on top of Apache Spark. It provides high-level APIs in Scala to streamline data pipelines an…☆61Updated 9 months ago
- ☆10Updated 3 years ago
- A Kafka mirroring service based on Akka Streams Kafka☆10Updated 3 years ago
- Spark SQL Macros provides a mechanism similar to Spark User-Defined function registration; with the key enhancement being that custom cod…☆16Updated 4 years ago
- An Ansible collection for lifecycle and management of Cloudera CDP Private Cloud resources on bare metal, IaaS, and PaaS.☆34Updated 2 weeks ago
- A library that brings useful functions from various modern database management systems to Apache Spark☆59Updated last year
- Example project using DBT, Databricks and AdventureWorks sample database☆12Updated 2 years ago