Simple Spark example of generating table stats for use in data quality checks
☆28 Apr 28, 2017 Updated 8 years ago
Alternatives and similar repositories for Spark.TableStatsExample
Users that are interested in Spark.TableStatsExample are comparing it to the libraries listed below
- NRT Sessionization with Spark Streaming landing on HDFS and putting live stats in HBase ☆50 Oct 31, 2014 Updated 11 years ago
- The code for the in-memory data pipeline that was presented at Berlin Buzzwords 2015. ☆10 Jun 1, 2015 Updated 10 years ago
- Track app memory usage. ☆11 Jan 13, 2015 Updated 11 years ago
- Repository for the Scala Beyond the Basics Training with O'Reilly Publishing ☆13 Oct 22, 2018 Updated 7 years ago
- An Apache Spark app for moving data between Apache Hive and Apache Phoenix/HBase ☆14 Mar 23, 2016 Updated 9 years ago
- A columnar serializer ☆15 Feb 26, 2016 Updated 10 years ago
- Examples for the Apache Oozie book ☆18 May 30, 2016 Updated 9 years ago
- Sample Oozie workflows ☆17 Jun 13, 2017 Updated 8 years ago
- ☆22 Jun 10, 2018 Updated 7 years ago
- Based on the design of SparkOnHBase. This repo will support Spark, Spark Streaming, and Spark SQL integration with Kudu. ☆50 May 19, 2016 Updated 9 years ago
- Coding interview questions with solutions and tests (Scala) ☆26 Sep 23, 2025 Updated 5 months ago
- Muppet ☆128 May 7, 2021 Updated 4 years ago
- A collection of my Quantopian learnings, trying to use everything locally and with updated code that works in 2020. ☆35 May 16, 2021 Updated 4 years ago
- A dynamic data completeness and accuracy library at enterprise scale for Apache Spark ☆29 Nov 4, 2024 Updated last year
- Integrate Grafana with Ambari Metrics System ☆27 Jun 13, 2025 Updated 8 months ago
- ⛅ Run OpenVSCode Server in Google Cloud Shell ☆11 Dec 22, 2023 Updated 2 years ago
- Visual + Stream, a live-stream data visualization library that follows the Grammar of Graphics ☆33 Updated this week
- Trading algorithm for Bitcoins in USD on quantconnect.com ☆13 Jan 12, 2018 Updated 8 years ago
- A collection of Apache Parquet add-on modules ☆30 Feb 12, 2026 Updated 2 weeks ago
- ☆33 Jan 9, 2016 Updated 10 years ago
- Learning Apache Kylin for beginners ☆30 Jun 7, 2018 Updated 7 years ago
- A self-contained morphological analyzer (including dictionary data). ☆33 Jul 30, 2015 Updated 10 years ago
- Accepted at WWW 25 Industrial Track (oral) ☆18 Jun 6, 2025 Updated 8 months ago
- Factorization Machines for Julia ☆11 Aug 26, 2016 Updated 9 years ago
- An implementation of the paper "Deep Inception Networks: A General End-to-End Framework for Multi-asset Quantitati…" ☆12 Mar 15, 2024 Updated last year
- Benchmarks of an artificial neural network library for Spark MLlib ☆11 Dec 3, 2015 Updated 10 years ago
- ☆13 Updated this week
- Data models for Segment built using dbt (getdbt.com). ☆11 Jul 31, 2024 Updated last year
- Examples of Selenium in Python ☆11 Jun 11, 2018 Updated 7 years ago
- [AAAI'23] FinalMLP: An Enhanced Two-Stream MLP Model for CTR Prediction https://arxiv.org/abs/2304.00902 ☆10 Apr 9, 2023 Updated 2 years ago
- Bridge between OTel and KEDA API ☆12 Feb 7, 2026 Updated 3 weeks ago
- ☆10 May 16, 2022 Updated 3 years ago
- Proximal Asynchronous SAGA ☆13 Nov 30, 2017 Updated 8 years ago
- GPO Bypass is a tool / proof-of-concept that highlights how one can bypass Group Policy enforced policies. It uses Firefox as an example. ☆14 Jan 28, 2023 Updated 3 years ago
- Financial Machine Learning with R ☆15 Jan 26, 2020 Updated 6 years ago
- This repository contains Python code to create, backtest and automate intraday-trading algorithms in financial markets using Machine Lear… ☆10 Sep 30, 2021 Updated 4 years ago
- Materials from the JSAI2019 tutorial lecture "Semantic Technology Based on Ontology Engineering" ☆12 Jun 7, 2019 Updated 6 years ago
- ☆45 Oct 5, 2025 Updated 4 months ago
- Step definitions to test HTTP clients/servers with godog ☆12 Dec 2, 2025 Updated 2 months ago