A framework for systematically quality controlling big data.
☆40Mar 13, 2023Updated 3 years ago
Alternatives and similar repositories for TopNotch
Users that are interested in TopNotch are comparing it to the libraries listed below.
- Simple Spark example of generating table stats for use in data quality checks☆28Apr 28, 2017Updated 8 years ago
- A tool for running Spark on Google Compute Engine☆16Jan 20, 2017Updated 9 years ago
- Scala library for converting Spark rows to case classes☆11Mar 14, 2017Updated 9 years ago
- ☆14Apr 8, 2017Updated 9 years ago
- A dynamic data completeness and accuracy library at enterprise scale for Apache Spark☆29Nov 4, 2024Updated last year
- Library to run in process Kafka broker☆16Nov 20, 2018Updated 7 years ago
- The code for the in memory data pipeline that was presented at Berlin Buzzwords 2015.☆10Jun 1, 2015Updated 10 years ago
- A collection of Apache Parquet add-on modules☆30Mar 30, 2026Updated last week
- A columnar serializer☆15Feb 26, 2016Updated 10 years ago
- Watching the FISA Court's public docket.☆43Dec 19, 2014Updated 11 years ago
- ☆12Nov 6, 2014Updated 11 years ago
- i2dash: interactive and iterative dashboards.☆10Sep 5, 2023Updated 2 years ago
- native Rust implementation of Kafka protocol and api☆14Jun 13, 2023Updated 2 years ago
- Demonstration of VPC Peering and VPN connections in AWS☆14Aug 8, 2017Updated 8 years ago
- Verify that a local file is identical to an object on Amazon S3, without having to download the object.☆12Sep 6, 2024Updated last year
- Rust tools for working with CSV files: scrubcsv, catcsv, fixed2csv, geochunk, hashcsv.☆20Jan 17, 2026Updated 2 months ago
- Optimus is a mathematical programming library for Scala.☆150Mar 16, 2026Updated 3 weeks ago
- EJBCA PKI Engine and Backend for HashiCorp Vault. Used to issue, sign, and revoke certificates using the EJBCA CA.☆11Dec 18, 2025Updated 3 months ago
- A GameBoy Emulator written in Rust, written as a learning project for both☆10Jun 6, 2023Updated 2 years ago
- ☆92Nov 15, 2015Updated 10 years ago
- CentCom is a suite of software used for implementing a data warehouse of bans for Space Station 13 from a variety of public sources.☆12Mar 8, 2026Updated last month
- Cis Recommender☆16May 1, 2012Updated 13 years ago
- Scripts used to setup a Spark cluster on EC2☆21Mar 24, 2016Updated 10 years ago
- A versioned database inspired by Git☆16Dec 16, 2017Updated 8 years ago
- ☆30Aug 8, 2015Updated 10 years ago
- A research review of techniques for providing a natural language interface to an RDBMS.☆10Dec 8, 2017Updated 8 years ago
- ☆18May 4, 2023Updated 2 years ago
- Time series analysis with Apache Spark based on Chronix☆38Mar 15, 2017Updated 9 years ago
- Frontend FIDS service for FlightAware sample apps☆13Jan 6, 2023Updated 3 years ago
- Amazon Web Services Bundle Package☆15Jan 12, 2020Updated 6 years ago
- ☆38Jun 1, 2021Updated 4 years ago
- Pyspark Notebook With Docker☆11Aug 18, 2015Updated 10 years ago
- Automation of JupyterHub operations and testing☆14Aug 25, 2022Updated 3 years ago
- Code for the Adzuna Salary Prediction Kaggle competition - http://www.kaggle.com/c/job-salary-prediction Placed 10th out of approximately…☆12Apr 10, 2013Updated 12 years ago
- Data pipeline automation tool☆28Jan 11, 2024Updated 2 years ago
- WSLKit is a generic toolkit for Windows Subsystem for Linux (WSL), with a PowerShell API, and support for VPN-friendly networking kit (VP…☆21Updated this week
- A Spark datasource for the HadoopCryptoLedger library☆13Sep 29, 2025Updated 6 months ago
- ☆10Apr 6, 2023Updated 3 years ago
- Simplifying robust end-to-end machine learning on Apache Spark.☆475Apr 18, 2017Updated 8 years ago