low-level helpers for Apache Spark libraries and tests
☆16 · Dec 29, 2018 · Updated 7 years ago
Alternatives and similar repositories for spark-util
Users who are interested in spark-util are comparing it to the libraries listed below.
- Utilities for writing tests that use Apache Spark. ☆24 · Dec 29, 2018 · Updated 7 years ago
- Miscellaneous functionality for manipulating Apache Spark RDDs. ☆22 · Dec 29, 2018 · Updated 7 years ago
- Spark functions to run popular phonetic and string matching algorithms ☆59 · Feb 22, 2022 · Updated 4 years ago
- ☆20 · Apr 27, 2012 · Updated 13 years ago
- Spark stream from kafka(json) to s3(parquet) ☆15 · Nov 8, 2018 · Updated 7 years ago
- TileDB integrations for machine learning data and model i/o (PyTorch, TensorFlow, Scikit-Learn) ☆25 · Dec 4, 2025 · Updated 2 months ago
- Multi-project build tool, based on sbt. ☆84 · Jan 21, 2023 · Updated 3 years ago
- Spark data profiling utilities ☆23 · Nov 24, 2018 · Updated 7 years ago
- A Spark metrics sink that pushes to InfluxDb ☆51 · Jan 14, 2021 · Updated 5 years ago
- materials and resources for workshop curriculum ☆20 · Oct 20, 2019 · Updated 6 years ago
- Scripts for parsing / making sense of yarn logs ☆52 · Aug 22, 2016 · Updated 9 years ago
- An extension to the amazing Spark framework for better functional programming. ☆28 · May 19, 2016 · Updated 9 years ago
- A dynamic data completeness and accuracy library at enterprise scale for Apache Spark ☆29 · Nov 4, 2024 · Updated last year
- This is a http metrics reporter for kafka using Jetty with the Codahale metrics servlets (http://metrics.codahale.com/manual/servlets/kaf… ☆37 · Jul 25, 2017 · Updated 8 years ago
- Presentations and other resources. ☆37 · Jul 13, 2020 · Updated 5 years ago
- Ready-to-go Parquet-formatted public 'omics datasets ☆30 · Nov 2, 2015 · Updated 10 years ago
- ☆13 · Nov 10, 2025 · Updated 3 months ago
- Basic framework utilities to quickly start writing production ready Apache Spark applications ☆36 · Dec 15, 2024 · Updated last year
- ISO 3166-1, ISO 3166-2, ISO 4217, E.164, ISO related types in Scala. Country codes, Country Subdivision, Country Currency, Calling Code, … ☆35 · Mar 30, 2019 · Updated 6 years ago
- Schema Registry integration for Apache Spark ☆40 · Nov 16, 2022 · Updated 3 years ago
- HBase tailored but otherwise generic JMXToolkit. ☆28 · Jul 6, 2016 · Updated 9 years ago
- Repo to hold code Artifacts for WAF ☆10 · Sep 14, 2022 · Updated 3 years ago
- ElasticSearch settings scheduler ☆35 · Aug 6, 2016 · Updated 9 years ago
- MongoDB 3.6 Developer Workshop ☆10 · Apr 27, 2018 · Updated 7 years ago
- Everything which has to do with Data Integration. Templates for Azure Data Factory and Azure Synapse Analytics ☆10 · Jan 29, 2022 · Updated 4 years ago
- Apache Spark ETL Utilities ☆39 · Oct 23, 2024 · Updated last year
- phData Pulse application log aggregation and monitoring ☆13 · Apr 13, 2020 · Updated 5 years ago
- ☆45 · Feb 9, 2022 · Updated 4 years ago
- Use Cascading Taps and Scalding DSL with Spark ☆49 · Dec 28, 2016 · Updated 9 years ago
- Hive Storage Handler for SOLR ☆16 · Mar 17, 2014 · Updated 11 years ago
- Load testing for event analytics platforms (Snowplow, more coming soon) ☆13 · May 17, 2016 · Updated 9 years ago
- ☆18 · Sep 7, 2014 · Updated 11 years ago
- An example setup for integrating the oso policy engine logic within a FastAPI application. ☆10 · Dec 5, 2020 · Updated 5 years ago
- Spark Custom Stream Source and Sink ☆12 · Jan 19, 2019 · Updated 7 years ago
- Simple, beautiful data driven tooltip ☆13 · Mar 13, 2022 · Updated 3 years ago
- JSON processing command line tool based on JSONSelect (CSS-like selectors for JSON) ☆43 · Sep 28, 2015 · Updated 10 years ago
- Simulation of job offers and CVs with real-time processing, classification, and analytics using Kafka, Ray, Spark, and Databricks. Includ… ☆14 · Dec 25, 2024 · Updated last year
- Ready to use UI patterns for websites ☆16 · Sep 17, 2015 · Updated 10 years ago
- ☆11 · Apr 15, 2019 · Updated 6 years ago