Miscellaneous functionality for manipulating Apache Spark RDDs.
☆22 · Dec 29, 2018 · Updated 7 years ago
Alternatives and similar repositories for magic-rdds
Users who are interested in magic-rdds compare it to the libraries listed below.
- Low-level helpers for Apache Spark libraries and tests ☆16 · Dec 29, 2018 · Updated 7 years ago
- SparkListener that converts SparkListenerEvents to JSON and forwards them to an external service via RPC. ☆17 · Apr 6, 2021 · Updated 4 years ago
- TileDB integrations for machine learning data and model i/o (PyTorch, TensorFlow, Scikit-Learn) ☆25 · Dec 4, 2025 · Updated 2 months ago
- Load genomic BAM files using Apache Spark ☆21 · Jun 17, 2018 · Updated 7 years ago
- A sink to save Spark Structured Streaming DataFrame into Hive table ☆23 · May 7, 2018 · Updated 7 years ago
- An example of building a Kubernetes operator (Flink) using Abstract operator's framework ☆26 · Jul 12, 2019 · Updated 6 years ago
- ☆23 · Oct 8, 2018 · Updated 7 years ago
- Bioinformatics Ketrew Pipelines ☆28 · Nov 14, 2021 · Updated 4 years ago
- Personalized cancer epitope discovery and peptide vaccine prediction pipeline ☆30 · Nov 14, 2017 · Updated 8 years ago
- This is an HTTP metrics reporter for Kafka using Jetty with the Codahale metrics servlets (http://metrics.codahale.com/manual/servlets/kaf… ☆37 · Jul 25, 2017 · Updated 8 years ago
- Simple chatbot created using Rasa ☆10 · Feb 20, 2021 · Updated 5 years ago
- BigBWA is a new tool that uses the Big Data technology Hadoop to boost the performance of the Burrows–Wheeler aligner (BWA). ☆31 · Jul 12, 2022 · Updated 3 years ago
- phData Pulse application log aggregation and monitoring ☆13 · Apr 13, 2020 · Updated 5 years ago
- Incremental 3D Delaunay Tetrahedralization ☆14 · Jul 21, 2024 · Updated last year
- HBase tailored but otherwise generic JMXToolkit. ☆28 · Jul 6, 2016 · Updated 9 years ago
- Reading notes on 《智能投顾》 (Robo-Advisor) ☆12 · May 23, 2019 · Updated 6 years ago
- ElasticSearch settings scheduler ☆35 · Aug 6, 2016 · Updated 9 years ago
- Package providing a Java implementation of latent Dirichlet allocation (LDA) for topic modelling ☆10 · May 18, 2017 · Updated 8 years ago
- ☆10 · Mar 29, 2022 · Updated 3 years ago
- Secondary sort and streaming reduce for Apache Spark ☆78 · Jul 3, 2023 · Updated 2 years ago
- Chrome extension to show CSV diffs on GitHub ☆10 · May 16, 2020 · Updated 5 years ago
- Port of tasbot to Linux ☆10 · Apr 25, 2013 · Updated 12 years ago
- Classic Books (Including Statistics, Data Science etc.) ☆13 · Dec 26, 2019 · Updated 6 years ago
- Scripts to build a Docker image with Apache Impala. ☆10 · Aug 9, 2019 · Updated 6 years ago
- Spark Custom Stream Source and Sink ☆12 · Jan 19, 2019 · Updated 7 years ago
- ☆18 · Sep 7, 2014 · Updated 11 years ago
- Apache NiFi Custom Processor for working with Stanford CoreNLP for Sentiment Analysis in Java 8 ☆11 · May 23, 2018 · Updated 7 years ago
- Java DSL for SQL ☆10 · Aug 5, 2015 · Updated 10 years ago
- Ticketed lock synchronization primitive ☆11 · Jan 25, 2021 · Updated 5 years ago
- Load testing for event analytics platforms (Snowplow, more coming soon) ☆13 · May 17, 2016 · Updated 9 years ago
- Code and architecture diagrams for performance testing a few API approaches on AWS ☆10 · Apr 20, 2019 · Updated 6 years ago
- This repo is a curated list of places I consider for weekends in Athens with my kid. ☆11 · Dec 19, 2021 · Updated 4 years ago
- Mantella spell mod for Skyrim VR / AE / SE ☆16 · Dec 15, 2025 · Updated 2 months ago
- A minimal Apache Hive server in a Docker image ☆13 · Dec 24, 2020 · Updated 5 years ago
- Meet Rustacean GPT, an experimental project transforming OpenAI's GPT into a helpful, autonomous software engineer to support senior deve… ☆14 · May 10, 2023 · Updated 2 years ago
- Hadoop InputFormat for http://druid.io/ ☆10 · Oct 26, 2016 · Updated 9 years ago
- Bringing up Docker Compose environments for system, integration and performance testing, with support for ScalaTest and Gatling ☆11 · Jul 29, 2021 · Updated 4 years ago
- Plutus for the masses ☆11 · Jan 20, 2023 · Updated 3 years ago
- Using the Parquet file format (with Avro) to process data with Apache Flink ☆14 · Aug 17, 2015 · Updated 10 years ago