Source code of Blog at
☆51 · Sep 17, 2025 · Updated 6 months ago
Alternatives and similar repositories for blog
Users that are interested in blog are comparing it to the libraries listed below
- ☆21 · Mar 27, 2015 · Updated 10 years ago
- Query Plan Evaluation · ☆16 · Jul 18, 2023 · Updated 2 years ago
- Data Exploration Using Spark 2.0 · ☆14 · Apr 17, 2018 · Updated 7 years ago
- An introduction to Apache Spark DataFrames · ☆41 · Mar 12, 2015 · Updated 11 years ago
- Examples of Spark 2.0 · ☆212 · Aug 11, 2021 · Updated 4 years ago
- Examples and explanations of how the Akka toolkit works · ☆21 · Oct 4, 2022 · Updated 3 years ago
- A sample Maven project setup for a mixed Scala and Java project · ☆13 · Jan 3, 2023 · Updated 3 years ago
- GPT2 Inference Implementation in Pure C · ☆31 · Jun 23, 2025 · Updated 8 months ago
- Based on wurstmeister's kafka-docker, with Prometheus JMX Exporter included · ☆12 · Nov 24, 2016 · Updated 9 years ago
- Spark Streaming HBase Example · ☆94 · Apr 4, 2016 · Updated 9 years ago
- A simple Akka remote example in Scala · ☆24 · Jan 5, 2015 · Updated 11 years ago
- Examples of diagrams using Mermaid: https://mermaid.js.org/intro/ · ☆12 · Mar 25, 2023 · Updated 2 years ago
- Graph algorithms implemented in GraphX and Spark styles · ☆15 · Apr 26, 2015 · Updated 10 years ago
- Test for SparkSQL ScalaPB · ☆14 · Jun 28, 2022 · Updated 3 years ago
- ML Workbench · ☆17 · Nov 22, 2022 · Updated 3 years ago
- Factorization Machines for Julia · ☆11 · Aug 26, 2016 · Updated 9 years ago
- Groovy client library for Apache Ambari's REST API · ☆20 · Jun 25, 2021 · Updated 4 years ago
- kvector is a small utility for converting motifs to kmer vectors to compare motifs of different lengths · ☆10 · May 29, 2017 · Updated 8 years ago
- ☆10 · Oct 25, 2015 · Updated 10 years ago
- A custom AWS credential provider that allows your Hadoop or Spark application to access the S3 file system by assuming a role · ☆10 · Jan 9, 2026 · Updated 2 months ago
- ADMM on Apache Spark · ☆31 · Jul 21, 2015 · Updated 10 years ago
- ☆11 · Apr 15, 2019 · Updated 6 years ago
- ☆33 · Apr 23, 2015 · Updated 10 years ago
- Group a list of objects by a given field name (implemented with ES6 features) · ☆14 · Feb 17, 2017 · Updated 9 years ago
- ☆13 · Sep 25, 2024 · Updated last year
- Self-written notes that may be useful · ☆107 · Dec 26, 2015 · Updated 10 years ago
- Scalable CDC Pattern Implemented using PySpark · ☆18 · Oct 8, 2025 · Updated 5 months ago
- ☆61 · May 12, 2024 · Updated last year
- abawaca is a binning program for metagenomics · ☆13 · May 20, 2017 · Updated 8 years ago
- Benchmarking suite for Apache Spark · ☆15 · Nov 24, 2017 · Updated 8 years ago
- Additional useful algorithms that can be used with Spark · ☆24 · Dec 24, 2014 · Updated 11 years ago
- ☆10 · Nov 12, 2023 · Updated 2 years ago
- An open source stream generator which generates reproducible and deterministic out-of-order streams, simulating arbitrary fractions of ou… · ☆12 · May 14, 2019 · Updated 6 years ago
- A fully incremental model, that transforms raw mobile event data generated by the Snowplow mobile trackers into a series of derived table… · ☆15 · May 14, 2024 · Updated last year
- ☆11 · Aug 22, 2023 · Updated 2 years ago
- Utility for benchmarking changes in Spark using TPC-DS workloads · ☆16 · Jun 3, 2021 · Updated 4 years ago
- Apache Spark is a fast, in-memory data processing engine with elegant and expressive development APIs to allow data workers to efficient… · ☆54 · Nov 16, 2022 · Updated 3 years ago
- Apache Streams · ☆78 · Apr 24, 2025 · Updated 10 months ago
- Examples for extending Hive · ☆90 · Jan 25, 2018 · Updated 8 years ago