Mirror of Apache DataFu
☆121May 20, 2025Updated 10 months ago
Alternatives and similar repositories for datafu
Users who are interested in datafu are comparing it to the libraries listed below
- Hadoop library for large-scale data processing, now an Apache Incubator project☆581Jul 8, 2014Updated 11 years ago
- Mirror of Apache Pig☆689Sep 15, 2025Updated 6 months ago
- Code and examples of how to write and deploy Apache Spark Plugins. Spark plugins allow running custom code on the executors as they are in…☆95May 9, 2025Updated 10 months ago
- Home Page and documentation for the JMockit open source project☆12Feb 29, 2020Updated 6 years ago
- A library for calculating clinical quality measures☆28Jul 22, 2022Updated 3 years ago
- Piglet is a DSL for writing Pig scripts in Ruby☆83Jul 21, 2010Updated 15 years ago
- Example files used in the DuckDB - Unity Catalog blog☆10Dec 6, 2024Updated last year
- A Python PySpark Project with Poetry☆27Feb 17, 2026Updated last month
- A functional wrapper around Spark to make it work with ZIO☆53Updated this week
- This is a mirror of https://github.com/LucaCanali/sparkMeasure - sparkMeasure is a tool for performance troubleshooting of Apache Spark w…☆16Oct 3, 2025Updated 5 months ago
- Source code for the website geminibyexample.com which provides simple Python code examples for the Gemini SDK☆22Apr 8, 2025Updated 11 months ago
- Practical FP in Scala book by Gabriel Volpe. Implementation with my view☆17Aug 17, 2024Updated last year
- ACID Data Source for Apache Spark based on Hive ACID☆96Jul 7, 2021Updated 4 years ago
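- Integrate the GA4GH schemas and probably a Scala implementation of the service.☆14May 20, 2016Updated 9 years ago
- Huawei HBase standard client and secure-mode client, including table creation, index creation, asynchronous requests, Put, Get, Scan, and other features☆11Jun 6, 2020Updated 5 years ago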
- Mirror of Apache Hama☆132Feb 11, 2020Updated 6 years ago
- Kite SDK☆393Nov 1, 2022Updated 3 years ago
- ☆10Jun 29, 2021Updated 4 years ago
- GraphFrames is a package for Apache Spark which provides DataFrame-based Graphs☆1,144Updated this week
- Apache Kafka Overview☆12Jun 9, 2023Updated 2 years ago
- Mirror of Apache Crunch (Incubating)☆109Feb 2, 2021Updated 5 years ago
- Write Web API clients using annotations in python☆16Mar 7, 2026Updated 2 weeks ago
- On-the-fly translation of Spark programs to run natively on your Oracle DB. Your Spark programs require no changes.☆35Apr 15, 2025Updated 11 months ago
- The Apache Gora open source framework provides an in-memory data model and persistence for big data.☆122Feb 23, 2024Updated 2 years ago
- Druid indexing plugin for using Spark in batch jobs☆101Oct 21, 2021Updated 4 years ago
- A curated list of awesome Apache Spark packages and resources.☆1,866Feb 27, 2026Updated 3 weeks ago
- Fortanix Baklava Design System☆15Updated this week
- Filling in the Spark function gaps across APIs☆50Apr 14, 2021Updated 4 years ago
- Base classes to use when writing tests with Spark☆1,549Dec 22, 2025Updated 3 months ago
- Discover Flink clusters on Hadoop YARN for Prometheus☆23Aug 5, 2020Updated 5 years ago
- Example using Grafana with Druid☆11Mar 27, 2015Updated 10 years ago
- SMVs: Enforcing Least Privilege Memory Views for Multithreaded Applications☆13Jul 7, 2022Updated 3 years ago
- Pig Visualization framework☆465Mar 24, 2023Updated 2 years ago
- Apache Fluo Muchos☆26Dec 6, 2024Updated last year
- Spring Boot CRUD Rest APIs with Spring Data Cassandra☆15Apr 30, 2021Updated 4 years ago
- A library for creating wrappers around command-line programs.☆32Oct 6, 2022Updated 3 years ago
- Ruby implementation of Boldyreva's order-preserving encryption scheme☆13May 13, 2025Updated 10 months ago
- Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary!☆235Jan 24, 2025Updated last year
- Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive☆187Oct 15, 2025Updated 5 months ago