This is OpenMLDB's Spark distribution, which is optimized for feature extraction. It includes a few novel techniques, such as a native implementation of last join and multi-window parallelization. Its APIs are fully compatible with standard Spark. It is designed to be a component of OpenMLDB (https://github.com/4paradigm/OpenMLDB)…
☆12Jul 30, 2024Updated last year
Alternatives and similar repositories for spark
Users that are interested in spark are comparing it to the libraries listed below.
- ZetaSQL - Analyzer Framework for SQL☆14Oct 23, 2024Updated last year
- A Benchmark for Real-Time Relational Data Feature Extraction (VLDB'23 Best Industry Paper Runnerup)☆55Sep 9, 2023Updated 2 years ago
- Canopy is a machine learning compiler stack with the capability of adopting high-end FPGAs. As a part of OpenAIOS project, Canop…☆12May 7, 2021Updated 4 years ago
- Lightweight and Fast Feature Store Powered by Go (and Rust).☆93Feb 28, 2022Updated 4 years ago
- The ChatGPT plugin to enhance OpenMLDB.☆52Apr 6, 2023Updated 3 years ago
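- An engineering deployment practice, based on OpenMLDB, for the KPI time-series anomaly detection competition of the 2018 International AIOps Challenge☆12Aug 30, 2022Updated 3 years ago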
- Pafka originated from the OpenAIOS project and leverages an optimized tiered storage access strategy to improve overall performance for …☆67Jan 2, 2022Updated 4 years ago
- FeatInsight is a feature platform based on OpenMLDB☆21Mar 7, 2025Updated last year
- An implementation of the persistent skiplist based on Intel Optane Persistent Memory. It is with Intel's pmemkv as a storage engine.☆13Apr 9, 2021Updated 5 years ago
- O'Reilly Course, In-Memory Computing Essentials☆10Oct 16, 2020Updated 5 years ago
- https://openjdk.org/projects/jdk-updates last released 2024-07-17☆10Jul 16, 2024Updated last year
- Custom Service for deploying Apache Alluxio on a running HDP 2.3 / IOP 4.1 Ambari Managed Cluster☆13Jan 13, 2017Updated 9 years ago
- A Redis module to provide support for storing Redis native data structures on PMem.☆21Jul 18, 2022Updated 3 years ago
- Livy Manager - Web UI for Managing Apache Livy Sessions☆16Dec 7, 2017Updated 8 years ago
- Benchmark script for comparing different versions of zlib☆15Jul 18, 2017Updated 8 years ago
- Yet another Polyhedra Compiler for DeepLearning☆19Apr 14, 2023Updated 3 years ago
- Core & Community developed monitoring integrations for Sematext monitoring agent☆13May 30, 2024Updated last year
- Training materials and accompanying documentation for "Mastering Transformers: From Building Blocks to Real World Applications" training.☆13Sep 13, 2023Updated 2 years ago
- ☆11Jan 4, 2022Updated 4 years ago
- Simplified custom plugins for Trino☆16Jul 29, 2024Updated last year
- ☆10Aug 30, 2019Updated 6 years ago
- Processing videos on Apache Spark☆12Feb 14, 2022Updated 4 years ago
- A Memory-efficient Graph Store for Interactive Queries☆13Sep 1, 2021Updated 4 years ago
- A library for accelerating data compression using Intel® QAT.☆21Updated this week
- Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.☆15Dec 1, 2018Updated 7 years ago
- A Full RPC Framework Based on Netty.☆14May 19, 2018Updated 7 years ago
- Ambari service for Apache Drill☆17Apr 15, 2016Updated 9 years ago
- Ambari stack for easily installing and managing Redis on HDP cluster☆14Aug 28, 2015Updated 10 years ago
- Spatial queries with Apache Drill☆20Nov 2, 2017Updated 8 years ago
- A benchmark tool for lakehouses.☆14Mar 12, 2023Updated 3 years ago
- Auto-fixing errors due to version upgrades, good practices, etc.☆11Sep 5, 2020Updated 5 years ago
- ☆16Jul 13, 2018Updated 7 years ago
- Automated TPC-DS and TPC-H benchmark for Apache Hive LLAP☆10Jul 18, 2022Updated 3 years ago
- Implement node2vec algorithm using Spark 2 from: http://snap.stanford.edu/node2vec/☆11Jul 10, 2019Updated 6 years ago
- AutoFDO tutorial☆22Jul 5, 2018Updated 7 years ago
- Sematext Monitoring Agent☆24Apr 3, 2026Updated last week
- Ambari stack service for easily installing and managing Solr on HDP cluster☆18Nov 13, 2018Updated 7 years ago
- Mirror of Apache Ranger☆15Apr 5, 2024Updated 2 years ago
- Ambari in Docker☆16Aug 13, 2023Updated 2 years ago