apache/incubator-hivemall

Mirror of Apache Hivemall (incubating)

☆313

Alternatives and similar repositories for incubator-hivemall

Users who are interested in incubator-hivemall are comparing it to the libraries listed below.

myui / hivemall
Scalable machine learning library for Apache Hive/Spark/Pig
☆501 · Dec 2, 2016 · Updated 9 years ago
maropu / hivemall-spark
A Hivemall wrapper for Spark
☆31 · Apr 21, 2016 · Updated 10 years ago
brndnmtthws / facebook-hive-udfs
Facebook's Hive UDFs
☆275 · Feb 3, 2026 · Updated 5 months ago
snaga / Hecatoncheir
Hecatoncheir: The Data Stewardship Studio
☆12 · Apr 16, 2018 · Updated 8 years ago
sbt / sbt-nocomma
sbt-nocomma reduces commas from your build.sbt.
☆12 · Jun 17, 2026 · Updated last month
treasure-data / treasure-boxes
Treasure Boxes - pre-built pieces of code for developing, optimizing, and analyzing your data.
☆119 · Jun 29, 2026 · Updated 3 weeks ago
apache / gobblin
A distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, orga…
☆2,270 · Jun 24, 2026 · Updated last month
apache / carbondata
High performance data store solution
☆1,448 · Jul 4, 2026 · Updated 3 weeks ago
linkedin / dr-elephant
Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark
☆1,370 · Aug 22, 2023 · Updated 2 years ago
aaronshan / hive-third-functions
Some useful custom hive udf functions, especial array, json, math, string functions.
☆228 · Jul 30, 2024 · Updated last year
palantir / spark-tpcds-benchmark
Utility for benchmarking changes in Spark using TPC-DS workloads
☆16 · Jun 3, 2021 · Updated 5 years ago
airbnb / reair
ReAir is a collection of easy-to-use tools for replicating tables and partitions between Hive data warehouses.
☆282 · Feb 27, 2019 · Updated 7 years ago
myui / digdag-plugin-example
An example of Digdag plugin
☆15 · Mar 29, 2019 · Updated 7 years ago
HiveRunner / HiveRunner
An Open Source unit test framework for Hive queries based on JUnit 4 and 5
☆262 · Jan 6, 2025 · Updated last year
linkedin / transport
A framework for writing performant user-defined functions (UDFs) that are portable across a variety of engines including Apache Spark, Ap…
☆306 · Jun 29, 2026 · Updated 3 weeks ago
treasure-data / pandas-td
Interactive data analysis with Pandas and Treasure Data.
☆38 · Mar 25, 2020 · Updated 6 years ago
xerial / presto-metrics
Presto metric collection library for Ruby
☆26 · Jul 1, 2026 · Updated 3 weeks ago
TIBCOSoftware / snappydata
Project SnappyData - memory optimized analytics database, based on Apache Spark™ and Apache Geode™. Stream, Transact, Analyze, Predict in…
☆1,032 · Nov 21, 2022 · Updated 3 years ago
komamitsu / td-fdw
Multicorn based PostgreSQL Foreign Data Wrapper for Treasure Data
☆12 · Jan 1, 2017 · Updated 9 years ago
t3rmin4t0r / lipwig
A slightly moist lipstick-on-pig clone for Apache Hive
☆23 · Sep 22, 2023 · Updated 2 years ago
amplab / drizzle-spark
Drizzle integration with Apache Spark
☆120 · Sep 11, 2018 · Updated 7 years ago
apache / bahir
Mirror of Apache Bahir
☆336 · Jul 7, 2023 · Updated 3 years ago
apache / accumulo-fluo
Apache Fluo
☆201 · Jul 10, 2026 · Updated 2 weeks ago
TalkingData / Fregata
A light weight, super fast, large scale machine learning library on spark.
☆676 · Mar 23, 2018 · Updated 8 years ago
xerial / jnuma
A Java library for accessing NUMA (Non Uniform Memory Access) API
☆16 · Mar 13, 2013 · Updated 13 years ago
apache / hudi
Upserts, Deletes And Incremental Processing on Big Data.
☆6,194 · Updated this week
hortonworks-spark / spark-llap
☆102 · Mar 23, 2020 · Updated 6 years ago
spark-jobserver / spark-jobserver
REST job server for Apache Spark
☆2,837 · Mar 3, 2026 · Updated 4 months ago
embulk / embulk
Embulk: Pluggable Bulk Data Loader.
☆1,783 · Jun 19, 2026 · Updated last month
qubole / sparklens
Qubole Sparklens tool for performance tuning Apache Spark
☆592 · Jun 26, 2024 · Updated 2 years ago
jeromebanks / brickhouse
Hive UDF's for the data warehouse
☆19 · May 7, 2018 · Updated 8 years ago
bmc / spark-hive-udf
Example project showing how to use Hive UDFs in Apache Spark
☆55 · Apr 23, 2019 · Updated 7 years ago
apache / eagle
Mirror of Apache Eagle
☆411 · Aug 22, 2020 · Updated 5 years ago
apache / kylin
Apache Kylin
☆3,770 · Jul 16, 2026 · Updated last week
edwardcapriolo / hive-geoip
GeoIP Functions for hive
☆49 · Oct 13, 2020 · Updated 5 years ago
yahoo / TensorFlowOnSpark
TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters.
☆3,846 · Jul 10, 2023 · Updated 3 years ago
dbis-ilm / pipefabric
Stream processing engine
☆13 · Apr 7, 2021 · Updated 5 years ago
crackcell / mlfeature
Feature engineering toolkit for Spark MLlib.
☆12 · Apr 1, 2017 · Updated 9 years ago
apache / griffin
Mirror of Apache griffin
☆1,173 · Aug 3, 2025 · Updated 11 months ago