Scalable machine learning library for Apache Hive/Spark/Pig
☆502Dec 2, 2016Updated 9 years ago
Alternatives and similar repositories for hivemall
Users that are interested in hivemall are comparing it to the libraries listed below
- Mirror of Apache Hivemall (incubating)☆313Sep 6, 2022Updated 3 years ago
- A Hivemall wrapper for Spark☆31Apr 21, 2016Updated 9 years ago
- Spark Extension : ML transformers, SQL aggregations, etc that are missing in Apache Spark☆146Jan 26, 2016Updated 10 years ago
- A super simple utility for testing Apache Hive scripts locally for non-Java developers.☆73Feb 11, 2017Updated 9 years ago
- ☆13Apr 23, 2016Updated 9 years ago
- Oryx 2: Lambda architecture on Apache Spark, Apache Kafka for real-time large scale machine learning☆1,783Aug 16, 2021Updated 4 years ago
- Oozie Samples☆52Jan 11, 2014Updated 12 years ago
- Yet another command-line tool for Datadog☆47Oct 31, 2019Updated 6 years ago
- Documents for Digdag Workflow Engine☆50Aug 27, 2024Updated last year
- a fast strptime engine☆37Sep 26, 2020Updated 5 years ago
- Helpful user defined functions / table generating functions for Hive☆101May 2, 2016Updated 9 years ago
- Treasure Data API library for Python☆48Jan 29, 2026Updated last month
- Simplifying robust end-to-end machine learning on Apache Spark.☆475Apr 18, 2017Updated 8 years ago
- R files containing the code used to predict rugby world cup matches☆10Sep 18, 2015Updated 10 years ago
- ☆14Apr 30, 2024Updated last year
- Framework and Library for Distributed Online Machine Learning☆708May 16, 2019Updated 6 years ago
- Zen aims to provide the largest scale and the most efficient machine learning platform on top of Spark, including but not limited to logi…☆170Nov 17, 2018Updated 7 years ago
- Mirror of Apache Lens☆62Nov 5, 2019Updated 6 years ago
- ☆22Jun 10, 2018Updated 7 years ago
- Example of locust cluster for Google Container Engine☆10Aug 18, 2015Updated 10 years ago
- R dplyr connector for ImpalaDB☆15Mar 1, 2017Updated 9 years ago
- Soft Confidence-Weighted Learning in Python☆15Jun 9, 2017Updated 8 years ago
- Livy is an open source REST interface for interacting with Apache Spark from anywhere☆1,007Oct 5, 2022Updated 3 years ago
- PySpark + Scikit-learn = Sparkit-learn☆1,151Dec 31, 2020Updated 5 years ago
- Alenka JDBC is a library for accessing and manipulating data with the open-source GPU database Alenka.☆20Jul 3, 2014Updated 11 years ago
- XBird: Light-weight XQuery processor and XML database system written in Java☆14Oct 8, 2019Updated 6 years ago
- Treasure Boxes - pre-built pieces of code for developing, optimizing, and analyzing your data.☆117Jan 20, 2026Updated last month
- Facebook's Hive UDFs☆277Feb 3, 2026Updated last month
- Deprecated☆337Mar 14, 2017Updated 8 years ago
- Web UI for PrestoDB.☆2,751May 20, 2021Updated 4 years ago
- Large-scale ML & graph analytics on Giraph☆80Jan 29, 2016Updated 10 years ago
- Spark + Jupyter + Hive☆12Sep 24, 2015Updated 10 years ago
- A real-time recommendation system supporting online updates☆17Jul 29, 2025Updated 7 months ago
- Redis search and indexing in Java☆16Sep 26, 2016Updated 9 years ago
- Tranquility helps you send real-time event streams to Druid and handles partitioning, replication, service discovery, and schema rollover…☆13May 3, 2019Updated 6 years ago
- Distributed deep learning on Hadoop and Spark clusters.☆1,263Nov 15, 2019Updated 6 years ago
- Distributed Neural Networks for Spark☆611Jul 23, 2020Updated 5 years ago
- Docker image for Apache Hive running on Tez☆25Apr 24, 2015Updated 10 years ago
- Sparkling Water provides H2O functionality inside Spark cluster☆977Nov 5, 2025Updated 4 months ago