A lightweight, super-fast, large-scale machine learning library on Spark.
☆680Mar 23, 2018Updated 7 years ago
Alternatives and similar repositories for Fregata
Users who are interested in Fregata are comparing it to the libraries listed below.
- ☆155Sep 17, 2018Updated 7 years ago
- Zen aims to provide the largest scale and the most efficient machine learning platform on top of Spark, including but not limited to logi…☆170Nov 17, 2018Updated 7 years ago
- Ytk-learn is a distributed machine learning library which implements most of the popular machine learning algorithms (GBDT, GBRT, Mixture Logi…☆350Jul 6, 2022Updated 3 years ago
- TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters.☆3,858Jul 10, 2023Updated 2 years ago
- DistML provides a supplement to MLlib to support model parallelism on Spark☆169Feb 6, 2017Updated 9 years ago
- A scalable machine learning library on Apache Spark☆796Aug 30, 2021Updated 4 years ago
- An implementation of Factorization Machines (LibFM)☆250Aug 13, 2018Updated 7 years ago
- A Flexible and Powerful Parameter Server for large-scale machine learning☆6,788Oct 13, 2025Updated 4 months ago
- Distributed Neural Networks for Spark☆611Jul 23, 2020Updated 5 years ago
- Stream Data Mining Library for Spark Streaming☆500Apr 16, 2023Updated 2 years ago
- Reactive Factorization Engine☆104Feb 18, 2015Updated 11 years ago
- Glint: High-performance Scala parameter server☆170Jul 20, 2018Updated 7 years ago
- Simplifying robust end-to-end machine learning on Apache Spark.☆475Apr 18, 2017Updated 8 years ago
- Spark Extension: ML transformers, SQL aggregations, etc. that are missing in Apache Spark☆146Jan 26, 2016Updated 10 years ago
- Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, DeepSeek, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V,…☆8,694Jan 28, 2026Updated 3 weeks ago
- Distributed deep learning on Hadoop and Spark clusters.☆1,263Nov 15, 2019Updated 6 years ago
- sparse word2vec☆108Jul 7, 2022Updated 3 years ago
- Easy Machine Learning is a general-purpose dataflow-based system for easing the process of applying machine learning algorithms to real w…☆1,983Dec 18, 2023Updated 2 years ago
- TensorFlow template application for deep learning☆1,879Jul 5, 2023Updated 2 years ago
- AI on Hadoop☆1,732Jul 1, 2025Updated 7 months ago
- Deep Learning Chinese Word Segmentation☆2,076May 18, 2018Updated 7 years ago
- An attempt at training DNN models to predict ad click-through rate, implemented with Theano.☆408Jun 12, 2017Updated 8 years ago
- A library for time series analysis on Apache Spark☆1,196Oct 13, 2020Updated 5 years ago
- TensorFlow on Spark☆296Oct 19, 2017Updated 8 years ago
- A collection of test cases and related material gathered while using Scala and Spark☆1,085Feb 9, 2019Updated 7 years ago
- 酷玩 Spark: Spark source code analysis, Spark libraries, and more☆3,482May 18, 2022Updated 3 years ago
- Byzer (former MLSQL): A low-code open-source programming language for data pipeline, analytics and AI.☆1,845May 29, 2024Updated last year
- Splash Project for parallel stochastic learning☆93Jun 16, 2017Updated 8 years ago
- Oryx 2: Lambda architecture on Apache Spark, Apache Kafka for real-time large scale machine learning☆1,786Aug 16, 2021Updated 4 years ago
- Lightweight and Scalable framework that combines mainstream algorithms of Click-Through-Rate prediction based computational DAG, philosop…☆672Jun 17, 2019Updated 6 years ago
- MLeap: Deploy ML Pipelines to Production☆1,535Jan 12, 2026Updated last month
- Stochastic Gradient Boosted Decision Trees as Standalone, TMVAPlugin and Python-Interface☆248Jul 19, 2020Updated 5 years ago
- Analysis of the principles behind Spark ML algorithms and of their concrete source code implementations☆1,963Mar 25, 2019Updated 6 years ago
- An implementation of the multi-class/multi-label classifier, of which the training is carried out using AdaBoost.MH on Apache Spark.☆108Oct 21, 2014Updated 11 years ago
- High performance data store solution☆1,446Feb 21, 2026Updated last week
- An open-source columnar data format designed for fast, real-time analytics on big data.☆452Nov 16, 2022Updated 3 years ago
- Distributed Factorization Machines☆299Mar 23, 2016Updated 9 years ago
- Distributed LR and FM models on a Parameter Server, with FTRL and SGD optimization algorithms.☆224Mar 14, 2018Updated 7 years ago
- Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark☆1,371Aug 22, 2023Updated 2 years ago